Tuesday, April 15, 2025

HashiCorp Vault Monitoring / Observability | Overview & Hands-On.

Focus:
    • Tailored for:
      • DevOps,
      • DevSecOps,
      • Cloud Engineers.
Scope:
  • Intro,
  • Key Observability Data Sources,
  • Integration with External Tools,
  • Best Practices,
  • Link to official documentation,
  • How to enable the Vault audit log in JSON format,
  • The concept: vault_audit.json,
  • How Vault Audit Works,
  • Sample Log Entry (JSON format),
  • Benefits of Using vault_audit.json,
  • Optional Enhancements,
  • Project: Hands-On.
Intro:
    • HashiCorp Vault monitoring and observability rely primarily on telemetry metrics and audit devices to provide insights into its performance, health, and security. 
    • These outputs can be integrated with third-party monitoring and observability platforms for comprehensive analysis and alerting.
Key Observability Data Sources
Telemetry Metrics
    • Vault collects various runtime metrics about its internal performance, resource usage, and subsystems (e.g., storage backend, core requests). 
    • These metrics can be consumed by monitoring solutions like Prometheus, Datadog, or Grafana. Key metrics to watch include vault.core.handle_request response times and storage backend access times.
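As a rough illustration of consuming these metrics, the sketch below parses Prometheus text exposition output of the kind Vault can return from its metrics endpoint. The sample lines and values are invented for illustration, and the name vault_core_handle_request is assumed to be the Prometheus rendering of vault.core.handle_request. The parser is deliberately simplified: it assumes no trailing timestamps and no spaces inside label values.

```python
# Hypothetical sample of Prometheus text output; the values are made up.
SAMPLE = """\
# HELP vault_core_handle_request vault_core_handle_request
# TYPE vault_core_handle_request summary
vault_core_handle_request{quantile="0.5"} 0.42
vault_core_handle_request{quantile="0.99"} 3.5
vault_core_handle_request_sum 120.0
vault_core_handle_request_count 200
"""


def parse_samples(text):
    """Return {series name including labels: float value} for each sample line."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


def mean_from_summary(samples, metric):
    """Average of a summary metric, computed from its _sum and _count series."""
    count = samples.get(metric + "_count", 0)
    if count == 0:
        return None
    return samples[metric + "_sum"] / count


samples = parse_samples(SAMPLE)
print(mean_from_summary(samples, "vault_core_handle_request"))
```

In practice a collector such as Prometheus does this parsing itself; the sketch only shows what the raw data looks like.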
Audit Devices:
    • Audit devices record a detailed log of every request and response processed by Vault.
    • This data is critical for security, compliance, troubleshooting client issues, and understanding access patterns.
    • Logs can be sent to security information and event management (SIEM) tools like Splunk for analysis.
Server and Client Logs:
    •  Operational messages from the Vault server and CLI/API clients aid in diagnosing server startup issues, configuration problems, and general errors.
API Endpoints
    • Vault exposes specific HTTP API endpoints (e.g., /sys/health, /sys/replication, /sys/metrics) that monitoring tools can query to check the service's health and replication status.
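A health check usually comes down to interpreting the HTTP status code returned by the health endpoint. The sketch below maps Vault's documented default status codes to a readable state. The helper names and the choice of which states count as "serving" are assumptions made for illustration, not part of Vault.

```python
# Default status codes returned by Vault's health endpoint.
HEALTH_STATUS = {
    200: "initialized, unsealed, and active",
    429: "unsealed and standby",
    472: "disaster recovery secondary",
    473: "performance standby",
    501: "not initialized",
    503: "sealed",
}


def describe_health(status_code):
    """Translate a health status code into a human-readable state."""
    return HEALTH_STATUS.get(status_code, "unknown status %d" % status_code)


def is_serving(status_code):
    """Assumption: active and standby nodes are treated as healthy."""
    return status_code in (200, 429, 473)


for code in (200, 429, 503):
    print(code, describe_health(code), is_serving(code))
```

A monitoring probe would obtain the status code from an HTTP GET against the health endpoint and alert when is_serving returns False.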
Integration with External Tools
    • Vault is designed to integrate with industry-standard observability platforms. Operators typically stream Vault's raw data into an aggregation and analysis solution: 
Prometheus & Grafana:
    •  A common open-source stack for collecting and visualizing Vault metrics. Pre-built dashboards are often available for easier setup.
Splunk
    • Useful for aggregating and analyzing audit logs for security events and compliance reporting.
Datadog, LogicMonitor, Google Cloud Observability:
    •  Commercial platforms that offer specific integrations and pre-built dashboards for monitoring Vault performance, usage, and health.
Best Practices
Centralization:
    •  Forward all logs and metrics to a centralized observability solution for unified dashboards, reporting, and alerting.
Establish Baselines:
    • Understand normal operational performance under typical load to set meaningful alerts for anomalies in the future.
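One simple way to turn a baseline into an alert threshold is "mean plus a few standard deviations". The sketch below assumes a list of request latencies in milliseconds collected during normal load. The numbers and the 3-sigma default are illustrative choices, not a recommendation from the Vault documentation.

```python
import statistics


def alert_threshold(baseline_ms, sigmas=3.0):
    """Threshold = baseline mean + sigmas * sample standard deviation."""
    mean = statistics.mean(baseline_ms)
    stdev = statistics.stdev(baseline_ms)
    return mean + sigmas * stdev


def is_anomalous(value_ms, threshold_ms):
    """True when an observation exceeds the baseline-derived threshold."""
    return value_ms > threshold_ms


# Hypothetical latencies observed under typical load.
baseline = [10.0, 12.0, 11.0, 9.0, 13.0]
threshold = alert_threshold(baseline)
print(round(threshold, 2), is_anomalous(20.0, threshold))
```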
Alerting:
    •  Configure alerts on key metrics to proactively address performance bottlenecks (e.g., storage I/O latency) or potential security issues (e.g., high rate of access denials).
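As a sketch of the "high rate of access denials" alert, the code below counts denied requests per client address from parsed audit entries. It assumes that response entries carry an "error" string containing "permission denied" and a "request.remote_address" field. The sample entries and the limit are invented for illustration.

```python
from collections import Counter


def count_denials(entries):
    """Count 'permission denied' responses per remote address."""
    denials = Counter()
    for entry in entries:
        if entry.get("type") != "response":
            continue
        if "permission denied" in entry.get("error", ""):
            addr = entry.get("request", {}).get("remote_address", "unknown")
            denials[addr] += 1
    return denials


def offenders(denials, limit):
    """Addresses whose denial count exceeds the limit."""
    return sorted(addr for addr, n in denials.items() if n > limit)


# Hypothetical, already-parsed audit entries.
entries = [
    {"type": "response", "error": "permission denied",
     "request": {"remote_address": "192.168.1.100"}},
    {"type": "response", "error": "permission denied",
     "request": {"remote_address": "192.168.1.100"}},
    {"type": "response", "error": "permission denied",
     "request": {"remote_address": "192.168.1.100"}},
    {"type": "response", "request": {"remote_address": "10.0.0.5"}},
    {"type": "request", "request": {"remote_address": "10.0.0.5"}},
]
denials = count_denials(entries)
print(offenders(denials, limit=2))
```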
Link to official documentation:

https://developer.hashicorp.com/vault/tutorials/monitoring



How twtech enables the Vault audit log in JSON format.

    • This allows twtech to know exactly who is accessing HashiCorp Vault, their IP address, what they have done, and what they have been denied access to.
    • The vault_audit.json file comes from Vault’s audit device system, which is crucial for monitoring Vault activity.
The concept: vault_audit.json
    • vault_audit.json is typically a file-based audit log in JSON format.
    • It is created when twtech enables a file audit device in Vault.

Sample:

vault audit enable file file_path=/var/log/vault_audit.json

    • This file then logs every request and response Vault handles. Sensitive values such as tokens and secret data are hashed with HMAC-SHA256 rather than written in plain text, unless the device is enabled with log_raw=true.
How Vault Audit Works

Vault audit devices capture:

    • Who made the request (identity, token, etc.)
    • When it was made (timestamp)
    • What was requested (e.g., secret read, policy change)
    • Where the request was sent (endpoint)
    • Response status (success, failure)
    • Latency (time taken)

Sample Log Entry (JSON format)

# json
{
  "time": "2025-04-15T13:00:00.123456Z",
  "type": "request",
  "auth": {
    "client_token": "hmac-sha256:abc123...",
    "accessor": "hmac-sha256:def456...",
    "display_name": "userpass-twtech-user",
    "identity": "abcd-1234"
  },
  "request": {
    "id": "12345678-90ab-cdef-1234-567890abcdef",
    "operation": "read",
    "path": "secret/data/twtech-user/credentials",
    "remote_address": "192.168.1.100"
  }
}
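To show how such an entry can be consumed, the sketch below parses the sample entry above and condenses it into a one-line "who did what, where" summary. The function name and the output layout are illustrative choices. The field names are taken from the sample entry and may differ from what a given Vault version emits.

```python
import json

# The sample entry from this post, serialized as a single line.
SAMPLE_LINE = json.dumps({
    "time": "2025-04-15T13:00:00.123456Z",
    "type": "request",
    "auth": {"display_name": "userpass-twtech-user"},
    "request": {
        "operation": "read",
        "path": "secret/data/twtech-user/credentials",
        "remote_address": "192.168.1.100",
    },
})


def summarize(line):
    """Condense one audit log line into 'time who operation path from address'."""
    entry = json.loads(line)
    auth = entry.get("auth", {})
    request = entry.get("request", {})
    return "{} {} {} {} from {}".format(
        entry.get("time", "?"),
        auth.get("display_name", "?"),
        request.get("operation", "?"),
        request.get("path", "?"),
        request.get("remote_address", "?"),
    )


print(summarize(SAMPLE_LINE))
```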

NB: 

    • Vault uses HMAC-SHA256 hashes to protect sensitive information in logs.
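To illustrate why hashed values are still useful for correlation, the sketch below computes a keyed HMAC-SHA256 digest with Python's standard library. The key and token here are hypothetical. Vault uses its own internal salt per audit device, so this code does not reproduce real log values; it only demonstrates that the same input with the same key always yields the same hash.

```python
import hashlib
import hmac


def audit_style_hash(key, value):
    """Return an 'hmac-sha256:<hex digest>' string for the given value."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "hmac-sha256:" + digest


key = b"example-salt"  # hypothetical key, not Vault's real salt
first = audit_style_hash(key, "example-token")
second = audit_style_hash(key, "example-token")
other = audit_style_hash(key, "another-token")

# Identical inputs match, so repeated use of one token can be traced
# across log entries without the token itself ever being written.
print(first == second, first == other)
```

To compare a known value against real log entries, Vault provides a sys/audit-hash API endpoint that returns the hash a given audit device would log.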

Benefits of Using vault_audit.json

| Benefit | Description |
| --- | --- |
| Traceability | See who did what, when, and where, with full visibility into Vault operations. |
| Security Auditing | Detect suspicious activity like brute force attempts, unauthorized reads, or misconfigurations. |
| Compliance | Helps meet compliance standards (e.g., SOC 2, HIPAA, PCI-DSS) with complete audit trails. |
| Debugging | Troubleshoot issues like failed authentications, permission errors, or incorrect secret paths. |
| Analytics Integration | JSON format makes it easy to integrate with tools like Splunk, ELK (Elasticsearch, Logstash, Kibana), or SIEM systems. |

Optional Enhancements

twtech can also:

    • Rotate or manage log files using tools like logrotate.
    • Forward audit logs to a centralized logging platform (e.g., Fluentd, Logstash).
    • Enable multiple audit devices (e.g., file + syslog + socket) for redundancy.

Project: Hands-On

  • Enable the file audit device, writing to: /var/log/vault_audit.json

vault audit enable file file_path=/var/log/

  • Troubleshooting "permission denied": the user that runs Vault (e.g., vault) needs write permission on the audit log file.
  • Check who is running the Vault service:

ps aux | grep vault

  • If the file doesn’t exist yet, ensure Vault can write to the directory.
  • Check whether the log directory exists (don’t use sudo with vault commands):

ls /var/log/

  • Set ownership of the file or directory to that user (e.g., vault):
  • Create the file /var/log/vault_audit.json with touch.
  • Then grant ownership to the vault user and the vault group.

Switch to the root user (or a user with sudo privileges) and run the commands.

exit

# 1. Create the audit log file

sudo touch /var/log/vault_audit.json

# 2. Change ownership to the 'vault' user

sudo chown vault:vault /var/log/vault_audit.json

# 3. Set appropriate permissions

sudo chmod 640 /var/log/vault_audit.json
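The result of these three steps can be checked programmatically. The sketch below reads a file's permission bits and owner using only the standard library. It assumes a Unix-like system (the pwd module is not available on Windows), and the helper names are illustrative.

```python
import os
import pwd
import stat


def file_mode(path):
    """Permission bits of the file, e.g. 0o640."""
    return stat.S_IMODE(os.stat(path).st_mode)


def file_owner(path):
    """Owner's user name, or the numeric uid if it has no passwd entry."""
    uid = os.stat(path).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def check_audit_file(path, expected_owner="vault", expected_mode=0o640):
    """Return a list of problems; an empty list means the file looks right."""
    problems = []
    if file_owner(path) != expected_owner:
        problems.append("owner is %s, expected %s" % (file_owner(path), expected_owner))
    if file_mode(path) != expected_mode:
        problems.append("mode is %o, expected %o" % (file_mode(path), expected_mode))
    return problems
```

For example, check_audit_file("/var/log/vault_audit.json") should return an empty list once the ownership and permissions above are in place.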

  • Verify that the file was created:

ls -l /var/log/vault_audit.json

  • Verify who owns the file vault_audit.json:

ls -al /var/log

  • Verify that logs are written to: /var/log/vault_audit.json
  • twtech can access the generated log in two ways:

# CLI

cat /var/log/vault_audit.json
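Raw cat output is hard to read once the log grows. As an alternative, the sketch below reads the log line by line (the file audit device writes one JSON object per line) and counts request operations per path. It assumes the "type", "request.operation", and "request.path" fields shown in the sample entry, and it skips blank or malformed lines. The sample lines are invented.

```python
import json
from collections import Counter


def iter_entries(lines):
    """Yield parsed JSON objects, ignoring blank or malformed lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue


def operations_by_path(lines):
    """Count (operation, path) pairs across request entries."""
    counts = Counter()
    for entry in iter_entries(lines):
        if entry.get("type") != "request":
            continue
        request = entry.get("request", {})
        counts[(request.get("operation"), request.get("path"))] += 1
    return counts


# Hypothetical log lines; with a real file, pass the open file object instead.
sample_lines = [
    '{"type": "request", "request": {"operation": "read", "path": "secret/data/a"}}',
    '{"type": "response", "request": {"operation": "read", "path": "secret/data/a"}}',
    '{"type": "request", "request": {"operation": "read", "path": "secret/data/a"}}',
    "not json",
    "",
]
print(operations_by_path(sample_lines))
```

Reading the real file requires read permission on it, for example by running the script as a member of the vault group.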


# UI:

  • The vault_audit.json file can be downloaded to the local machine for inspection with an SSH client such as MobaXterm.

  • twtech selects the location to download the file to:

  • Use any JSON-capable viewer to open the file (Notepad is sufficient).

Elaborate logs about users:






