Tuesday, September 16, 2025

Amazon CloudWatch Metrics | Overview.


Scope:

  •       Overview of Amazon CloudWatch Metrics,
  •       Key Components,
  •       Types of Metrics,
  •       Metric Resolution,
  •       Storage and Retention,
  •       Alarms and Thresholds,
  •       Metric Math,
  •       Integration with Logs and Events,
  •       Best Practices,
  •       Architecture.

1. Overview of Amazon CloudWatch Metrics

    • Amazon CloudWatch Metrics is a core feature of Amazon CloudWatch, AWS’s monitoring and observability service.
    • Metrics are time-ordered data points representing the performance or health of AWS resources and applications.
    • Each metric represents a specific aspect of a resource, e.g.:
      • CPU utilization,
      • Network throughput,
      • Request latency.
    • Metrics are:
      • Organized by namespace,
      • Identified by metric name and dimensions.

2. Key Components

| Component   | Description                                                                                   |
|-------------|-----------------------------------------------------------------------------------------------|
| Namespace   | A container for metrics. Examples: AWS/EC2, AWS/S3, or custom namespaces like MyApp/Metrics.  |
| Metric Name | Identifies the metric, e.g., CPUUtilization, Latency.                                         |
| Dimensions  | Key-value pairs that refine the metric. E.g., for EC2: InstanceId=i-1234567890abcdef0.        |
| Data Points | Timestamped values for the metric.                                                            |
| Unit        | Optional; indicates the unit of measurement: Seconds, Bytes, Percent, etc.                    |
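
To make the relationship between these components concrete, here is a minimal Python sketch. It is an illustration only, not how CloudWatch stores data internally: the class names `MetricIdentity` and `DataPoint`, the helper `make_metric`, and the second instance ID are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class MetricIdentity:
    """Namespace + metric name + dimensions together identify one metric."""
    namespace: str
    metric_name: str
    dimensions: frozenset  # frozenset of (name, value) pairs


@dataclass(frozen=True)
class DataPoint:
    """A timestamped value with an optional unit."""
    timestamp: datetime
    value: float
    unit: Optional[str] = None


def make_metric(namespace, metric_name, **dimensions):
    """Hypothetical helper: build an identity from keyword dimensions."""
    return MetricIdentity(namespace, metric_name, frozenset(dimensions.items()))


# Same namespace and metric name, different dimension value: two distinct metrics.
cpu_a = make_metric("AWS/EC2", "CPUUtilization", InstanceId="i-1234567890abcdef0")
cpu_b = make_metric("AWS/EC2", "CPUUtilization", InstanceId="i-0fedcba0987654321")  # hypothetical ID

point = DataPoint(datetime(2025, 9, 16, 12, 0, tzinfo=timezone.utc), 42.5, "Percent")
```

In this sketch `cpu_a` and `cpu_b` compare as different metrics even though they share a namespace and a name, which is the role dimensions play in the table above.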

3. Types of Metrics

  1. AWS-Provided Metrics
    • Automatically collected for AWS resources.
    • Example: EC2 instances provide CPUUtilization, DiskReadOps, NetworkIn.
  2. Custom Metrics
    • twtech can publish its own application or business metrics using the CloudWatch API, SDK, or CLI.
    • Example: Order count per minute, payment processing latency.
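
As a rough sketch of what publishing a custom metric can look like, the function below builds keyword arguments shaped like a PutMetricData request. The metric name `OrderCount`, the `Service` dimension, and the helper name `build_put_metric_data_request` are assumptions made for illustration. The sketch only builds the request and does not call AWS.

```python
from datetime import datetime, timezone


def build_put_metric_data_request(namespace, metric_name, value, unit, dimensions):
    """Return keyword arguments shaped like a PutMetricData request."""
    return {
        "Namespace": namespace,
        "MetricData": [
            {
                "MetricName": metric_name,
                "Dimensions": [
                    {"Name": name, "Value": val}
                    for name, val in sorted(dimensions.items())
                ],
                "Timestamp": datetime.now(timezone.utc),
                "Value": float(value),
                "Unit": unit,
            }
        ],
    }


request = build_put_metric_data_request(
    namespace="MyApp/Metrics",
    metric_name="OrderCount",            # hypothetical metric name
    value=17,
    unit="Count",
    dimensions={"Service": "checkout"},  # hypothetical dimension
)

# With boto3 installed and AWS credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("cloudwatch").put_metric_data(**request)
```

Keeping the request construction separate from the network call makes it easy to unit-test what would be published before any metric (and its cost) reaches CloudWatch.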

4. Metric Resolution

    • Standard Resolution: 1-minute granularity (default)
    • High Resolution: 1-second granularity (useful for real-time monitoring and alerting)
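
The practical difference between the two resolutions can be illustrated locally. The sketch below averages hypothetical latency samples into 1-second and 60-second periods. It is a simulation of the idea, not CloudWatch's own aggregation code. When publishing through the API, the resolution is chosen per data point with the `StorageResolution` field of PutMetricData (1 for high resolution, 60 for standard).

```python
from collections import defaultdict


def aggregate(samples, period_seconds):
    """Average (epoch_second, value) samples into fixed-width periods."""
    buckets = defaultdict(list)
    for ts, value in samples:
        bucket_start = ts - (ts % period_seconds)  # start of the period containing ts
        buckets[bucket_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


# Hypothetical latency samples in milliseconds, with a brief spike at t=122.
samples = [(120, 10.0), (121, 10.0), (122, 250.0), (123, 10.0)]

high_res = aggregate(samples, period_seconds=1)   # one value per second
standard = aggregate(samples, period_seconds=60)  # one value per minute
```

At 1-second granularity the spike at t=122 remains visible as its own data point, while at 1-minute granularity it is averaged together with the surrounding samples.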

5. Storage and Retention

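| Data Age    | Granularity | Retention |
|-------------|-------------|-----------|
| 0–15 days   | 1 minute    | 15 days   |
| 15–63 days  | 5 minutes   | 63 days   |
| 63–455 days | 1 hour      | 455 days  |

    • High-resolution data points (period under 60 seconds) are retained for 3 hours; after that, the data is available only at the coarser granularities above.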
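
The three tiers listed in the table above can be expressed as a small lookup. This is a sketch under the assumption that only those tiers matter. The names `RETENTION_TIERS` and `finest_period_seconds` are hypothetical, and sub-minute data is deliberately ignored.

```python
# (maximum age in days, finest period in seconds), taken from the table above.
RETENTION_TIERS = [(15, 60), (63, 300), (455, 3600)]


def finest_period_seconds(age_days):
    """Return the finest period still retained for data of this age, or None if expired."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    for max_age_days, period in RETENTION_TIERS:
        if age_days <= max_age_days:
            return period
    return None  # older than 455 days: no longer retained
```

For example, a query for data that is 30 days old should expect 5-minute periods at best, so a dashboard that needs per-minute detail has to be read within the first 15 days.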

6. Alarms and Thresholds

    • CloudWatch allows twtech to create alarms that trigger when metrics breach thresholds.
    • Alarm actions can include:
      • Sending SNS notifications
      • Auto Scaling actions
      • EC2 instance recovery
    • Supports statistical thresholds (average, sum, min, max, percentile)
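
A simplified model of threshold evaluation is sketched below: an alarm fires when at least M of the last N data points breach the threshold. The comparison names and state names match those used by CloudWatch alarms, but the function `alarm_state` itself is hypothetical, and real alarms also handle missing data and other cases that this sketch leaves out.

```python
import operator

# Comparison operator names as used by CloudWatch alarms.
COMPARISONS = {
    "GreaterThanThreshold": operator.gt,
    "GreaterThanOrEqualToThreshold": operator.ge,
    "LessThanThreshold": operator.lt,
    "LessThanOrEqualToThreshold": operator.le,
}


def alarm_state(datapoints, threshold, comparison, evaluation_periods, datapoints_to_alarm):
    """Return "ALARM", "OK", or "INSUFFICIENT_DATA" for the most recent periods."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    breaches = COMPARISONS[comparison]
    recent = datapoints[-evaluation_periods:]  # only the last N periods count
    breaching = sum(1 for value in recent if breaches(value, threshold))
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


# Hypothetical per-period average CPU utilization values (percent).
cpu_averages = [40.0, 85.0, 90.0, 95.0]
state = alarm_state(cpu_averages, 80.0, "GreaterThanThreshold",
                    evaluation_periods=3, datapoints_to_alarm=2)
```

Requiring several breaching data points rather than one is a common way to avoid notifications for a single short spike.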

7. Metric Math

CloudWatch Metric Math allows twtech to:

    • Combine multiple metrics into a single expression.
    • Perform operations like:
      • Sum, average, min, max
      • Rate of change
      • Percentiles
    • Example: Compute total CPU utilization across multiple EC2 instances.
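
The total-CPU example can be sketched as a list of queries shaped like the `MetricDataQueries` parameter of GetMetricData: one query per instance, plus one expression that sums them. The helper name `build_total_cpu_queries`, the query IDs, and the second instance ID are assumptions for illustration. The sketch builds the queries and does not call AWS.

```python
def build_total_cpu_queries(instance_ids, period_seconds=300):
    """Build per-instance CPU queries plus one metric math expression summing them."""
    queries = []
    for index, instance_id in enumerate(instance_ids, start=1):
        queries.append({
            "Id": f"m{index}",
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/EC2",
                    "MetricName": "CPUUtilization",
                    "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
                },
                "Period": period_seconds,
                "Stat": "Average",
            },
            "ReturnData": False,  # input to the expression only
        })
    queries.append({
        "Id": "total_cpu",
        "Expression": "SUM(METRICS())",  # sums all metrics in this request
        "Label": "Total CPUUtilization",
        "ReturnData": True,
    })
    return queries


queries = build_total_cpu_queries(
    ["i-1234567890abcdef0", "i-0fedcba0987654321"]  # second ID is hypothetical
)

# With boto3 and credentials available, these could be passed as:
#   boto3.client("cloudwatch").get_metric_data(
#       MetricDataQueries=queries, StartTime=..., EndTime=...)
```

Setting `ReturnData` to `False` on the per-instance queries means only the combined series would be returned.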

8. Integration with Logs and Events

    • CloudWatch Logs can generate metrics from log data using metric filters.
    • Amazon EventBridge (formerly CloudWatch Events) can trigger workflows when a CloudWatch alarm changes state, i.e., when a metric crosses its threshold.
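
The idea behind a metric filter (turning matching log lines into a numeric metric) can be simulated locally. The sketch below counts log events containing a term, grouped per minute. It uses plain substring matching rather than the CloudWatch filter pattern syntax, and the log lines and the function name `count_matches_per_minute` are hypothetical.

```python
from collections import Counter


def count_matches_per_minute(log_events, term):
    """Count log events containing `term`, grouped by the minute they occurred in."""
    counts = Counter()
    for epoch_seconds, message in log_events:
        if term in message:
            minute_start = epoch_seconds - (epoch_seconds % 60)
            counts[minute_start] += 1
    return dict(counts)


# Hypothetical (epoch_seconds, message) log events.
log_events = [
    (0, "INFO order accepted"),
    (12, "ERROR payment gateway timeout"),
    (45, "ERROR payment gateway timeout"),
    (70, "INFO order accepted"),
    (95, "ERROR inventory lookup failed"),
]

error_counts = count_matches_per_minute(log_events, "ERROR")
```

Each per-minute count plays the role of one data point of an error-count metric, which could then drive an alarm like any other metric.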

9. Best Practices

    1. Use meaningful namespaces and dimensions for easy filtering.
    2. Publish custom metrics sparingly (excess metrics increase cost).
    3. Leverage high-resolution metrics for latency-sensitive applications.
    4. Combine metric math with alarms to monitor trends instead of just thresholds.
    5. Integrate metrics with dashboards for visual monitoring.

10. Typical Architecture Using CloudWatch Metrics

         Application / AWS Service
                     |
            (Generate metrics)
                     |
                     v
         Amazon CloudWatch Metrics
                     |
        +------------------------+
        | Alarms                 |
        | Dashboards             |
        | Metric Math / Insights |
        +------------------------+
                     |
 Notification / Auto-Scaling / Remediation





