Amazon CloudWatch Metrics - Overview.
Scope:
- Overview of Amazon CloudWatch Metrics,
- Key Components,
- Types of Metrics,
- Metric Resolution,
- Storage and Retention,
- Alarms and Thresholds,
- Metric Math,
- Integration with Logs and Events,
- Best Practices,
- Architecture.
1. Overview of Amazon CloudWatch Metrics
- Amazon CloudWatch Metrics is a core feature of Amazon CloudWatch, AWS’s monitoring and observability service.
- Metrics are time-ordered data points representing the performance or health of AWS resources and applications.
- Each metric represents a specific aspect of a resource, e.g.:
  - CPU utilization,
  - Network throughput,
  - Request latency.
- Metrics are grouped into namespaces. Within a namespace, each metric is uniquely identified by its metric name and its dimensions.
2. Key Components
| Component | Description |
|---|---|
| Namespace | A container for metrics. Examples: AWS/EC2, AWS/S3, or custom namespaces like MyApp/Metrics. |
| Metric Name | Identifies the metric, e.g., CPUUtilization, Latency. |
| Dimensions | Key-value pairs that refine the metric. E.g., for EC2: InstanceId=i-1234567890abcdef0. |
| Data Points | Timestamped values for the metric. |
| Unit | Optional; indicates the unit of measurement: Seconds, Bytes, Percent, etc. |
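To illustrate how these components combine, here is a minimal sketch that models a metric's identity as namespace + metric name + dimensions. The `MetricId` class is a hypothetical helper written for this post, not part of any AWS SDK.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class MetricId:
    """Hypothetical helper: the identity of a single metric."""
    namespace: str
    name: str
    dimensions: Tuple[Tuple[str, str], ...] = ()

    @classmethod
    def of(cls, namespace: str, name: str,
           dimensions: Optional[Mapping[str, str]] = None) -> "MetricId":
        # Sort the key-value pairs so their order does not affect identity.
        dims = tuple(sorted((dimensions or {}).items()))
        return cls(namespace, name, dims)


cpu_a = MetricId.of("AWS/EC2", "CPUUtilization",
                    {"InstanceId": "i-1234567890abcdef0"})
cpu_b = MetricId.of("AWS/EC2", "CPUUtilization",
                    {"InstanceId": "i-0fedcba0987654321"})  # made-up instance ID

# Same namespace and name, different dimension value: two separate metrics.
print(cpu_a == cpu_b)
```

The point of the sketch is that changing any dimension value yields a different metric, which is why dimension choices directly affect how many metrics an application publishes.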
3. Types of Metrics
- AWS-Provided Metrics
  - Automatically collected for AWS resources.
  - Example: EC2 instances provide CPUUtilization, DiskReadOps, NetworkIn.
- Custom Metrics
  - twtech can publish its own application or business metrics using the CloudWatch API, SDK, or CLI.
  - Example: order count per minute, payment processing latency.
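A custom metric such as order count could be published roughly as follows. This is a sketch, assuming the boto3 SDK and configured AWS credentials; the namespace `MyApp/Metrics`, the metric name `OrderCount`, and the `Service` dimension are illustrative choices, and `build_metric_datum` and `publish` are hypothetical helper names.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Mapping, Optional


def build_metric_datum(name: str, value: float, unit: str = "None",
                       dimensions: Optional[Mapping[str, str]] = None,
                       high_resolution: bool = False,
                       timestamp: Optional[datetime] = None) -> Dict[str, Any]:
    """Build one entry for the MetricData list of PutMetricData."""
    return {
        "MetricName": name,
        "Dimensions": [{"Name": k, "Value": v}
                       for k, v in sorted((dimensions or {}).items())],
        "Timestamp": timestamp or datetime.now(timezone.utc),
        "Value": float(value),
        "Unit": unit,
        # 1 = high resolution (1-second), 60 = standard resolution.
        "StorageResolution": 1 if high_resolution else 60,
    }


def publish(namespace: str, datum: Dict[str, Any]) -> None:
    """Send the datum to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without it
    boto3.client("cloudwatch").put_metric_data(
        Namespace=namespace, MetricData=[datum])


datum = build_metric_datum("OrderCount", 42, unit="Count",
                           dimensions={"Service": "checkout"})
# publish("MyApp/Metrics", datum)  # uncomment to send to CloudWatch
```

Keeping the payload builder separate from the network call makes it easy to inspect or unit-test what would be sent before anything is published.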
4. Metric Resolution
- Standard Resolution: 1-minute granularity (default).
- High Resolution: granularity down to 1 second, available for custom metrics (useful for real-time monitoring and alerting).
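5. Storage and Retention
| Data Age | Finest Granularity Available | Retention |
|---|---|---|
| 0–3 hours | Sub-minute (high-resolution metrics only) | 3 hours |
| 0–15 days | 1 minute | 15 days |
| 15–63 days | 5 minutes | 63 days |
| 63–455 days | 1 hour | 455 days |
- As data ages, it is aggregated into the next coarser granularity rather than deleted, until it expires after 455 days (about 15 months).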
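The retention schedule can be expressed as a small lookup. The sketch below is a local illustration only (it does not call AWS); the function name `finest_period_seconds` is hypothetical. It also includes the 3-hour tier that applies to sub-minute, high-resolution data points.

```python
from datetime import timedelta
from typing import Optional

# (maximum data age, finest period in seconds available at that age)
RETENTION_TIERS = [
    (timedelta(hours=3), 1),      # high-resolution data points only
    (timedelta(days=15), 60),     # 1 minute
    (timedelta(days=63), 300),    # 5 minutes
    (timedelta(days=455), 3600),  # 1 hour
]


def finest_period_seconds(age: timedelta,
                          high_resolution: bool = False) -> Optional[int]:
    """Return the finest period still queryable for data of this age,
    or None if the data has expired."""
    for max_age, period in RETENTION_TIERS:
        if period < 60 and not high_resolution:
            continue  # sub-minute tier applies only to high-resolution metrics
        if age <= max_age:
            return period
    return None


print(finest_period_seconds(timedelta(days=30)))
```

A helper like this is handy when choosing the `Period` for a query over older data, since requesting a finer period than the tier allows returns no finer detail.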
6. Alarms and Thresholds
- CloudWatch allows twtech to create alarms that trigger when metrics breach thresholds.
- Alarm actions can include:
  - Sending SNS notifications
  - Auto Scaling actions
  - EC2 instance recovery
- Alarms evaluate a statistic of the metric against the threshold (average, sum, min, max, or a percentile).
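As a sketch of what such an alarm might look like, the code below builds the parameters for a CPU alarm that notifies an SNS topic, assuming boto3. The alarm name, the 80% threshold, the 3 x 5-minute evaluation window, and the topic ARN are illustrative values. The `is_breaching` function is a simplified local model of the evaluation (it ignores missing-data handling and "M out of N" settings), not CloudWatch's actual implementation.

```python
from typing import Any, Dict, Sequence


def build_cpu_alarm(instance_id: str, sns_topic_arn: str,
                    threshold: float = 80.0) -> Dict[str, Any]:
    """Build keyword arguments for the CloudWatch PutMetricAlarm call."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,             # seconds per evaluated datapoint
        "EvaluationPeriods": 3,    # consecutive datapoints to evaluate
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def is_breaching(datapoints: Sequence[float], threshold: float,
                 evaluation_periods: int) -> bool:
    """Simplified model: every one of the last N datapoints exceeds the threshold."""
    recent = datapoints[-evaluation_periods:]
    return (len(recent) == evaluation_periods
            and all(v > threshold for v in recent))


def create_alarm(params: Dict[str, Any]) -> None:
    """Create the alarm (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without it
    boto3.client("cloudwatch").put_metric_alarm(**params)


params = build_cpu_alarm(
    "i-1234567890abcdef0",
    "arn:aws:sns:us-east-1:123456789012:ops-alerts",  # hypothetical topic ARN
)
# create_alarm(params)  # uncomment to create the alarm
```

Requiring several consecutive breaching datapoints, rather than one, is a common way to avoid alarms on short spikes.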
7. Metric Math
CloudWatch Metric Math allows twtech to:
- Combine multiple metrics into a single expression.
- Perform operations like:
  - Sum, average, min, max
  - Rate of change
  - Percentiles
- Example: Compute total CPU utilization across multiple EC2 instances.
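The total-CPU example could be expressed as a set of metric data queries, sketched below under the assumption that boto3's `get_metric_data` is used to run them. The function name `total_cpu_queries`, the query IDs, and the instance IDs are illustrative.

```python
from typing import Any, Dict, List, Sequence


def total_cpu_queries(instance_ids: Sequence[str],
                      period: int = 300) -> List[Dict[str, Any]]:
    """Build MetricDataQueries: one query per instance plus a SUM expression."""
    queries: List[Dict[str, Any]] = []
    for i, instance_id in enumerate(instance_ids, start=1):
        queries.append({
            "Id": f"m{i}",  # IDs must start with a lowercase letter
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/EC2",
                    "MetricName": "CPUUtilization",
                    "Dimensions": [{"Name": "InstanceId",
                                    "Value": instance_id}],
                },
                "Period": period,
                "Stat": "Average",
            },
            "ReturnData": False,  # only the combined series is returned
        })
    expression = "SUM([" + ",".join(q["Id"] for q in queries) + "])"
    queries.append({
        "Id": "total_cpu",
        "Expression": expression,
        "Label": "Total CPU utilization",
        "ReturnData": True,
    })
    return queries


queries = total_cpu_queries(["i-aaaa1111", "i-bbbb2222"])  # made-up IDs
print(queries[-1]["Expression"])
```

The resulting list would be passed as `MetricDataQueries` to `get_metric_data`, together with `StartTime` and `EndTime`. The same expression can also back an alarm, so the alarm watches the combined series instead of a single instance.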
8. Integration with Logs and Events
- CloudWatch Logs → Can generate metrics from log data using metric filters.
- Amazon EventBridge (formerly CloudWatch Events) → Can trigger workflows when a CloudWatch alarm changes state, i.e., when a metric crosses its threshold.
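A metric filter that counts error lines might be defined roughly as follows. This is a sketch, assuming boto3's CloudWatch Logs client; the log group `/myapp/prod`, the filter name, and the metric name `ErrorCount` are illustrative, and `build_error_metric_filter` is a hypothetical helper.

```python
from typing import Any, Dict


def build_error_metric_filter(log_group: str) -> Dict[str, Any]:
    """Build keyword arguments for the CloudWatch Logs PutMetricFilter call."""
    return {
        "logGroupName": log_group,
        "filterName": "error-count",
        "filterPattern": "ERROR",  # matches log events containing this term
        "metricTransformations": [{
            "metricName": "ErrorCount",
            "metricNamespace": "MyApp/Metrics",
            "metricValue": "1",  # add 1 to the metric per matching event
        }],
    }


def create_filter(params: Dict[str, Any]) -> None:
    """Create the metric filter (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without it
    boto3.client("logs").put_metric_filter(**params)


filter_params = build_error_metric_filter("/myapp/prod")
# create_filter(filter_params)  # uncomment to create the filter
```

Once the filter exists, `ErrorCount` behaves like any other custom metric, so it can be graphed on a dashboard or used in an alarm.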
9. Best Practices
- Use meaningful namespaces and dimensions for easy filtering.
- Publish custom metrics sparingly (excess metrics increase cost).
- Leverage high-resolution metrics for latency-sensitive applications.
- Combine metric math with alarms to monitor trends instead of just thresholds.
- Integrate metrics with dashboards for visual monitoring.
10. Typical Architecture Using CloudWatch Metrics
Application / AWS Service
            |
    (Generate metrics)
            |
            v
Amazon CloudWatch Metrics
            |
+------------------------+
| Alarms                 |
| Dashboards             |
| Metric Math / Insights |
+------------------------+
            |
Notification / Auto-Scaling / Remediation