Amazon CloudWatch Container Insights Service - Overview.
Scope:
- Intro,
- Key Features & Capabilities,
- Enabling Container Insights,
- Links to official documentation,
- The concept: CloudWatch Container Insights,
- How Container Insights Works,
- Data Collection,
- Metrics Processing,
- Storage & Visualization,
- Alarms & Automation,
- Data Collected Samples,
- Security & Permissions,
- Advanced Use Cases,
- Architecture Diagram,
- Final thoughts.
Intro:
- Amazon CloudWatch Container Insights is a monitoring and observability service that collects, aggregates, and summarizes metrics and logs from containerized applications and microservices.
- Amazon CloudWatch Container Insights Service provides visibility into the performance and health of container management services, including:
  - Amazon Elastic Container Service (Amazon ECS)
  - Amazon Elastic Kubernetes Service (Amazon EKS)
  - AWS Fargate
  - Standalone Kubernetes.
Key Features & Capabilities:
- Automatic Dashboards: Container Insights provides pre-built, automatic dashboards that display key metrics such as CPU utilization, memory usage, network data, and disk I/O at various levels (cluster, node, pod, task, and service).
- Troubleshooting and Diagnostics: It helps in isolating issues quickly by providing diagnostic information, such as container restart failures.
- Enhanced Observability: Recent updates offer enhanced observability features, including new landing pages and the ability to view resource consumption sorted by different parameters, aiding in proactive risk mitigation.
- Log and Metric Collection: Operational data is collected as performance log events using the embedded metric format, from which CloudWatch then generates aggregated metrics.
- Integration with OpenTelemetry: AWS Distro for OpenTelemetry (ADOT) can be used to collect system metrics, providing a secure and supported distribution of APIs and agents for unified observability.
- Alarms and Notifications: Users can set CloudWatch alarms on the collected metrics to notify teams of performance issues or resource constraints.
Enabling Container Insights:
- twtech can enable Container Insights through the AWS console, AWS CLI, CloudFormation, or CDK.
- For EKS and Kubernetes clusters, Container Insights is often set up quickly; on EKS this is typically done with the Amazon CloudWatch Observability EKS add-on.
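As a rough illustration of a programmatic route for ECS, the sketch below turns on the `containerInsights` cluster setting through the ECS `update_cluster_settings` API. It assumes a boto3 ECS client is passed in by the caller; the cluster name `demo-cluster` and the helper names are placeholders invented for this example.

```python
def build_cluster_settings(enabled=True):
    """Return the ECS cluster settings list that toggles Container Insights."""
    value = "enabled" if enabled else "disabled"
    return [{"name": "containerInsights", "value": value}]


def enable_container_insights(ecs_client, cluster_name):
    """Enable Container Insights on an existing ECS cluster.

    `ecs_client` is assumed to be a boto3 ECS client,
    for example boto3.client("ecs").
    """
    return ecs_client.update_cluster_settings(
        cluster=cluster_name,
        settings=build_cluster_settings(True),
    )


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   enable_container_insights(boto3.client("ecs"), "demo-cluster")
```

Passing the client in, rather than creating it inside the function, keeps the helper easy to exercise with a stub before pointing it at a real account.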
Links to official documentation:
- https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-setup-EKS-quickstart.html
- https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/ContainerInsights.html
The concept: CloudWatch Container Insights
- CloudWatch Container Insights is an observability feature that automatically collects, aggregates, and summarizes metrics and logs from containerized applications and microservices running on:
  - Amazon ECS (EC2 & Fargate)
  - Amazon EKS
  - Kubernetes (self-managed on EC2).
NB:
- CloudWatch Container Insights Service helps twtech to monitor, troubleshoot, and optimize container workloads without building custom telemetry pipelines.
How Container Insights Works
1. Data Collection
- Uses the CloudWatch agent for metrics and Fluent Bit for logs, deployed as:
  - DaemonSets on EKS/Kubernetes
  - A daemon service on ECS container instances for instance-level metrics; task- and service-level metrics on ECS (including Fargate) are collected by ECS itself once Container Insights is enabled on the cluster
- Collects:
  - Performance Metrics: CPU, memory, disk, and network per container/pod/service/node/cluster
  - Logs: stdout/stderr logs, kube-system logs
  - Diagnostic Data: container restart counts, OOM events, throttling
2. Metrics Processing
- Metrics are aggregated at multiple levels:
  - Cluster
  - Node/EC2 instance
  - Pod/Task
  - Container
  - Service/Namespace
- Automatically published to the CloudWatch Metrics namespace ContainerInsights (ECS/ContainerInsights for Amazon ECS), with the cluster name carried as the ClusterName dimension.
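The performance log events that Container Insights turns into metrics use the CloudWatch embedded metric format (EMF). The sketch below builds a simplified event in the shape described by the public EMF specification, to show how a metric value, its dimension, and its namespace travel together in one JSON log line. The events a real agent emits carry many more fields; the function name and sample values here are invented for illustration.

```python
import json
import time


def build_emf_event(cluster_name, metric_name, value, unit="Percent",
                    namespace="ContainerInsights", timestamp_ms=None):
    """Build a minimal EMF-style log event with one metric and one dimension."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    return {
        # The "_aws" metadata block tells CloudWatch which top-level
        # fields to extract as metrics and which to use as dimensions.
        "_aws": {
            "Timestamp": timestamp_ms,
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [["ClusterName"]],
                    "Metrics": [{"Name": metric_name, "Unit": unit}],
                }
            ],
        },
        # Dimension value and metric value live at the top level.
        "ClusterName": cluster_name,
        metric_name: value,
    }


# Illustrative values only, not real measurements.
event = build_emf_event("demo-cluster", "node_cpu_utilization", 42.5,
                        timestamp_ms=1700000000000)
print(json.dumps(event, indent=2))
```

Because the metric is declared inside the log event itself, the same record can be searched as a log and graphed as a metric without a separate publishing call.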
3. Storage & Visualization
- Metrics → CloudWatch Metrics
- Logs → CloudWatch Logs
- Logs can be queried using CloudWatch Logs Insights
- Dashboards auto-created in CloudWatch for:
  - Cluster overview
  - Nodes
  - Pods/Tasks
  - Services
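To make the Logs Insights step concrete, here is a minimal sketch that starts a query for the busiest nodes over the last hour. It assumes an EKS/Kubernetes cluster whose performance events land in the `/aws/containerinsights/<cluster>/performance` log group, and a boto3 CloudWatch Logs client supplied by the caller; the helper name and the exact query text are illustrative choices, not a prescribed query.

```python
import time

# Log group used for Container Insights performance events on EKS/Kubernetes.
PERFORMANCE_LOG_GROUP = "/aws/containerinsights/{cluster}/performance"

# Logs Insights query: average node CPU utilization, highest first.
NODE_CPU_QUERY = (
    'filter Type = "Node" '
    "| stats avg(node_cpu_utilization) as avg_cpu by NodeName "
    "| sort avg_cpu desc "
    "| limit 10"
)


def start_node_cpu_query(logs_client, cluster_name, lookback_seconds=3600, now=None):
    """Start a Logs Insights query and return the start_query response.

    `logs_client` is assumed to be a boto3 client, e.g. boto3.client("logs").
    `now` (epoch seconds) can be supplied to make the time window deterministic.
    """
    end = int(now if now is not None else time.time())
    return logs_client.start_query(
        logGroupName=PERFORMANCE_LOG_GROUP.format(cluster=cluster_name),
        startTime=end - lookback_seconds,
        endTime=end,
        queryString=NODE_CPU_QUERY,
    )


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   logs = boto3.client("logs")
#   query_id = start_node_cpu_query(logs, "demo-cluster")["queryId"]
#   results = logs.get_query_results(queryId=query_id)
```

Logs Insights queries run asynchronously, so a real caller would poll `get_query_results` until the reported status is no longer running.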
4. Alarms & Automation
- Create CloudWatch Alarms on container metrics (e.g., CPU > 80%, OOMKills > 0).
- Integrates with EventBridge for automated responses (scaling, healing, notifications).
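The "CPU > 80%" alarm mentioned above could be sketched as follows. The example builds the parameters for the CloudWatch `put_metric_alarm` API against the `node_cpu_utilization` metric in the `ContainerInsights` namespace (the EKS/Kubernetes case). The alarm name, period, evaluation count, and the optional SNS topic ARN are assumptions chosen for illustration.

```python
def build_node_cpu_alarm(cluster_name, threshold_percent=80.0, sns_topic_arn=None):
    """Return put_metric_alarm parameters for a cluster-wide node CPU alarm."""
    params = {
        "AlarmName": f"{cluster_name}-node-cpu-high",
        "Namespace": "ContainerInsights",
        "MetricName": "node_cpu_utilization",
        "Dimensions": [{"Name": "ClusterName", "Value": cluster_name}],
        "Statistic": "Average",
        "Period": 300,              # seconds per datapoint
        "EvaluationPeriods": 3,     # three consecutive 5-minute datapoints
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "missing",
    }
    if sns_topic_arn:
        # Notify a team through SNS when the alarm fires.
        params["AlarmActions"] = [sns_topic_arn]
    return params


def create_alarm(cloudwatch_client, params):
    """Create or update the alarm; expects a boto3 CloudWatch client."""
    return cloudwatch_client.put_metric_alarm(**params)


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   create_alarm(boto3.client("cloudwatch"), build_node_cpu_alarm("demo-cluster"))
```

Requiring several consecutive breaching datapoints is one way to avoid paging on short CPU spikes; the right window depends on the workload.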
Data Collected Samples
| Level | Metrics Examples |
|---|---|
| Cluster | CPU reservation %, memory reservation %, pod count, service count |
| Node | CPU utilization, memory utilization, disk I/O, network traffic, running pods |
| Pod/Task | CPU/memory limits vs requests, restarts, OOM events |
| Container | CPU usage (cores, %), memory (RSS, working set), network bytes in/out |
| Service | Aggregated CPU/memory usage across replicas |
Security & Permissions
- IAM permissions required for:
  - Writing to CloudWatch Logs
  - Publishing metrics to CloudWatch
- Best practice: use least privilege IAM roles for service accounts (IRSA in EKS).
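A least-privilege policy for the two permissions listed above might look like the sketch below. It is an illustration only and has not been checked against a running agent, which may need further actions; the AWS managed policy CloudWatchAgentServerPolicy is the usual starting point. The region, account ID, and cluster name are placeholders, and scoping log access to the `/aws/containerinsights/<cluster>/` log groups is an assumption of this example.

```python
import json


def build_agent_policy(region, account_id, cluster_name):
    """Return an IAM policy document (as a dict) scoped to one cluster's log groups."""
    log_resources = (
        f"arn:aws:logs:{region}:{account_id}:"
        f"log-group:/aws/containerinsights/{cluster_name}/*"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "WriteContainerInsightsLogs",
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                    "logs:DescribeLogStreams",
                ],
                "Resource": log_resources,
            },
            {
                "Sid": "PublishMetrics",
                "Effect": "Allow",
                "Action": ["cloudwatch:PutMetricData"],
                # PutMetricData does not support resource-level scoping.
                "Resource": "*",
            },
        ],
    }


# Placeholder identifiers for illustration.
policy = build_agent_policy("us-east-1", "111122223333", "demo-cluster")
print(json.dumps(policy, indent=2))
```

On EKS, a policy like this would be attached to the IAM role that the agent's Kubernetes service account assumes through IRSA, rather than to the node instance role.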
Advanced Use Cases
- Performance Troubleshooting
  - Find "noisy neighbors" in multi-tenant clusters
  - Detect throttled pods (CPU throttling metrics)
  - Spot OOM-killed containers
- Cost Optimization
  - Identify underutilized nodes or over-provisioned requests/limits
  - Right-size pods and tasks to reduce EC2/Fargate costs
- Scaling Decisions
  - Feed Container Insights metrics into Application Auto Scaling
  - Custom autoscaling policies (beyond CPU/memory)
- Compliance & Auditing
  - Centralized logs across all namespaces/clusters
  - Integrate with CloudWatch Contributor Insights to surface top contributors, and with CloudWatch anomaly detection to flag unusual metric behavior
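The right-sizing idea under Cost Optimization comes down to comparing observed usage against requested capacity. The sketch below flags pods whose average CPU usage is a small fraction of their CPU request. The input records, field names, and the 30% cut-off are all hypothetical; in practice the usage figures would come from Container Insights metrics or performance log events.

```python
def find_overprovisioned(pods, max_utilization=0.3):
    """Return (pod name, usage/request ratio) for pods using little of their CPU request.

    Results are sorted with the most over-provisioned pod first.
    """
    flagged = []
    for pod in pods:
        request = pod["cpu_request_millicores"]
        if request <= 0:
            continue  # no request set, so no ratio can be computed
        ratio = pod["avg_cpu_millicores"] / request
        if ratio < max_utilization:
            flagged.append((pod["name"], round(ratio, 2)))
    return sorted(flagged, key=lambda item: item[1])


# Made-up sample data for illustration only.
sample = [
    {"name": "checkout", "cpu_request_millicores": 1000, "avg_cpu_millicores": 120},
    {"name": "search", "cpu_request_millicores": 500, "avg_cpu_millicores": 400},
    {"name": "batch", "cpu_request_millicores": 2000, "avg_cpu_millicores": 500},
]
print(find_overprovisioned(sample))  # → [('checkout', 0.12), ('batch', 0.25)]
```

A single average hides bursts, so a real right-sizing pass would likely also look at a high percentile of usage before lowering any request.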
Architecture Diagram
[Cluster (EKS/ECS)]
→ CloudWatch Agent / Fluent Bit DaemonSet (collects metrics & logs)
→ CloudWatch Metrics (receives metrics) and CloudWatch Logs (receives logs)
→ CloudWatch Dashboards, Alarms, Logs Insights
→ EventBridge, Lambda, Auto Scaling (integration).
Final thoughts:
- CloudWatch Container Insights provides out-of-the-box observability for containerized workloads.
- It requires minimal setup and is tightly integrated with AWS monitoring and automation.