Amazon DocumentDB (with MongoDB compatibility) - Deep Dive.
Scope:
- Intro
- Architecture
- Features
- Security
- Performance
- Operations
- Integrations
- Use cases
- Amazon DocumentDB (with MongoDB compatibility) is a fast, scalable, highly available, and fully managed document database service provided by Amazon Web Services (AWS).
- Amazon DocumentDB is designed to be compatible with MongoDB APIs, allowing developers to use existing MongoDB applications, drivers, and tools with minimal or no code changes.
1. Overview
- Amazon DocumentDB is a fully managed document database service that is MongoDB-compatible (works with MongoDB APIs, drivers, and tools).
- Designed for scalability, availability, and durability of JSON-like document data.
- Ideal for workloads requiring semi-structured data, flexible schema, hierarchical data, and rapid iteration.
2. Architecture
- Storage & Compute Separation
  - Storage layer is distributed, fault-tolerant, SSD-backed, and automatically replicated across 3 AZs.
  - Compute layer (instances) handles query processing & caching.
- Cluster Components
  - Primary instance → handles reads & writes.
  - Replica instances (up to 15) → read scaling + failover targets.
  - Cluster volume → shared storage backend, 6-way replicated (2 copies in each AZ).
- Automatic Failover
  - If the primary fails, a replica is automatically promoted (failover in ~30s).
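As a minimal sketch of how an application might use this layout, the snippet below builds a MongoDB-style connection URI that targets the cluster endpoint, connects in replica set mode, and prefers replicas for reads. The function name, the credentials, and the endpoint are hypothetical placeholders; `replicaSet=rs0`, `readPreference=secondaryPreferred`, and `retryWrites=false` are the options commonly used with DocumentDB, but check them against your driver and cluster version.

```python
from urllib.parse import quote_plus, urlencode


def build_docdb_uri(user, password, cluster_endpoint, port=27017,
                    read_preference="secondaryPreferred"):
    """Build a connection URI for a DocumentDB cluster endpoint (illustrative helper)."""
    options = {
        "tls": "true",                      # encrypt traffic in transit
        "replicaSet": "rs0",                # connect in replica set mode
        "readPreference": read_preference,  # send reads to replicas when available
        "retryWrites": "false",             # DocumentDB does not support retryable writes
    }
    # Percent-encode credentials so characters such as "@" or "/" stay valid in the URI.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    return f"mongodb://{credentials}@{cluster_endpoint}:{port}/?{urlencode(options)}"


# Hypothetical endpoint and credentials, for illustration only.
uri = build_docdb_uri(
    "appuser",
    "p@ss/word",
    "docdb-example.cluster-abc123.us-east-1.docdb.amazonaws.com",
)
print(uri)
```

Connecting through the cluster endpoint in replica set mode lets the driver discover the current primary and replicas, so after a failover it can route writes to the newly promoted instance without a configuration change.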
3. Key Features
- MongoDB API compatibility (3.6, 4.0, 5.0 versions supported).
- Elastic scaling (scale reads via replicas, scale storage automatically up to 128 TiB).
- Elastic Clusters → sharded clusters for petabyte-scale workloads, with data distributed across shards by shard key.
- Global Clusters → cross-region read replicas for global apps.
- Continuous backup with PITR (Point-In-Time Recovery; the latest restorable time is typically within the last 5 minutes).
- Change Streams → event-driven integration (similar to MongoDB change streams).
- Monitoring → integration with CloudWatch, Performance Insights, and CloudTrail.
4. Security
- Encryption at Rest → AWS KMS managed keys.
- Encryption in Transit → TLS 1.2 support.
- Authentication & Access Control
  - IAM authentication.
  - Database users + roles.
  - Security groups, VPC isolation.
- Audit & Logging → CloudTrail, CloudWatch Logs, VPC Flow Logs.
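The encryption-in-transit and authentication settings above map onto a handful of driver options. The following is a sketch only: `docdb_client_options` is a hypothetical helper, and the CA bundle path is an assumption (AWS publishes a bundle named `global-bundle.pem`; where you store it is up to you). The keys are standard PyMongo `MongoClient` keyword arguments.

```python
def docdb_client_options(ca_file="global-bundle.pem", use_iam=False):
    """Return MongoClient keyword arguments for a TLS connection (illustrative helper)."""
    options = {
        "tls": True,           # require TLS for every connection
        "tlsCAFile": ca_file,  # assumed local path to the Amazon CA bundle
        "retryWrites": False,  # DocumentDB does not support retryable writes
    }
    if use_iam:
        # IAM authentication uses the MONGODB-AWS mechanism against $external,
        # so no database password is passed to the driver.
        options["authMechanism"] = "MONGODB-AWS"
        options["authSource"] = "$external"
    return options


password_auth = docdb_client_options()
iam_auth = docdb_client_options(use_iam=True)
print(password_auth)
print(iam_auth)
```

With PyMongo installed, these could be passed as `MongoClient(endpoint, username=..., password=..., **password_auth)` for database users, or `MongoClient(endpoint, **iam_auth)` for IAM authentication; the latter needs the `pymongo[aws]` extra and AWS credentials available in the environment.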
5. Performance
- Designed for high throughput & low latency with distributed storage architecture.
- Replication is handled by the SSD-backed storage layer, which keeps replication I/O off the instances.
- Read scaling via up to 15 replicas.
- Connection pooling → use the MongoDB driver's built-in connection pool (Amazon RDS Proxy does not support DocumentDB).
- Best suited for read-heavy, document-based workloads.
- Less efficient for high-write workloads compared to DynamoDB.
6. Operations & Management
- Fully managed (patching, backups, upgrades handled by AWS).
- Backups & PITR → continuous snapshots with 1–35 day retention.
- Monitoring → CloudWatch metrics, the profiler (slow-operation logging), Performance Insights.
- Automation → CloudFormation, Terraform, CDK, CLI.
- Migration Tools
  - AWS Database Migration Service (DMS).
  - MongoDB dump/restore tools.
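To make the PITR and retention rules concrete, here is a small sketch that prepares the arguments for a point-in-time restore and rejects timestamps outside the retention window. The helper name, the cluster identifiers, and the dates are hypothetical; the dictionary keys follow the parameters of boto3's `docdb` client method `restore_db_cluster_to_point_in_time`.

```python
from datetime import datetime, timedelta, timezone


def pitr_restore_params(source_cluster, new_cluster, restore_to=None,
                        retention_days=35, now=None):
    """Build restore arguments, checking the 1-35 day retention window (illustrative)."""
    now = now or datetime.now(timezone.utc)
    if not 1 <= retention_days <= 35:
        raise ValueError("backup retention must be between 1 and 35 days")
    params = {
        "DBClusterIdentifier": new_cluster,          # a restore always creates a new cluster
        "SourceDBClusterIdentifier": source_cluster,
    }
    if restore_to is None:
        # No timestamp given: ask for the most recent restorable point.
        params["UseLatestRestorableTime"] = True
        return params
    if restore_to > now or restore_to < now - timedelta(days=retention_days):
        raise ValueError("restore time is outside the backup retention window")
    params["RestoreToTime"] = restore_to
    return params


# Hypothetical cluster names and a fixed clock so the example is reproducible.
fixed_now = datetime(2024, 1, 31, 12, 0, tzinfo=timezone.utc)
params = pitr_restore_params(
    "prod-cluster", "prod-cluster-restored",
    restore_to=fixed_now - timedelta(hours=3),
    retention_days=7, now=fixed_now,
)
print(params)
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("docdb").restore_db_cluster_to_point_in_time(**params)`. The window check here is only a local approximation; the service remains the source of truth for the earliest and latest restorable times.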
7. Integrations
- Event-driven → integrate Change Streams with Lambda, SQS, Kinesis.
- Analytics → integrate with Athena, Glue, QuickSight via federation.
- ETL & Migration → AWS DMS, Data Pipeline, Glue.
- DevOps → IaC (Terraform, CDK), CI/CD pipelines.
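For the event-driven case, a consumer usually inspects each change event and decides what to forward downstream. The sketch below assumes events shaped like MongoDB change stream documents (`operationType`, `ns`, `documentKey`); the function name and the routing buckets are illustrative choices, not part of any AWS API.

```python
from collections import defaultdict


def route_change_events(events):
    """Group change events by the downstream action they imply (illustrative)."""
    routed = defaultdict(list)
    for event in events:
        op = event.get("operationType")
        ns = event.get("ns", {})
        target = f"{ns.get('db')}.{ns.get('coll')}"
        if op in ("insert", "update", "replace"):
            # The document was created or changed: upsert it downstream.
            routed["upserts"].append((target, event["documentKey"]["_id"]))
        elif op == "delete":
            routed["deletes"].append((target, event["documentKey"]["_id"]))
        else:
            # Anything else (for example "drop") is kept for manual handling.
            routed["other"].append((target, op))
    return dict(routed)


# Hand-written sample events, for illustration only.
sample_events = [
    {"operationType": "insert", "ns": {"db": "shop", "coll": "orders"},
     "documentKey": {"_id": 1}, "fullDocument": {"_id": 1, "total": 20}},
    {"operationType": "delete", "ns": {"db": "shop", "coll": "orders"},
     "documentKey": {"_id": 2}},
    {"operationType": "drop", "ns": {"db": "shop", "coll": "orders"}},
]
routed = route_change_events(sample_events)
print(routed)
```

In a real consumer the events would come from a driver cursor such as PyMongo's `collection.watch()`, or from a Lambda invocation, rather than a hand-written list. Change streams have to be enabled on the cluster before they emit events.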
8. Use Cases
- Content Management Systems (CMS) → dynamic JSON storage.
- Catalogs & User Profiles → flexible schema for e-commerce apps.
- IoT & Event Data → ingesting JSON device telemetry.
- Mobile & Gaming Apps → store user/game state in flexible documents.
- Knowledge Graphs / Metadata → store hierarchical metadata with nested JSON.
9. Best Practices
- Use replicas for reads, don’t overload the primary.
- Right-size instance classes and scale them up or down as load changes (storage scales automatically).
- Choose Elastic Clusters for large, sharded workloads.
- Use indexes wisely (avoid large unbounded arrays).
- Enable IAM auth + TLS for security.
- Monitor via Performance Insights to tune queries.