Amazon DynamoDB - Deep Dive.
Scope:
- The concept: DynamoDB,
- Core Architecture,
- Data Model,
- Storage & Scaling,
- Consistency,
- Replication,
- Performance Model,
- Advanced Features,
- Best Practices,
- Real-World Use Cases,
- Sample – High-Traffic Production Architecture,
- Creating Amazon DynamoDB tables & integrating with other AWS services (UI / CLI),
- Integrate DynamoDB with Other AWS Services,
- Sample policy for Lambda,
- When to Use DynamoDB & when not to use DynamoDB.
- Amazon DynamoDB is a fully managed, serverless NoSQL, key-value and document database service provided by Amazon Web Services (AWS).
- Amazon DynamoDB is designed for single-digit millisecond performance at any scale and offers built-in security, continuous backups, automated multi-Region replication, and in-memory caching.
1. The concept: DynamoDB
- Amazon DynamoDB is a fully managed, serverless NoSQL database service designed for high-performance applications at scale.
It provides:
- Single-digit millisecond latency at any scale.
- Automatic scaling of throughput & storage.
- Multi-Region replication for global applications.
- Event-driven integration (via DynamoDB Streams + Lambda).
- Pay-per-request or provisioned pricing models.
It’s based on Amazon’s internal "Dynamo" paper (2007), which powered Amazon.com’s shopping cart and inspired many modern distributed databases.
2. Core Architecture
- Tables → Items → Attributes
- Items are schemaless (can vary by row).
- Each item is identified by a Primary Key:
- Partition Key (PK) → Determines item’s placement.
- Partition Key + Sort Key (Composite Key) → Allows range queries.
- Data is partitioned across multiple servers.
- Partitioning is based on a hash of the partition key.
- Auto-scaling adjusts partitions when storage or throughput grows.
- Supports eventual consistency (default) and strong consistency (optional, per-request).
- Data is replicated 3x within a Region (across AZs).
- Global Tables replicate across Regions for multi-region apps.
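The hash-based placement described above can be sketched in a few lines of Python. This is a simplified model: the real service uses its own internal hashing and partition maps, so the choice of MD5 and the `num_partitions` parameter here are illustrative assumptions, not DynamoDB internals.

```python
import hashlib

def assign_partition(partition_key: str, num_partitions: int) -> int:
    """Map a partition key to one of num_partitions storage partitions
    by hashing the key (simplified model of DynamoDB's placement)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

# Items with the same partition key always land on the same partition;
# distinct keys spread (roughly) evenly across partitions.
print(assign_partition("twtech123", 4))
print(assign_partition("twtech123", 4))  # same key -> same partition
```

This is also why a well-distributed partition key matters: every item sharing one key value maps to the same partition, concentrating load there.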
3. Performance Model
- Single-digit millisecond response time.
- Performance depends on Read/Write Capacity Units (RCUs/WCUs):
- 1 RCU = 1 strongly consistent read/sec (up to 4KB item).
- 1 WCU = 1 write/sec (up to 1KB item).
- On-Demand mode (pay-per-request, no capacity planning).
- Provisioned mode (reserve throughput with optional auto-scaling).
- DAX (DynamoDB Accelerator) → In-memory caching for microsecond latency.
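The RCU/WCU arithmetic above can be captured in a small helper, useful when sizing a provisioned-mode table. This is a sketch of the documented rounding rules (reads round up to 4 KB units, writes to 1 KB units, and eventually consistent reads cost half an RCU):

```python
import math

def rcus_needed(item_size_kb: float, reads_per_sec: int,
                strongly_consistent: bool = True) -> int:
    """One RCU = one strongly consistent read/sec of up to 4 KB.
    Eventually consistent reads cost half as much."""
    units_per_read = math.ceil(item_size_kb / 4)
    total = units_per_read * reads_per_sec
    return total if strongly_consistent else math.ceil(total / 2)

def wcus_needed(item_size_kb: float, writes_per_sec: int) -> int:
    """One WCU = one write/sec of up to 1 KB."""
    return math.ceil(item_size_kb) * writes_per_sec

print(rcus_needed(6, 10))    # 6 KB items, 10 strong reads/sec -> 20
print(wcus_needed(2.5, 10))  # 2.5 KB items, 10 writes/sec -> 30
```

In On-Demand mode none of this planning is needed; the same unit sizes still drive the per-request price.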
4. Advanced Features
- DynamoDB Streams → Change data capture (CDC) for real-time triggers (integrates with Lambda, Kinesis).
- TTL (Time-to-Live) → Auto-expire items.
- Transactions → ACID across multiple items & tables.
- Indexes
- Global Secondary Index (GSI): Alternate queryable key across partitions.
- Local Secondary Index (LSI): Query on different sort key (must share same partition key).
- Global Tables → Multi-region active-active replication.
- Point-in-time recovery (PITR) → Continuous backups (35 days).
- Fine-grained security with IAM + encryption at rest.
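TTL works by storing a Unix epoch timestamp in a designated attribute; DynamoDB deletes the item some time after that timestamp passes. A minimal sketch of computing that expiry value (the attribute name `expireAt` is an assumption here — it is whatever name you configure on the table):

```python
import time

def item_with_ttl(user_id: str, ttl_seconds: int) -> dict:
    """Build an item carrying a TTL attribute set ttl_seconds in the
    future. DynamoDB expects the TTL attribute to be a Number holding
    a Unix epoch timestamp in seconds."""
    return {
        "UserID": user_id,
        "expireAt": int(time.time()) + ttl_seconds,
    }

session = item_with_ttl("twtech123", 3600)  # expire roughly an hour out
```

Expired items are removed in the background, typically within a few days of expiry, so queries should still filter on the TTL attribute if exact cutoffs matter.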
5. Best Practices
- Choose a good partition key to avoid “hot partitions.”
- Example: Instead of userId, use userId#timestamp to spread writes.
- Use GSIs for flexible querying, but design them carefully (indexes cost capacity).
- Enable on-demand mode if workload is unpredictable.
- Leverage Streams + Lambda for event-driven microservices.
- Use DAX if twtech needs sub-millisecond response times.
- Avoid large item sizes (limit: 400KB per item).
- BatchWriteItem / BatchGetItem to optimize throughput.
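BatchWriteItem accepts at most 25 put/delete requests per call, so callers chunk their items before sending. A sketch of that chunking step (boto3's `Table.batch_writer()` handles this, plus retrying unprocessed items, automatically):

```python
def chunk_for_batch_write(items: list, batch_size: int = 25) -> list:
    """Split items into groups no larger than the BatchWriteItem
    limit of 25 requests per call."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

batches = chunk_for_batch_write([{"UserID": f"u{i}"} for i in range(60)])
print(len(batches))      # 60 items -> 3 batches (25 + 25 + 10)
print(len(batches[-1]))  # 10
```

Each batch then becomes one `batch_write_item` call, cutting round trips versus 60 individual PutItem requests.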
6. Real-World Use Cases
- E-commerce → Shopping carts, product catalogs, inventory.
- Gaming → Player session state, leaderboards.
- IoT → Device telemetry ingestion at scale.
- Finance → Fraud detection, transaction logs.
- Social Media → User profiles, activity feeds.
- Serverless apps → Backend for Lambda + API Gateway.
7. Sample – High-Traffic Production Architecture
Imagine a global ride-sharing app:
- DynamoDB (Global Tables) → Stores user profiles, trips, and real-time driver availability.
- DAX → Accelerates location queries (sub-ms).
- Streams + Lambda → Update trip status, trigger billing workflows.
- API Gateway + Lambda → Expose REST API to mobile apps.
- Kinesis Firehose + S3 → Archive trip history for analytics.
- CloudWatch + Auto-scaling → Manage costs & performance.
This architecture ensures:
- Low-latency lookups (who’s the nearest driver?).
- Global consistency (multi-region replication).
- Event-driven workflows (trip completion triggers invoice generation).
8. Creating Amazon DynamoDB & Integrating with other AWS services
NB:
- It is a common requirement in cloud applications for building scalable, serverless, or event-driven architectures.
Here’s twtech’s detailed step-by-step guide:
1. Create a DynamoDB Table
- Via AWS Management Console
- Go to DynamoDB in the AWS console.
- Click Create table.
- Provide:
- Table name (e.g., twtech-Users)
- Primary key: Partition key (PK) and optional sort key (SK)
- Optionally configure:
- Read/write capacity (Provisioned or On-demand)
- Encryption
- TTL (Time-to-Live) for automatic record deletion
- Click Create table.
- Via AWS CLI
# bash
aws dynamodb create-table \
  --table-name twtech-Users \
  --attribute-definitions AttributeName=twtech-UserID,AttributeType=S \
  --key-schema AttributeName=twtech-UserID,KeyType=HASH \
  --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5
- Via AWS SDK (Python Example with boto3)
import boto3

dynamodb = boto3.resource('dynamodb')

table = dynamodb.create_table(
    TableName='twtech-Users',
    KeySchema=[
        {'AttributeName': 'UserID', 'KeyType': 'HASH'}
    ],
    AttributeDefinitions=[
        {'AttributeName': 'UserID', 'AttributeType': 'S'}
    ],
    BillingMode='PAY_PER_REQUEST'
)

table.wait_until_exists()
print("twtech Table created successfully")
2. Integrate DynamoDB with Other AWS Services
A. Lambda (Serverless Function)
- Use Lambda to read/write data in DynamoDB.
- Example: Trigger a Lambda function when an API Gateway request comes in.
Steps:
- Create a Lambda function in the AWS console.
- Attach IAM permissions allowing dynamodb:GetItem, PutItem, Scan, etc.
- Use SDK in twtech Lambda code:
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('twtech-Users')

def lambda_handler(event, context):
    response = table.put_item(
        Item={
            'UserID': 'twtech123',
            'Name': 'twtechuser-pat'
        }
    )
    return response
B. API Gateway
- Create REST or HTTP APIs that interact with DynamoDB through Lambda.
- Use Lambda Proxy Integration:
- API Gateway receives HTTP request.
- Triggers Lambda function.
- Lambda performs DynamoDB operation and returns response.
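With Lambda proxy integration, the function must hand back a response object in the exact shape API Gateway expects: a `statusCode`, optional `headers`, and a `body` that is a JSON string rather than a dict. A minimal sketch of that shaping step:

```python
import json

def proxy_response(status: int, payload: dict) -> dict:
    """Wrap a payload in the Lambda proxy integration response format.
    API Gateway requires body to be a string, not a dict."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }

resp = proxy_response(200, {"UserID": "twtech123", "Name": "twtechuser-pat"})
```

Returning a raw dict without this wrapper is a common cause of API Gateway 502 "malformed Lambda proxy response" errors.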
C. S3 (Event-driven Data Ingestion)
- twtech can stream new S3 objects to DynamoDB via Lambda:
- Configure S3 Event Notifications to trigger a Lambda function.
- Lambda reads the file content and stores relevant data in DynamoDB.
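The Lambda in this flow first pulls the bucket and key out of the S3 event notification before fetching the object. A sketch of that parsing step (the event shape follows the documented S3 notification format; the bucket name `twtech-uploads` below is an assumed example):

```python
def objects_from_s3_event(event: dict) -> list:
    """Extract (bucket, key) pairs from an S3 event notification payload."""
    return [
        (r["s3"]["bucket"]["name"], r["s3"]["object"]["key"])
        for r in event.get("Records", [])
    ]

sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "twtech-uploads"},
                "object": {"key": "users/new.json"}}}
    ]
}
print(objects_from_s3_event(sample_event))  # [('twtech-uploads', 'users/new.json')]
```

With bucket and key in hand, the handler would call `s3.get_object`, parse the content, and `put_item` the relevant fields into DynamoDB.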
D. DynamoDB Streams
- DynamoDB Streams capture data changes (insert, modify, delete).
- twtech can integrate streams with:
- Lambda – to process changes in real-time.
- Kinesis Data Firehose – to deliver DynamoDB changes to S3, Redshift, or Elasticsearch.
Sample with Lambda Trigger:
def lambda_handler(event, context):
    for record in event['Records']:
        if record['eventName'] == 'INSERT':
            new_data = record['dynamodb']['NewImage']
            print("New record added:", new_data)
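Note that `NewImage` arrives in DynamoDB's typed JSON (`{'S': ...}`, `{'N': ...}`), so handlers usually flatten it before use. A minimal sketch covering the common scalar types — boto3's `TypeDeserializer` does this more completely:

```python
def flatten_image(image: dict) -> dict:
    """Convert DynamoDB typed JSON (e.g. {'S': 'x'}, {'N': '42'})
    into a plain Python dict. Handles S, N, and BOOL only."""
    out = {}
    for name, typed in image.items():
        (dynamo_type, value), = typed.items()
        if dynamo_type == "N":  # numbers arrive as strings
            value = float(value) if "." in value else int(value)
        out[name] = value
    return out

image = {"UserID": {"S": "twtech123"}, "Visits": {"N": "42"},
         "Active": {"BOOL": True}}
print(flatten_image(image))  # {'UserID': 'twtech123', 'Visits': 42, 'Active': True}
```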
E. Step Functions
- Use AWS Step Functions for orchestrating workflows with DynamoDB:
- Example: Validate order → Store in DynamoDB → Send notification.
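That validate → store → notify flow can be expressed in Amazon States Language using Step Functions' direct DynamoDB and SNS integrations. A hedged sketch — the state names, validator Lambda ARN, SNS topic, and `$.userId` input path are placeholders, not values from this setup:

```json
{
  "StartAt": "ValidateOrder",
  "States": {
    "ValidateOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-2:accountID:function:validate-order",
      "Next": "StoreOrder"
    },
    "StoreOrder": {
      "Type": "Task",
      "Resource": "arn:aws:states:::dynamodb:putItem",
      "Parameters": {
        "TableName": "twtech-Users",
        "Item": {"UserID": {"S.$": "$.userId"}}
      },
      "Next": "SendNotification"
    },
    "SendNotification": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sns:publish",
      "Parameters": {
        "TopicArn": "arn:aws:sns:us-east-2:accountID:order-events",
        "Message.$": "$.userId"
      },
      "End": true
    }
  }
}
```

Note the `StoreOrder` state writes to DynamoDB directly via the optimized service integration, with no Lambda in between.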
F. CloudWatch
- Monitor DynamoDB metrics (Read/Write capacity, Throttled requests).
- Use CloudWatch Alarms to trigger Lambda or SNS notifications on thresholds.
G. EventBridge
- Stream DynamoDB Streams to EventBridge.
- Useful for decoupled event-driven architectures.
H. AWS AppSync
- Integrate DynamoDB as a backend for GraphQL APIs.
- AppSync handles queries, mutations, and subscriptions automatically.
I. Glue / Redshift / Athena
- For analytics:
- Use Glue ETL to extract DynamoDB data.
- Query DynamoDB via Athena using the DynamoDB connector.
- Load data to Redshift for advanced analytics.
3. IAM Permissions
- Ensure the service accessing DynamoDB has the right IAM role.
- Sample policy for Lambda:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:Scan",
        "dynamodb:UpdateItem"
      ],
      "Resource": "arn:aws:dynamodb:us-east-2:accountID:table/twtech-Users"
    }
  ]
}
Key takeaway:
Create table → Write/Read with Lambda → Trigger via API Gateway, S3, or Streams → Orchestrate with Step Functions → Monitor with CloudWatch.
NB:- DynamoDB integrates seamlessly with:
- Serverless,
- Event-driven,
- Analytics pipelines across AWS.
9. When to Use DynamoDB & When Not to Use DynamoDB
When to Use DynamoDB:
- Need millisecond performance at scale.
- Unpredictable workloads (spiky traffic).
- Event-driven apps (serverless).
- Require global distribution.
When Not to Use DynamoDB:
- Need complex queries (joins, aggregations → better in RDS or Redshift).
- Data fits strict relational models.
- Need large item sizes (>400KB).