Friday, August 29, 2025

Amazon Keyspaces (for Apache Cassandra) | Deep Dive.


 

Scope:

  • Intro,
  • The Concept of Amazon Keyspaces,
  • Key Features,
  • Architecture,
  • Data Model (with a sample table definition),
  • Write and Read Operations,
  • Accessing Amazon Keyspaces (cqlsh / Java),
  • Security & IAM,
  • Monitoring & Observability,
  • Best Practices,
  • Common Use Cases,
  • Keyspaces vs. Self-managed Cassandra vs. DynamoDB.

Intro:

    • Amazon Keyspaces is a serverless, fully managed Cassandra-compatible database service on AWS.
    • Apache Cassandra is an open-source, distributed NoSQL database.
    • Amazon Keyspaces lets twtech run Cassandra workloads without managing clusters, scaling, or infrastructure.

1.  The Concept of Amazon Keyspaces

    • Managed Cassandra: Offers the same CQL (Cassandra Query Language) APIs as open-source Cassandra.
    • Serverless & Scalable: Automatically scales tables based on request volume, serving thousands of requests per second with single-digit-millisecond latency.
    • High Availability: Replicated across multiple AZs within a region.
    • Billing Model: Pay only for what twtech uses (reads, writes, storage).

2.  Key Features

a) Cassandra Compatibility

    • Compatible with the Apache Cassandra 3.11 CQL API.
    • Use existing Cassandra drivers (Java, Python, Go, Node.js, etc.).
    • Works with Apache Cassandra tooling (cqlsh, drivers, ORMs).

b) Serverless Scaling

    • Scales up or down automatically based on throughput demand.
    • No cluster provisioning or manual scaling required.

c) Two Capacity Modes

  1. On-demand Mode
    • Auto-scales based on traffic.
    • Pay-per-request billing.
    • Ideal for spiky workloads.
  2. Provisioned Mode
    • Manually define read/write capacity units (RCUs, WCUs).
    • Can enable auto-scaling.
    • Lower cost for predictable workloads.
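
To make the provisioned-mode numbers concrete, here is a minimal sizing sketch. It assumes the documented unit definitions: one RCU covers one LOCAL_QUORUM read per second (or two LOCAL_ONE reads) of a row up to 4 KB, and one WCU covers one write per second of a row up to 1 KB. The function names are illustrative, and row_bytes stands for the row size as Keyspaces measures it, which includes some per-row overhead that this sketch does not model.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one RCU covers a LOCAL_QUORUM read of up to 4 KB
WRITE_UNIT_BYTES = 1 * 1024  # one WCU covers a write of up to 1 KB


def read_capacity_units(row_bytes, reads_per_second, local_one=False):
    """Estimate the RCUs needed for a steady read rate (hypothetical helper)."""
    units_per_read = math.ceil(row_bytes / READ_UNIT_BYTES)
    total = units_per_read * reads_per_second
    # LOCAL_ONE reads consume half the capacity of LOCAL_QUORUM reads.
    return math.ceil(total / 2) if local_one else total


def write_capacity_units(row_bytes, writes_per_second):
    """Estimate the WCUs needed for a steady write rate (hypothetical helper)."""
    return math.ceil(row_bytes / WRITE_UNIT_BYTES) * writes_per_second


# 3 KB rows read 500 times per second, 2.5 KB rows written 100 times per second.
print(read_capacity_units(3072, 500))                  # LOCAL_QUORUM reads
print(read_capacity_units(3072, 500, local_one=True))  # LOCAL_ONE reads
print(write_capacity_units(2560, 100))
```

Because both units round up per row, keeping rows just under a 1 KB or 4 KB boundary can noticeably change the provisioned capacity a table needs.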

d) Replication & Durability

    • Data replicated 3x across multiple AZs.
    • Built-in backup and PITR (Point-In-Time Recovery).

e) Security

    • Encryption at rest with AWS KMS.
    • TLS in transit.
    • IAM-based authentication (SigV4, or service-specific credentials generated for an IAM user) instead of Cassandra’s internally managed username/password roles.
    • VPC endpoints for private access.

3.  Architecture

    • No cluster nodes exposed to customers (vs. self-managed Cassandra).
    • AWS manages:
      • Cluster provisioning
      • Node replacement
      • Repair/compaction
      • Scaling
    • Applications connect directly to Keyspaces via a CQL-compatible endpoint.

4. Data Model

Keyspaces follows Cassandra’s partitioned row store model:

    • Keyspace: top-level namespace (like a database).
    • Table: collection of rows.
    • Row: identified by a primary key (partition key + clustering columns).
    • Columns: schema-defined, flexible structure.

# Sample table definition:

-- Unquoted CQL identifiers cannot contain hyphens, so underscores are used.
CREATE TABLE twtech_users (
    user_id UUID,          -- partition key
    first_name text,
    last_name text,
    email text,
    PRIMARY KEY (user_id)
);

5.  Write and Read Operations

Writes

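    • Append-only write path, as in Cassandra: an INSERT on an existing primary key is an upsert.
    • In open-source Cassandra this means memtables flushed to immutable SSTables with background compaction; in Keyspaces that storage layer is managed by AWS and not exposed.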
    • Highly durable due to multi-AZ replication.

Reads

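    • Partition-aware: efficient queries supply the full partition key.
    • Efficient range queries within a partition using clustering columns.
    • Secondary indexes and materialized views are limited compared with open-source Cassandra; check the Keyspaces functional-differences documentation before relying on either.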
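
Why reads want a partition key and why clustering-key ranges are cheap can be sketched with a toy in-memory model. This is only an illustration of the partitioned row store idea, not how Keyspaces is implemented; the class and method names are hypothetical.

```python
import bisect
from collections import defaultdict


class PartitionedTable:
    """Toy model: partition key -> rows kept sorted by clustering key."""

    def __init__(self):
        self._partitions = defaultdict(list)  # pk -> sorted list of (ck, row)

    def insert(self, partition_key, clustering_key, row):
        rows = self._partitions[partition_key]
        keys = [ck for ck, _ in rows]
        i = bisect.bisect_left(keys, clustering_key)
        if i < len(rows) and rows[i][0] == clustering_key:
            rows[i] = (clustering_key, row)  # same primary key: upsert
        else:
            rows.insert(i, (clustering_key, row))

    def range_query(self, partition_key, start, end):
        """Rows in ONE partition with start <= clustering key < end."""
        rows = self._partitions.get(partition_key, [])
        keys = [ck for ck, _ in rows]
        lo = bisect.bisect_left(keys, start)
        hi = bisect.bisect_left(keys, end)
        return [row for _, row in rows[lo:hi]]


table = PartitionedTable()
table.insert("sensor-1", 10, "a")
table.insert("sensor-1", 30, "c")
table.insert("sensor-1", 20, "b")
table.insert("sensor-2", 15, "x")
print(table.range_query("sensor-1", 10, 30))
```

The range query touches a single partition and a contiguous slice of its sorted rows; a query without the partition key would have to scan every partition.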

6.  Accessing Amazon Keyspaces (Using cqlsh / Java)

 # Using cqlsh

# The username and password are service-specific credentials generated for an IAM user.
cqlsh <keyspaces-endpoint> 9142 \
  --ssl \
  --username "twtech-iam-username" \
  --password "twtech-iam-token"

 # Using Java

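import com.datastax.oss.driver.api.core.CqlSession;
import java.net.InetSocketAddress;
import javax.net.ssl.SSLContext;
import software.aws.mcs.auth.SigV4AuthProvider;

// Requires the DataStax Java driver 4.x and the AWS SigV4 authentication plugin.
// SSLContext.getDefault() throws NoSuchAlgorithmException; declare or handle it in the enclosing method.
CqlSession session = CqlSession.builder()
    .addContactPoint(new InetSocketAddress("cassandra.us-east-2.amazonaws.com", 9142))
    .withLocalDatacenter("us-east-2")
    .withAuthProvider(new SigV4AuthProvider("us-east-2"))
    .withSslContext(SSLContext.getDefault())  // Keyspaces only accepts TLS connections
    .build();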
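
For comparison, the same connection can be sketched in Python. This is a hedged sketch: it assumes the open-source cassandra-driver package, service-specific credentials, and a ca_file argument pointing at a root certificate that twtech downloads separately (the path is a placeholder). Hostname checking is switched off here so that only the certificate chain is verified; tighten that if the driver version in use supports it.

```python
import ssl

KEYSPACES_PORT = 9142  # TLS port used in the examples above


def keyspaces_endpoint(region):
    """Regional endpoint in the form used above (hypothetical helper)."""
    return f"cassandra.{region}.amazonaws.com", KEYSPACES_PORT


def connect(region, username, password, ca_file):
    """Open a session using service-specific credentials over TLS."""
    # Imported lazily so keyspaces_endpoint() works without the driver installed.
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.cluster import Cluster

    host, port = keyspaces_endpoint(region)
    ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ssl_context.check_hostname = False           # assumption: verify the chain only
    ssl_context.verify_mode = ssl.CERT_REQUIRED
    ssl_context.load_verify_locations(ca_file)   # placeholder path supplied by the caller

    cluster = Cluster(
        contact_points=[host],
        port=port,
        ssl_context=ssl_context,
        auth_provider=PlainTextAuthProvider(username=username, password=password),
    )
    return cluster.connect()


print(keyspaces_endpoint("us-east-2"))
```

Only keyspaces_endpoint() runs without network access; connect() needs real credentials and a certificate file.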

7.  Security & IAM

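    • IAM authentication replaces Cassandra’s internal user roles.
    • IAM policies control which keyspaces and tables a principal can access and which actions (for example, cassandra:Select) it can perform.
    • Example IAM policy for read-only access (accountID is a placeholder; drivers also need Select on the system keyspaces to connect):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cassandra:Select"
      ],
      "Resource": [
        "arn:aws:cassandra:us-east-2:accountID:/keyspace/twtechkeyspace/table/twtech_table",
        "arn:aws:cassandra:us-east-2:accountID:/keyspace/system*"
      ]
    }
  ]
}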
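
When the same read-only policy is needed for several tables, generating it avoids hand-editing ARNs. The sketch below assumes the table ARN layout arn:aws:cassandra:REGION:ACCOUNT:/keyspace/KEYSPACE/table/TABLE; the function names, the account ID 111122223333, and the table name twtech_table are illustrative placeholders.

```python
import json


def table_arn(region, account_id, keyspace, table):
    """Build the ARN of one Keyspaces table (hypothetical helper)."""
    return f"arn:aws:cassandra:{region}:{account_id}:/keyspace/{keyspace}/table/{table}"


def read_only_policy(region, account_id, keyspace, table):
    """Return a policy document that allows SELECT on a single table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["cassandra:Select"],
                "Resource": [
                    table_arn(region, account_id, keyspace, table),
                    # Drivers read the system keyspaces when they connect.
                    f"arn:aws:cassandra:{region}:{account_id}:/keyspace/system*",
                ],
            }
        ],
    }


policy = read_only_policy("us-east-2", "111122223333", "twtechkeyspace", "twtech_table")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console or passed to whatever tooling twtech uses to attach policies.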

8.  Monitoring & Observability

    • CloudWatch metrics for RCU/WCU usage, latency, and throttling.
    • CloudTrail logs for API activity.
    • Performance tuning via capacity mode & partition design.

9.  Best Practices

  1. Partition key design matters
    • Avoid hot partitions (single partition getting all writes).
    • Distribute keys evenly.
  2. Model queries, not data
    • Like Cassandra, denormalize for query patterns.
  3. Use TTLs for expiring data (TTL must be enabled on the table before USING TTL is accepted)

     INSERT INTO sessions (session_id, user_id, data)
     VALUES (uuid(), '123', 'abc') USING TTL 3600;

  4. Choose the right capacity mode
    • On-demand: spiky traffic.
    • Provisioned: steady workloads.
  5. Leverage IAM instead of application-managed credentials.
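
The hot-partition advice in the first practice can be sketched as write sharding: add a small bucket number to the partition key so one busy entity spreads over several partitions. This is one common technique, not something the service does for twtech; the names, the bucket count of 8, and the device/event IDs are all illustrative assumptions.

```python
import hashlib
from collections import Counter

NUM_BUCKETS = 8  # illustrative; size it to the write rate of the hottest key


def bucket_for(item_id, num_buckets=NUM_BUCKETS):
    """Deterministically map an item to a bucket (stable across processes)."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def sharded_partition_key(device_id, event_id, num_buckets=NUM_BUCKETS):
    """Composite partition key (device_id, bucket) for a hypothetical events table."""
    return device_id, bucket_for(event_id, num_buckets)


# Spread 10,000 events from a single busy device across the buckets.
counts = Counter(
    sharded_partition_key("device-42", f"event-{i}")[1] for i in range(10_000)
)
print(sorted(counts.items()))
```

The trade-off is on the read side: fetching everything for device-42 now means querying each bucket and merging the results, so the bucket count should stay as small as the write rate allows.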

10.  Common Use Cases

    • IoT time-series data (high write throughput, TTL expiry).
    • User profiles & personalization (low latency).
    • Session stores (ephemeral storage with TTL).
    • Message/event logging.
    • Gaming leaderboards.

11.  Keyspaces vs. Self-managed Cassandra vs. DynamoDB

Feature       Amazon Keyspaces        Self-managed Cassandra    DynamoDB
Management    Fully managed           twtech manages            Fully managed
Scaling       Auto / provisioned      Manual                    Auto / provisioned
API           CQL (Cassandra)         CQL                       Proprietary
Ecosystem     Cassandra-compatible    Native Cassandra          AWS only
Multi-AZ      Built-in                Complex setup             Built-in

Final thoughts:

    • Amazon Keyspaces gives twtech Cassandra power without Cassandra pain.
    • Amazon Keyspaces is serverless, scalable, secure, and Cassandra-compatible.
    • Amazon Keyspaces is a good fit if twtech wants:
      • Low-latency,
      • High-throughput workloads,
      • Without having to manage Cassandra clusters.

