- Intro,
- High-Level Overview,
- Kinesis Producer Library (KPL) Concept,
- Kinesis Client Library (KCL) concept
- Features,
- When to use Kinesis Producer Library (KPL),
- When to use Kinesis Client Library (KCL),
- Limitations,
- Summary Comparison Table,
- When to Use What table.
- Can both KPL & KCL be used
- Here's twtech's comparison of the Kinesis Producer Library (KPL) and the Kinesis Client Library (KCL).
- Both are essential tools in the Amazon Kinesis ecosystem, but they serve different purposes.
High-Level Overview
| Library | Purpose | Typical Role |
|---|---|---|
| KPL (Kinesis Producer Library) | Optimizes writing/producing records to Kinesis Data Streams | Used by producers |
| KCL (Kinesis Client Library) | Simplifies reading and processing records from Kinesis | Used by consumers |
Kinesis Producer Library (KPL) concept:
- A high-performance producer library, exposed through a Java API that sits on top of a native helper process, designed to write data to a Kinesis Data Stream efficiently.
Features:
- Aggregation: Combines multiple logical (user) records into one Kinesis record to reduce API calls and cost.
- Retries & Failover: Retries failed writes automatically, so network errors and throughput (throttling) exceptions are handled gracefully.
- Asynchronous put: High-throughput writes without blocking the calling thread.
- Time-based buffering: Controls how long records are batched before being sent.
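To make the aggregation and time-based buffering ideas concrete, here is a minimal sketch in plain Python. It is a toy model, not the KPL API: `BufferingProducer`, its parameters, and the `send` callable are hypothetical names chosen for illustration. The `max_buffered_seconds` parameter plays a role loosely similar to the KPL's `RecordMaxBufferedTime` setting.

```python
import time


class BufferingProducer:
    """Toy model of KPL-style batching; NOT the real KPL API."""

    def __init__(self, send, max_records=100, max_buffered_seconds=0.1,
                 clock=time.monotonic):
        self._send = send                          # hypothetical callable that ships one batch
        self._max_records = max_records            # flush when this many records are buffered
        self._max_buffered_seconds = max_buffered_seconds  # flush when the oldest record is this old
        self._clock = clock
        self._buffer = []
        self._oldest = None                        # arrival time of the oldest buffered record

    def add(self, record: bytes) -> None:
        """Buffer one record and flush if the batch is full or stale."""
        if not self._buffer:
            self._oldest = self._clock()
        self._buffer.append(record)
        self._flush_if_due()

    def tick(self) -> None:
        """Call periodically so a quiet stream still flushes on time."""
        self._flush_if_due()

    def flush(self) -> None:
        """Send whatever is buffered as a single batch."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._oldest = None
            self._send(batch)

    def _flush_if_due(self) -> None:
        if not self._buffer:
            return
        full = len(self._buffer) >= self._max_records
        stale = self._clock() - self._oldest >= self._max_buffered_seconds
        if full or stale:
            self.flush()
```

With `max_records=3` and a long buffer time, adding seven records and then calling `flush()` would produce batches of 3, 3, and 1 records, so seven logical records turn into three sends. The trade-off is latency: a record may wait up to `max_buffered_seconds` before it leaves the producer.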
When to use Kinesis Producer Library (KPL):
- twtech needs high-throughput ingestion
- twtech is writing to Kinesis from Java
- twtech wants to optimize cost and network use
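A quick back-of-the-envelope calculation shows why aggregation matters for high-throughput ingestion. A Kinesis shard accepts up to 1,000 records per second and 1 MB per second on the write side. The sketch below assumes 1 MB means 10^6 bytes and ignores the small framing overhead of an aggregated record; the function name and parameters are illustrative only.

```python
SHARD_MAX_RECORDS_PER_SEC = 1_000      # per-shard write limit (Kinesis records)
SHARD_MAX_BYTES_PER_SEC = 1_000_000    # per-shard write limit, treated as 10^6 bytes here


def max_user_records_per_sec(user_record_bytes: int, records_per_aggregate: int = 1) -> int:
    """Upper bound on user records/sec that one shard can ingest."""
    # Size of one Kinesis record after packing (framing overhead ignored: an assumption).
    kinesis_record_bytes = user_record_bytes * records_per_aggregate
    by_count = SHARD_MAX_RECORDS_PER_SEC
    by_bytes = SHARD_MAX_BYTES_PER_SEC // kinesis_record_bytes
    kinesis_records = min(by_count, by_bytes)
    return kinesis_records * records_per_aggregate


print(max_user_records_per_sec(100))        # 1000: capped by the record-count limit
print(max_user_records_per_sec(100, 100))   # 10000: capped by the byte limit instead
```

For 100-byte events, an unaggregated producer hits the record-count limit while using only about a tenth of the shard's byte capacity. Packing 100 events per Kinesis record moves the bottleneck to the byte limit, so the same shard carries roughly ten times as many events.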
Limitations:
- Requires a separate native daemon process to run alongside the application; the official API is Java, so non-Java apps need an extra wrapper
- More complex deployment than just using the AWS SDK
- Limited to producer role only
Kinesis Client Library (KCL) concept:
- A Java-based consumer library that helps developers build applications that read from Kinesis Data Streams, manage shard coordination, and checkpoint processing.
Features:
- Checkpointing (via DynamoDB): Tracks where a consumer left off
- Shard management: Automatically balances shards among multiple workers
- Resharding support: Handles new shards during scaling
- Multi-lang support (via MultiLangDaemon): Lets twtech use Python, Node.js, etc.
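The checkpointing and shard-management features above can be sketched with a few lines of plain Python. This is a conceptual model only: `CheckpointStore` is an in-memory stand-in for the DynamoDB table, `assign_shards` is a simple round-robin rather than KCL's actual lease balancing, and sequence numbers are small integers instead of the long numeric strings Kinesis uses. All names here are hypothetical.

```python
class CheckpointStore:
    """In-memory stand-in for the DynamoDB table KCL uses (hypothetical API)."""

    def __init__(self):
        self._positions = {}

    def save(self, shard_id: str, sequence_number: int) -> None:
        self._positions[shard_id] = sequence_number

    def load(self, shard_id: str):
        return self._positions.get(shard_id)


def assign_shards(shard_ids, worker_ids):
    """Spread shards across workers round-robin (a simplification of lease balancing)."""
    assignment = {worker: [] for worker in worker_ids}
    for index, shard in enumerate(sorted(shard_ids)):
        assignment[worker_ids[index % len(worker_ids)]].append(shard)
    return assignment


def process_shard(shard_id, records, store, handle):
    """Process (sequence_number, payload) pairs, skipping anything already checkpointed."""
    last_done = store.load(shard_id)
    for sequence_number, payload in records:
        if last_done is not None and sequence_number <= last_done:
            continue                              # processed before a restart; skip it
        handle(payload)
        store.save(shard_id, sequence_number)     # checkpoint after each record (simplified)
```

If a worker processes the first two records of a shard and then restarts, a second call to `process_shard` with the full record list would only handle the third record, because the store remembers sequence number 2. Real consumers usually checkpoint periodically rather than after every record, to keep the number of writes to the checkpoint table low.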
When to use Kinesis Client Library (KCL):
- twtech needs robust consumer applications
- twtech wants to track state (checkpointing) reliably
- twtech has multiple parallel consumers
Limitations:
- Requires DynamoDB table for checkpointing
- Polling-based in KCL 1.x, so latency can be higher than with Enhanced Fan-Out (which KCL 2.x can use)
- Not event-driven like Lambda: twtech runs and scales the workers itself, which adds some operational overhead
Summary Comparison Table
| Feature | KPL | KCL |
|---|---|---|
| Role | Producer | Consumer |
| Main Language | Java (built on a native core) | Java (other languages via MultiLangDaemon) |
| Optimized For | Efficient record ingestion | Scalable, stateful record processing |
| Record Aggregation | ✅ Yes | ❌ No |
| Async Writes | ✅ Yes | ❌ N/A (reads only) |
| Checkpointing | ❌ No | ✅ Yes (DynamoDB) |
| Shard Coordination | ❌ No | ✅ Yes |
| Fan-Out Support | ❌ No | ✅ Yes in KCL 2.x via enhanced fan-out (EFO); ❌ No in KCL 1.x (polling only) |
| Lambda Alternative | ❌ No | ✅ Possibly (if twtech wants more control than Lambda) |
| Complexity | Moderate | Higher (more infra: DynamoDB, coordination logic) |
When to Use What
| Use Case | Recommendation |
|---|---|
| twtech is ingesting 1000s of records/sec efficiently. | KPL. |
| twtech wants a real-time consumer that handles checkpointing. | KCL. |
| twtech wants a simple event-driven consumer. | Consider Lambda instead of KCL. |
| twtech is using non-Java language and doesn’t want extra daemons. | Use AWS SDK (simpler) or Lambda. |
Bonus: Can both KPL & KCL be used together?
Yes.
- twtech can use KPL on the producer side and KCL on the consumer side.
- If KPL aggregates records, the consumer must de-aggregate them. KCL does this automatically; consumers that do not use KCL (for example Lambda functions or plain AWS SDK readers) need a separate de-aggregation library.
- Kinesis de-aggregation is the process of extracting the individual user records from a single Kinesis Data Streams record in which the KPL initially bundled (aggregated) several smaller records to improve throughput.
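The aggregate/de-aggregate round trip can be illustrated with a toy framing scheme. Note the assumption: the real KPL wire format is protobuf-based and differs from this, so the length-prefixed layout and the `aggregate` / `deaggregate` function names below are purely illustrative and cannot read actual KPL records.

```python
import struct


def aggregate(user_records):
    """Pack several byte payloads into one blob using 4-byte big-endian length prefixes."""
    return b"".join(struct.pack(">I", len(r)) + r for r in user_records)


def deaggregate(blob):
    """Recover the original payloads from a blob produced by aggregate()."""
    records, offset = [], 0
    while offset < len(blob):
        if offset + 4 > len(blob):
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        if offset + length > len(blob):
            raise ValueError("truncated payload")
        records.append(blob[offset:offset + length])
        offset += length
    return records


events = [b'{"id": 1}', b'{"id": 2}', b'{"id": 3}']
blob = aggregate(events)            # one "Kinesis record" carrying three user records
assert deaggregate(blob) == events  # the consumer sees the original three again
```

The point of the sketch is the contract between the two sides: whatever packs the records on the producer must be mirrored exactly on the consumer, which is why a consumer that is unaware of aggregation would see one opaque blob instead of three events.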