Wednesday, July 23, 2025

Amazon Kinesis Data Streams vs Amazon Kinesis Data Firehose

 

Amazon Kinesis Data Streams and Amazon Kinesis Data Firehose (renamed Amazon Data Firehose in 2024) are both AWS services for ingesting and processing streaming data, but they serve different use cases and involve different levels of complexity.

 Kinesis Data Streams (KDS)

A real-time, fully managed, low-latency streaming service designed for custom processing.

Use when twtech needs:

  •         Real-time (sub-second) data processing.
  •         To build custom applications (e.g., analytics, anomaly detection).
  •         Fine-grained control over data processing and throughput.
  •         Integration with AWS Lambda, Kinesis Data Analytics (now Amazon Managed Service for Apache Flink), or custom consumers built with the Kinesis Client Library (KCL).
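
To make the producer side concrete, here is a minimal sketch of writing one event to a stream. It assumes a boto3 Kinesis client is passed in; the helper name `send_event`, the stream name, and the `account_id` field are hypothetical and only serve the illustration.

```python
import json


def send_event(kinesis_client, stream_name, event, key_field="account_id"):
    """Write one JSON event to a Kinesis data stream.

    Records that share a partition key are routed to the same shard,
    so consumers see them in the order they were written.
    """
    response = kinesis_client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=str(event[key_field]),  # hypothetical key choice
    )
    # The response reports where the record was stored.
    return response["ShardId"], response["SequenceNumber"]
```

With boto3 installed and AWS credentials configured, a call could look like `send_event(boto3.client("kinesis"), "twtech-transactions", {"account_id": 42, "amount": 19.99})`. Passing the client in as a parameter keeps the function easy to exercise with a stub.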

 Key Features:

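  •         Provisioned mode (twtech sets and resizes the shard count) or on-demand mode (capacity scales automatically); enhanced fan-out gives each consumer its own dedicated read throughput.
  •         Data retention of 24 hours by default, extendable up to 365 days.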
  •         Custom retry logic and multiple consumers.
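
In provisioned mode, throughput control comes down to shard arithmetic. The sketch below estimates a shard count from the commonly documented per-shard limits (about 1 MB/s or 1,000 records/s for writes, about 2 MB/s for reads shared by standard consumers). The function name is hypothetical, the limits are rounded to decimal megabytes, and current quotas should be checked before relying on the numbers.

```python
import math

# Approximate per-shard limits in provisioned mode (assumed, rounded).
WRITE_BYTES_PER_SHARD = 1_000_000    # ~1 MB/s ingest
WRITE_RECORDS_PER_SHARD = 1_000      # 1,000 records/s ingest
READ_BYTES_PER_SHARD = 2_000_000     # ~2 MB/s read, shared by standard consumers


def shards_needed(records_per_sec, avg_record_bytes, standard_consumers=1):
    """Estimate the shard count from the tightest of three limits."""
    in_bytes = records_per_sec * avg_record_bytes
    by_write_bytes = math.ceil(in_bytes / WRITE_BYTES_PER_SHARD)
    by_write_records = math.ceil(records_per_sec / WRITE_RECORDS_PER_SHARD)
    # Every standard consumer reads the full stream from the shared read budget.
    by_read = math.ceil(in_bytes * standard_consumers / READ_BYTES_PER_SHARD)
    return max(1, by_write_bytes, by_write_records, by_read)


print(shards_needed(5000, 500))      # record-count limit dominates -> 5
print(shards_needed(1000, 2000, 3))  # read limit dominates -> 3
```

Consumers that use enhanced fan-out get dedicated read throughput, so they would not count toward `standard_consumers` in this estimate.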

 Example Use Case:

A fraud detection system that processes transaction events in real-time using custom logic and machine learning models.
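
A rough sketch of such a consumer, written as an AWS Lambda handler attached to the stream: Lambda delivers each record's payload base64-encoded under `Records[].kinesis.data`. The threshold rule, the `transaction_id` and `amount` fields, and the function names are placeholders standing in for real fraud logic or a model call.

```python
import base64
import json

AMOUNT_THRESHOLD = 10_000  # hypothetical rule in place of a real model


def is_suspicious(txn):
    """Placeholder check; a real system might call an ML model here."""
    return txn.get("amount", 0) > AMOUNT_THRESHOLD


def fraud_handler(event, context):
    """Decode each Kinesis record in the batch and collect flagged transaction IDs."""
    flagged = []
    for record in event["Records"]:
        payload = base64.b64decode(record["kinesis"]["data"])
        txn = json.loads(payload)
        if is_suspicious(txn):
            flagged.append(txn.get("transaction_id"))
    return {"flagged": flagged}
```

Because the stream retains records, a second consumer (for example, an archiving job) can read the same transactions independently of this handler.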

 Kinesis Data Firehose

A fully managed, near real-time streaming ingestion service designed for simple, automatic delivery to AWS destinations.

Use when twtech needs:

  •         A no-code/no-maintenance way to load streaming data into S3, Redshift, OpenSearch, or 3rd-party tools.
  •         Simple data transformation via AWS Lambda or built-in format conversion (e.g., JSON → Parquet).
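  •         Near real-time delivery: Firehose buffers incoming records and flushes them based on the configured buffer size and buffer interval, so latency is typically around a minute rather than sub-second.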
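
The Lambda transformation mentioned above follows a simple contract: Firehose passes a batch of base64-encoded records, and the function returns each `recordId` with a `result` of `Ok`, `Dropped`, or `ProcessingFailed`. The sketch below assumes JSON log records with hypothetical `path` and `status` fields; the filtering rule is only an example.

```python
import base64
import json


def transform_handler(event, context):
    """Trim each log record to two fields and drop health-check noise."""
    output = []
    for record in event["records"]:
        record_id = record["recordId"]
        try:
            log = json.loads(base64.b64decode(record["data"]))
            if not isinstance(log, dict):
                raise ValueError("expected a JSON object")
        except ValueError:
            # Return the original data and let Firehose treat it as failed.
            output.append({"recordId": record_id, "result": "ProcessingFailed",
                           "data": record["data"]})
            continue
        if log.get("path") == "/health":  # hypothetical filter
            output.append({"recordId": record_id, "result": "Dropped",
                           "data": record["data"]})
            continue
        slim = {"path": log.get("path"), "status": log.get("status")}
        # A trailing newline keeps records separable once they land in S3.
        data = base64.b64encode((json.dumps(slim) + "\n").encode("utf-8"))
        output.append({"recordId": record_id, "result": "Ok",
                       "data": data.decode("ascii")})
    return {"records": output}
```

Every incoming `recordId` has to appear in the response, which is why dropped and failed records are still returned rather than silently skipped.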

 Key Features:

  •         Automatic scaling.
  •         Built-in data transformation, compression, and encryption.
  •         No need to write consumer code.

 Example Use Case:

Streaming website logs directly to an S3 bucket in compressed format for later analysis with Athena.
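
On the sending side, this use case needs little more than batched writes. The sketch below assumes a boto3 Firehose client is passed in and that the delivery stream (with its S3 destination and compression settings) already exists; `send_logs` is a hypothetical helper name. `PutRecordBatch` accepts at most 500 records per call and can partially fail, so the response is checked per record.

```python
MAX_BATCH = 500  # PutRecordBatch accepts at most 500 records per call


def send_logs(firehose_client, delivery_stream, lines):
    """Send log lines in batches and return the lines Firehose rejected."""
    failed = []
    for start in range(0, len(lines), MAX_BATCH):
        batch = lines[start:start + MAX_BATCH]
        response = firehose_client.put_record_batch(
            DeliveryStreamName=delivery_stream,
            Records=[{"Data": (line + "\n").encode("utf-8")} for line in batch],
        )
        if response["FailedPutCount"]:
            # Results come back in request order; failed entries carry an ErrorCode.
            for line, result in zip(batch, response["RequestResponses"]):
                if "ErrorCode" in result:
                    failed.append(line)
    return failed
```

A caller would typically retry the returned lines with a backoff. No consumer code is needed on the other end, since Firehose writes the buffered records to the bucket itself.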

 Quick Comparison Table:

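| Feature | Kinesis Data Streams | Kinesis Data Firehose |
|---|---|---|
| Latency | Sub-second | ~1 min (depends on buffering settings) |
| Custom Processing | Yes (Lambda, KCL, custom apps) | Minimal (Lambda transform optional) |
| Data Retention (storage) | 24 hours by default, up to 365 days | None (buffers briefly, then delivers; no replay) |
| Delivery Targets | Custom apps, KDA, Lambda, etc. | S3, Redshift, OpenSearch, third-party tools |
| Scalability | Manual (provisioned mode) or automatic (on-demand mode) | Fully managed and auto-scaling |
| Complexity | Higher | Lower (plug-and-play) |
| Cost | Pay-per-shard in provisioned mode; usage-based in on-demand mode | Pay-per-GB ingested |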

twtech Summary:

  •        Use Kinesis Data Streams when twtech needs real-time, custom processing.
  •        Use Kinesis Data Firehose when twtech needs simplified delivery to AWS services with minimal operational overhead.
