Tuesday, July 22, 2025

Amazon Kinesis Data Firehose | Overview & Hands-On.

Amazon Kinesis Data Firehose (alias: Amazon Data Firehose) - Overview & Hands-On.

Scope:

  • Intro
  • The concept: Amazon Kinesis Data Firehose
  • Key Features
  • Architecture
  • Typical Data Flow
  • Common Use Cases
  • Sample Use Case
  • Security Posture
  • Project: Hands-on

Intro:

  • Amazon Kinesis Data Firehose is now officially known as Amazon Data Firehose.
  • Amazon Data Firehose is a fully managed service provided by AWS that allows twtech to reliably load streaming data into data lakes, data stores, and analytics services.

 The concept: Amazon Kinesis Data Firehose

  • Amazon Kinesis Data Firehose is part of the Amazon Kinesis family.
  • Amazon Kinesis Data Firehose is specifically designed for real-time data delivery.
  • Amazon Kinesis Data Firehose automatically scales to match twtech data throughput and requires no ongoing administration.

 Key Features

  • Fully Managed: No need to manage infrastructure or write custom data ingestion code.
  • Real-Time: Ingests streaming data with minimal latency.
  • Multiple Destinations: Supports Amazon S3, Amazon Redshift, Amazon OpenSearch Service, Splunk, Datadog, and HTTP endpoints.
  • Data Transformation: Integrates with AWS Lambda to transform data on the fly before delivery.
  • Data Format Conversion: Can convert JSON to Parquet/ORC and compress data (e.g., GZIP).
  • Automatic Scaling: Automatically adjusts to match incoming data volume.
  • Retries and Error Handling: Automatically retries failed data delivery.
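The GZIP compression that the Data Format Conversion feature applies can be previewed locally. This is a minimal sketch: the record and file names are made up for illustration, not taken from any real Firehose output.

```shell
# Illustrative only: compress a JSON record with GZIP, as Firehose can do
# before delivering it to S3. Record content and file name are made up.
echo '{"user":"twtech-user1","event":"signup"}' > record.json
gzip -kf record.json            # writes record.json.gz, keeps record.json
gzip -t record.json.gz && echo "compressed OK"
```

The same GZIP format is what Athena and S3 Select can decompress transparently when querying the delivered objects.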


Architecture


Typical Data Flow

  • Producers (applications, servers, or a Kinesis data stream) send records to Firehose, which buffers and optionally transforms them before delivering them to the configured destination (e.g., Amazon S3).


 Common Use Cases

  • Real-time log analytics (web server logs, IoT data, etc.)
  • Streaming ETL pipelines
  • Security and compliance monitoring
  • Application telemetry ingestion
  • Machine learning model input pipelines

 Sample Use Case

twtech wants to stream web server logs to Amazon S3 for later analysis in Athena:

  1. Configure a Kinesis Firehose delivery stream.
  2. Send logs (e.g., from Apache or Nginx) to the Firehose stream.
  3. Firehose buffers and optionally transforms data (e.g., adds timestamps, formats JSON).
  4. It delivers the data to an S3 bucket every N seconds or MBs.
  5. Query the data using Amazon Athena.
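As a hedged sketch of step 1, the delivery stream's S3 destination could be described to the AWS CLI with a configuration file like the one below. The account ID and role ARN are placeholders, not real resources; the buffering values shown are the common S3 defaults (300 seconds or 5 MB).

```shell
# Placeholder ARNs: substitute twtech's real IAM role and bucket before use.
cat > s3-destination.json <<'EOF'
{
  "RoleARN": "arn:aws:iam::123456789012:role/KinesisFirehoseServiceRole",
  "BucketARN": "arn:aws:s3:::twtechs3",
  "BufferingHints": { "IntervalInSeconds": 300, "SizeInMBs": 5 },
  "CompressionFormat": "GZIP"
}
EOF
python3 -m json.tool s3-destination.json > /dev/null && echo "valid JSON"
```

A file like this could then be passed to `aws firehose create-delivery-stream --delivery-stream-name <name> --s3-destination-configuration file://s3-destination.json`; the console walkthrough in the hands-on project below achieves the same result interactively.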

 Security Posture:

  • Supports IAM roles and policies.
  • Can use KMS for encryption at rest.
  • Supports HTTPS for data in transit.
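To make the IAM point concrete, here is a minimal producer-side policy sketch granting only the Firehose write actions (`firehose:PutRecord` and `firehose:PutRecordBatch` are the real IAM action names); the account ID is a placeholder and the stream name is the one used later in this project.

```shell
# Placeholder account ID; scope the Resource to the actual delivery stream ARN.
cat > firehose-producer-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["firehose:PutRecord", "firehose:PutRecordBatch"],
    "Resource": "arn:aws:firehose:us-east-2:123456789012:deliverystream/KDS-S3-zF2LM"
  }]
}
EOF
python3 -m json.tool firehose-producer-policy.json > /dev/null && echo "valid policy JSON"
```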


Project: Hands-on

  • How twtech deploys and uses Amazon Kinesis Data Firehose streams to ingest streaming data, then reliably load it into data lakes, data stores, and analytics services.
Step-1:
  • Search for AWS service: kinesis

Step-2:

Create data stream: Amazon Data Firehose

  • Select: Amazon Data Firehose

  • Create a delivery stream for: Kinesis Data Firehose Stream

  • Amazon Kinesis Data Firehose: How it works

Step-3:

Choose source and destination

  • Specify the source and the destination for twtech Firehose stream. 
NB:
  • twtech cannot change the source and destination of its Firehose stream once it has been created.

  • Firehose stream name (auto-generated): KDS-S3-zF2LM

Source settings

  • Kinesis data stream: twtech-kinesis-data-stream



Destination settings

  • Specify the destination settings for twtech Firehose stream


Buffer hints, compression, file extension and encryption

  • The fields below are pre-populated with the recommended default values for S3. 
  • Pricing may vary depending on storage and request costs.

Advanced settings

  • Server-side encryption: not enabled
  • Error logging: enabled
  • IAM role: KinesisFirehoseServiceRole-KDS-S3-us-east-2-1753316140xxx
  • Tags: none


  • Create Kinesis Firehose stream: KDS-S3-zF2LM

Step-4:


  • twtech accesses the metrics populated for monitoring and observability.

Step-5:

  • New configurations twtech can make to match its desired use cases: Configuration



Step-6:

  • twtech tests Kinesis Data Firehose by putting data
  • Test with demo data (as shown below)
NB:
  • The simulated data is ingested to test the configuration of twtech's Firehose stream. Standard Amazon Data Firehose charges apply.
  • For this project, the ingested (streaming) data is delivered to the datastore (S3): twtechs3


  • How twtech streams data into Kinesis Data Firehose using CloudShell or any configured terminal: CloudShell

Step-7:

  • First, twtech verifies the version of the AWS CLI in CloudShell:

aws --version 


NB:

  • If the AWS CLI version shown is 2.x (e.g., aws-cli/2.27.53), proceed to the next step.

Step-8:

  • twtech sends records to the created Kinesis data stream: twtech-kinesis-data-stream.
  • twtech-user1 signs up with the following command (AWS CLI version 2):

aws kinesis put-record --stream-name twtech-kinesis-data-stream --partition-key twtech-user1 --data "twtech-user1 signup" --cli-binary-format raw-in-base64-out
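The `--cli-binary-format raw-in-base64-out` flag tells AWS CLI v2 to treat the `--data` value as raw text and base64-encode it before sending, since Kinesis record payloads are binary. The encoding can be previewed locally:

```shell
# Shows the base64 payload the CLI actually sends for the signup record above.
printf 'twtech-user1 signup' | base64
# → dHd0ZWNoLXVzZXIxIHNpZ251cA==
```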

Step-9:

twtech-user1 logs in with the following command (AWS CLI version 2):

aws kinesis put-record --stream-name twtech-kinesis-data-stream --partition-key twtech-user1 --data "twtech-user1 login" --cli-binary-format raw-in-base64-out

Step-10:

twtech-user1 signs out with the following command (AWS CLI version 2):

aws kinesis put-record --stream-name twtech-kinesis-data-stream --partition-key twtech-user1 --data "twtech-user1 signout" --cli-binary-format raw-in-base64-out

Step-11:

  • twtech verifies that the three simulated records were:
    • streamed with Kinesis Data Streams
    • ingested by Kinesis Data Firehose and delivered to the assigned datastore (S3 bucket): twtechs3
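Note that Firehose concatenates buffered records into a single S3 object and does not insert delimiters between them, so without a newline added by the producer or a transform, the three records above arrive back-to-back. This local sketch (file name is illustrative) mimics what the delivered object looks like:

```shell
# Mimics a delivered S3 object: records are concatenated with no separator
# unless the producer or a Lambda transform appends one.
printf 'twtech-user1 signup'  >  delivered-object.txt
printf 'twtech-user1 login'   >> delivered-object.txt
printf 'twtech-user1 signout' >> delivered-object.txt
cat delivered-object.txt
# → twtech-user1 signuptwtech-user1 logintwtech-user1 signout
```

This is why producers often append `\n` to each record before calling put-record, so downstream tools like Athena can split the object line by line.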


Step-12:

  • Clicking on the folder shows the partitions created: 2025


  • twtech can click on each record created, open it with a text editor, and read the content.

Step-13:

  • Go into the downloads folder to get the data files created: Downloads

Step-14:

  • twtech opens the downloaded files with a text editor to read the content: the twtech-user1 login message

and

  • the twtech-user1 signout message



