Amazon CloudWatch Logs Export to S3 - Overview.

Scope:

  • Intro,
  • Prerequisites,
  • Architecture,
  • Method 1: One-time Export Task (Console),
  • Method 2: Continuous Export (Kinesis Firehose),
  • Considerations,
  • Overview,
  • Ways to Export Logs to S3 (Manual Export Task),
  • Ways to Export Logs to S3 (Subscription Filters - Real-Time Alternative),
  • IAM Permissions Required on the CloudWatch Logs Side,
  • Sample Bucket Policy to Allow CloudWatch Logs to Write on the S3 Bucket Side,
  • Best Practices,
  • Common Pitfalls,
  • Architecture Comparison.

Intro:

    • twtech can export CloudWatch Logs to an Amazon S3 bucket using a one-time export task via the console or CLI, or continuously using a subscription filter with Kinesis Firehose.

Prerequisites
    • The destination S3 bucket must be in the same AWS Region as the CloudWatch log data.
    • The IAM user or role performing the export must have sufficient permissions for both CloudWatch Logs and S3.
    • The S3 bucket policy must grant CloudWatch Logs permission to write to the bucket. 
Architecture

Method 1: One-time Export Task (Console)
NB:
  • This method is suitable for historical data and can be easily initiated through the AWS Management Console. 
    1. Sign in to the AWS Management Console and open the CloudWatch console.
    2. In the navigation pane, choose Log groups.
    3. Select the name of the log group twtech wants to export.
    4. Choose Actions, and then select Export data to Amazon S3.
    5. On the Export data to Amazon S3 screen, define the From and To time range for the data.
    6. Under Choose S3 bucket, select the destination S3 bucket name and optionally specify a bucket prefix.
    7. Choose Export to start the task.
NB:
  • To view the task's status, choose Actions and then View all exports to Amazon S3. 
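
The same one-time export can also be started programmatically. Below is a minimal boto3 sketch; the log group name, bucket name, prefix, and dates are placeholder values, and fromTime/to must be epoch timestamps in milliseconds.

import boto3
from datetime import datetime, timezone

logs = boto3.client("logs")

# Time range must be expressed as epoch timestamps in milliseconds.
start = int(datetime(2025, 9, 18, tzinfo=timezone.utc).timestamp() * 1000)
end = int(datetime(2025, 9, 19, tzinfo=timezone.utc).timestamp() * 1000)

response = logs.create_export_task(
    taskName="twtech-one-time-export",        # placeholder task name
    logGroupName="/twtech/app/logs",          # placeholder log group
    fromTime=start,
    to=end,
    destination="twtech-log-archive-bucket",  # destination bucket name
    destinationPrefix="exports/app"           # optional S3 prefix
)
print("Export task ID:", response["taskId"])
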
Method 2: Continuous Export (Kinesis Firehose)
NB:
  • For an automated, ongoing solution to export logs as they arrive, use a subscription filter with Kinesis Firehose. This method involves several steps: 
    • Create a Kinesis Firehose delivery stream with the S3 bucket as its destination.
    • Configure a subscription filter on the twtech CloudWatch log group to send an ongoing stream of log events to the Firehose delivery stream. 
NB:
  • This approach avoids the one active task quota limitation of the one-time export method and allows for more complex processing of the log data before it lands in S3. 
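
A hedged sketch of the subscription-filter step, assuming the Firehose delivery stream and an IAM role that allows CloudWatch Logs to put records to it already exist (both ARNs below are placeholders):

import boto3

logs = boto3.client("logs")

# Attach a subscription filter that streams matching log events to Firehose.
logs.put_subscription_filter(
    logGroupName="/twtech/app/logs",     # placeholder log group
    filterName="twtech-to-firehose",
    filterPattern="",                    # empty pattern = forward all events
    destinationArn="arn:aws:firehose:us-east-2:accountID:deliverystream/twtech-logs-to-s3",
    roleArn="arn:aws:iam::accountID:role/twtech-cwlogs-to-firehose-role"  # role CloudWatch Logs assumes
)
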
Considerations
    • Permissions: Ensure the required IAM policies are in place for the export to succeed.
    • Encryption: CloudWatch Logs supports exporting to S3 buckets encrypted with SSE-S3 or SSE-KMS keys.
    • Automation: For automated or recurring one-time exports, twtech can use the AWS CLI's create-export-task command, often within a Lambda function triggered by an EventBridge schedule.
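
For that automation pattern, a minimal Lambda handler sketch (invoked by an EventBridge schedule) might look like the following; the log group, bucket, and prefix are placeholder values, and the handler exports the previous UTC day to stay within a 24-hour window per task.

import boto3
from datetime import datetime, timedelta, timezone

logs = boto3.client("logs")

def lambda_handler(event, context):
    # Export the previous UTC day (one 24-hour window per export task).
    end = datetime.now(timezone.utc).replace(hour=0, minute=0, second=0, microsecond=0)
    start = end - timedelta(days=1)
    task = logs.create_export_task(
        taskName=f"twtech-daily-{start:%Y-%m-%d}",
        logGroupName="/twtech/app/logs",              # placeholder log group
        fromTime=int(start.timestamp() * 1000),
        to=int(end.timestamp() * 1000),
        destination="twtech-log-archive-bucket",
        destinationPrefix=f"exports/app/{start:%Y/%m/%d}"
    )
    return {"taskId": task["taskId"]}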

1. Overview

    • CloudWatch Logs lets twtech collect logs from EC2, Lambda, ECS, EKS, on-prem agents, etc.
    • Sometimes twtech needs long-term retention or analytics beyond CloudWatch’s native features.
    • Exporting logs to Amazon S3 enables integration with:
      • Amazon Athena (SQL queries on logs)
      • Amazon OpenSearch Service (search & dashboards)
      • Amazon EMR / Glue (ETL processing)
      • SIEM solutions or custom pipelines

NB:

  •  CloudWatch Logs S3 export is not real-time. 
  • It’s a batch job
    • A batch job is a method of running high-volume, repetitive data processes without manual intervention
    • These jobs are scheduled to run, often during off-peak hours, to handle large datasets, such as processing records or system updates. 

2. Ways to Export Logs to S3

(A) Manual Export Task

  • Uses the CreateExportTask API.
  • twtech needs to specify:
    • logGroupName - the log group to export
    • from & to - the time range (epoch timestamps, in milliseconds)
    • destination - the destination S3 bucket
    • destinationPrefix - the S3 folder path (key prefix)
  • Limitations:
    • Max 1 concurrent export task per account per region
    • Max 100 export tasks per account per region per day
    • Time range can’t exceed 24 hours per export task
    • Export is asynchronous (can take minutes–hours depending on size)
  • Output format:
    • Logs are gzipped JSON text files
    • Stored in S3 partitioned by:
 <destinationPrefix>/<twtechlogGroupName>/<YYYY>/<MM>/<DD>/<HH>/<twtechfilename>.gz
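
Because only one export task can be active at a time per account and Region, an automated caller typically waits for any pending or running task to finish before creating the next one. A minimal polling sketch using describe_export_tasks:

import time
import boto3

logs = boto3.client("logs")

def wait_for_free_slot(poll_seconds=30):
    # Block until no export task is PENDING or RUNNING in this account/region.
    while True:
        busy = False
        for status in ("PENDING", "RUNNING"):
            tasks = logs.describe_export_tasks(statusCode=status).get("exportTasks", [])
            if tasks:
                busy = True
                break
        if not busy:
            return
        time.sleep(poll_seconds)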

(B) Subscription Filters (Real-Time Alternative)

    • For near real-time streaming to S3, twtech uses Kinesis Data Firehose as a subscription target.
    • Flow:
      • CloudWatch Logs → Subscription Filter → Kinesis Data Firehose → S3
    • Pros:
      • Continuous delivery (seconds to minutes latency)
      • Compression, buffering, partitioning support
    • Cons:
      • Additional cost for Firehose
      • Some setup complexity compared to export tasks
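
A hedged sketch of the Firehose side of this flow, assuming an IAM role that lets Firehose write to the bucket already exists (the role ARN and prefix are placeholders); the buffering hints and GZIP compression correspond to the pros listed above.

import boto3

firehose = boto3.client("firehose")

firehose.create_delivery_stream(
    DeliveryStreamName="twtech-logs-to-s3",
    DeliveryStreamType="DirectPut",
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::accountID:role/twtech-firehose-to-s3-role",  # placeholder role
        "BucketARN": "arn:aws:s3:::twtech-log-archive-bucket",
        "Prefix": "streaming/app/",                                           # S3 key prefix for delivered objects
        "BufferingHints": {"SizeInMBs": 5, "IntervalInSeconds": 300},
        "CompressionFormat": "GZIP"
    }
)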

3. IAM Permissions Required On the CloudWatch Logs Side:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateExportTask",
        "logs:DescribeExportTasks",
        "logs:CancelExportTask"
      ],
      "Resource": "*"
    }
  ]
}

Sample bucket policy to allow CloudWatch Logs to write, on the S3 Bucket Side:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "logs.amazonaws.com"
      },
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::twtech-log-archive-bucket/*",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "accountID"
        },
        "ArnLike": {
          "aws:SourceArn": "arn:aws:logs:us-east-2:accountID:*"
        }
      }
    }
  ]
}

4. Best Practices

 Choose export vs. subscription carefully

    • Use Export Task for historical or bulk one-off exports.
    • Use Firehose Subscription for continuous streaming.

 Partition your S3 data wisely

    • Use destinationPrefix for business-friendly partitions (e.g., /service/app1/).
    • Leverage year/month/day/hour partitioning for Athena queries.

 Enable lifecycle policies on S3

    • Transition logs to cheaper storage (Glacier, Deep Archive) for long-term retention.
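
A minimal sketch of such a lifecycle rule, assuming exported objects live under the exports/ prefix (the transition days and storage classes are illustrative):

import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="twtech-log-archive-bucket",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "archive-exported-logs",
            "Filter": {"Prefix": "exports/"},       # only applies to exported log objects
            "Status": "Enabled",
            "Transitions": [
                {"Days": 90, "StorageClass": "GLACIER"},
                {"Days": 365, "StorageClass": "DEEP_ARCHIVE"}
            ]
        }]
    }
)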

 Use Glue/Athena for queries

    • Create Glue tables on top of exported JSON logs.
    • Run ad hoc SQL with Athena instead of re-ingesting logs into CloudWatch.
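
Once a Glue table exists over the exported objects (created by a crawler, for example), queries can be submitted through the Athena API; the database, table name, and result location below are assumptions.

import boto3

athena = boto3.client("athena")

query = athena.start_query_execution(
    QueryString="SELECT * FROM twtech_exported_logs LIMIT 10",  # assumed Glue table name
    QueryExecutionContext={"Database": "twtech_logs_db"},       # assumed Glue database
    ResultConfiguration={"OutputLocation": "s3://twtech-log-archive-bucket/athena-results/"}
)
print("Query execution ID:", query["QueryExecutionId"])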

 Monitor costs

    • Export task API calls are free, but twtech pays for:
      • S3 storage
      • S3 requests
      • Athena queries / Firehose costs (if streaming)

5. Common Pitfalls

    • Not real-time: Export tasks can lag by hours for large log groups.
    • Daily limit: 100 export tasks per region/account.
    • Concurrent limit: Only one active export task at a time per region/account.
    • Missing S3 bucket policy: Logs silently fail if CloudWatch Logs can’t PutObject.
    • Querying raw gzipped JSON: twtech needs Glue crawlers or Athena schema definition.

6. Architecture Comparison

Method                     | Latency         | Cost                   | Use Case
Export Task (Batch)        | Minutes–Hours   | Low (S3 only)          | Historical exports, archive
Firehose Subscription (RT) | Seconds–Minutes | Higher (Firehose + S3) | Continuous ingestion, analytics
Lambda + S3                | Near RT         | Higher (Lambda + S3)   | Custom transformation before S3

