Sunday, July 20, 2025

Amazon SQS | With an Auto Scaling Group (ASG).

 

Using Amazon SQS with an Auto Scaling Group (ASG) is a powerful way to automatically scale twtech's compute resources based on queue depth (how many messages are waiting to be processed).

This pattern is common in decoupled, event-driven architectures where producers send tasks to SQS and consumers in an ASG process them.

 Use Case

twtech has a system where tasks are pushed to an SQS queue and wants its EC2 instances (in an ASG) to scale in/out depending on the number of pending messages.

 Architecture Components

1.     Amazon SQS Queue – Holds tasks/messages.

2.     Auto Scaling Group – Manages EC2 instances.

3.     CloudWatch Metrics & Alarms – Monitors queue size.

4.     Scaling Policies – Triggers ASG to scale based on CloudWatch alarms.

5.     Optional: Lambda/Container/EC2-based workers consuming messages.


Step-by-Step Integration

1. Enable CloudWatch Metrics for the SQS Queue

Use the "ApproximateNumberOfMessagesVisible" metric (visible messages not yet processed); SQS publishes this metric to CloudWatch automatically for active queues, so no extra configuration is usually needed.
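
To confirm the metric is flowing, twtech can query it from the CLI. A minimal sketch, assuming a placeholder queue name (my-task-queue) and GNU date on the machine running the command:

# bash
# Read the average queue depth over the last hour (queue name is a placeholder)
aws cloudwatch get-metric-statistics \
  --namespace AWS/SQS \
  --metric-name ApproximateNumberOfMessagesVisible \
  --dimensions Name=QueueName,Value=my-task-queue \
  --statistics Average \
  --period 300 \
  --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)"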

2. Create an Auto Scaling Group (ASG)

Create an ASG with a launch template or configuration that spins up EC2 instances running your consumer app (polling SQS and processing messages).
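
As a rough illustration, the ASG could be created with the AWS CLI along these lines (a sketch only; the launch template name, subnet IDs, and sizes are placeholders):

# bash
# Create the worker ASG from an existing launch template (all names/IDs are placeholders)
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name my-worker-asg \
  --launch-template LaunchTemplateName=my-worker-template,Version='$Latest' \
  --min-size 1 \
  --max-size 20 \
  --desired-capacity 2 \
  --vpc-zone-identifier "subnet-aaaa1111,subnet-bbbb2222"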

3. Create Scaling Policies

Use CloudWatch alarms to scale EC2 instances in or out (a CLI sketch follows this step):

·        Scale-Out Policy:

o   Triggered when ApproximateNumberOfMessagesVisible > threshold (e.g., 100).

o   Increase desired capacity of ASG.

·        Scale-In Policy:

o   Triggered when ApproximateNumberOfMessagesVisible < threshold (e.g., 10).

o   Decrease desired capacity of ASG.

twtech can also divide the number of visible messages by how many messages each instance can handle to determine desired capacity.
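
As an illustration, simple scaling policies can be attached to the ASG from the CLI; the ASG name, policy names, and adjustment sizes below are placeholders:

# bash
# Scale-out policy: add 2 instances when the high-depth alarm fires
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name my-worker-asg \
  --policy-name sqs-scale-out \
  --adjustment-type ChangeInCapacity \
  --scaling-adjustment 2

# Scale-in policy: remove 1 instance when the low-depth alarm fires
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name my-worker-asg \
  --policy-name sqs-scale-in \
  --adjustment-type ChangeInCapacity \
  --scaling-adjustment=-1

Each command returns a policy ARN; the CloudWatch alarms that invoke those ARNs are sketched in the queue-depth monitoring section further below.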

4. Custom Metric (Optional)

To scale more precisely, you can use a custom CloudWatch metric:

# text
Messages per instance = ApproximateNumberOfMessagesVisible / Number of EC2 instances
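
One way to publish such a metric is a small script run on a schedule (cron or a scheduled Lambda). This is only a sketch; the queue URL, ASG name, and the Custom/SQS namespace are placeholders, and it assumes the AWS CLI is configured:

# bash
QUEUE_URL="https://sqs.us-east-1.amazonaws.com/123456789012/my-task-queue"   # placeholder

# Visible messages (note: the queue attribute is ApproximateNumberOfMessages,
# while the corresponding CloudWatch metric is ApproximateNumberOfMessagesVisible)
VISIBLE=$(aws sqs get-queue-attributes \
  --queue-url "$QUEUE_URL" \
  --attribute-names ApproximateNumberOfMessages \
  --query 'Attributes.ApproximateNumberOfMessages' --output text)

# Number of instances currently in the ASG (placeholder ASG name)
COUNT=$(aws autoscaling describe-auto-scaling-groups \
  --auto-scaling-group-names my-worker-asg \
  --query 'length(AutoScalingGroups[0].Instances)' --output text)

# Publish messages-per-instance as a custom metric (skip when the ASG is empty)
[ "$COUNT" -gt 0 ] && aws cloudwatch put-metric-data \
  --namespace Custom/SQS \
  --metric-name MessagesPerInstance \
  --value $(( VISIBLE / COUNT ))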

5. Configure EC2 Consumers

Each EC2 instance in the ASG should:

·        Poll messages from SQS (using long polling).

·        Process and delete the messages.

Use IAM instance roles to grant the instances permission to access the SQS queue.
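
A bare-bones consumer loop might look like the sketch below; the queue URL is a placeholder, jq is assumed to be installed, and a real worker would add error handling and batching:

# bash
QUEUE_URL="https://sqs.us-east-1.amazonaws.com/123456789012/my-task-queue"   # placeholder

while true; do
  # Long poll for up to 20 seconds to reduce empty responses
  MSG=$(aws sqs receive-message --queue-url "$QUEUE_URL" \
    --wait-time-seconds 20 --max-number-of-messages 1 \
    --query 'Messages[0]' --output json)
  if [ -z "$MSG" ] || [ "$MSG" = "null" ]; then continue; fi

  BODY=$(echo "$MSG" | jq -r '.Body')
  RECEIPT=$(echo "$MSG" | jq -r '.ReceiptHandle')

  # ... process "$BODY" here ...

  # Delete only after successful processing
  aws sqs delete-message --queue-url "$QUEUE_URL" --receipt-handle "$RECEIPT"
done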

 Best Practices

·        Long polling: Use long polling to reduce API calls and costs.

·        Visibility timeout: Tune the visibility timeout based on average processing time.

·        Dead-letter queue (DLQ): Set up a DLQ to handle failed messages.

·        Graceful shutdown: Handle termination lifecycle hooks so that instances finish processing in-flight messages before they are terminated.
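
For the graceful-shutdown point, an Auto Scaling lifecycle hook can hold a terminating instance until in-flight messages are drained. A sketch, with placeholder hook/ASG names and timeout:

# bash
# Register a termination lifecycle hook on the ASG (names/timeout are placeholders)
aws autoscaling put-lifecycle-hook \
  --lifecycle-hook-name drain-sqs-worker \
  --auto-scaling-group-name my-worker-asg \
  --lifecycle-transition autoscaling:EC2_INSTANCE_TERMINATING \
  --heartbeat-timeout 300 \
  --default-result CONTINUE

# On the instance, once in-flight messages are processed, release the hook
# (IMDSv1 instance-id lookup shown for brevity)
aws autoscaling complete-lifecycle-action \
  --lifecycle-hook-name drain-sqs-worker \
  --auto-scaling-group-name my-worker-asg \
  --lifecycle-action-result CONTINUE \
  --instance-id "$(curl -s http://169.254.169.254/latest/meta-data/instance-id)"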

 Alternative: Lambda + SQS (for serverless)

If the twtech workload consists of short-running jobs and scales well with serverless, consider using AWS Lambda with SQS; Lambda handles the scaling seamlessly.
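
For reference, wiring an existing Lambda function to the queue is a single event source mapping; a sketch with placeholder function name and queue ARN:

# bash
# Let Lambda poll the queue and scale consumers automatically (name/ARN are placeholders)
aws lambda create-event-source-mapping \
  --function-name my-worker-fn \
  --event-source-arn arn:aws:sqs:us-east-1:123456789012:my-task-queue \
  --batch-size 10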

twtech Insights:

SQS Queue Depth refers to the number of messages currently in the queue waiting to be processed.

 It's a critical metric when using Amazon SQS in systems that need to scale based on workload, such as those using Auto Scaling Groups (ASGs).

 The concept: Queue Depth

Queue depth is typically represented by these CloudWatch metrics:

·        ApproximateNumberOfMessagesVisible – The number of messages available for retrieval from the queue.

·        ApproximateNumberOfMessagesNotVisible – The number of messages that are currently being processed (in flight).

·        ApproximateNumberOfMessagesDelayed – The number of messages delayed and not yet visible.


For ASG scaling, the most relevant is:

ApproximateNumberOfMessagesVisible – because it tells you how many messages are sitting in the queue, unprocessed.

 Use Case: Scaling Based on Queue Depth

Let’s say each EC2 instance can handle ~10 messages/minute. If twtech has 200 visible messages, it might want its Auto Scaling Group to scale out to 20 instances.

Formula Example:

# bash
Desired Instance Count = ceil(ApproximateNumberOfMessagesVisible / MessagesPerInstance)
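
Applied to the numbers above (200 visible messages, ~10 messages per instance), a quick shell check of the ceiling division gives 20 instances:

# bash
VISIBLE=200
PER_INSTANCE=10
# Integer ceiling division: (a + b - 1) / b
DESIRED=$(( (VISIBLE + PER_INSTANCE - 1) / PER_INSTANCE ))
echo "$DESIRED"   # prints 20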

 How to Monitor Queue Depth

twtech can monitor queue depth in Amazon CloudWatch:

·        Navigate to CloudWatch > Metrics > SQS > QueueName

·        Use ApproximateNumberOfMessagesVisible as the trigger metric for twtech alarms.

·        Set thresholds (e.g., trigger scale-out if > 100, scale-in if < 10).
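
These thresholds can be expressed as CloudWatch alarms whose actions are the scaling policies from step 3; a sketch, with a placeholder queue name and placeholder policy ARNs:

# bash
# Scale out when more than 100 messages are visible for 2 consecutive minutes
aws cloudwatch put-metric-alarm \
  --alarm-name sqs-depth-high \
  --namespace AWS/SQS \
  --metric-name ApproximateNumberOfMessagesVisible \
  --dimensions Name=QueueName,Value=my-task-queue \
  --statistic Average --period 60 --evaluation-periods 2 \
  --threshold 100 --comparison-operator GreaterThanThreshold \
  --alarm-actions "$SCALE_OUT_POLICY_ARN"   # placeholder ARN from put-scaling-policy

# Scale in when fewer than 10 messages are visible
aws cloudwatch put-metric-alarm \
  --alarm-name sqs-depth-low \
  --namespace AWS/SQS \
  --metric-name ApproximateNumberOfMessagesVisible \
  --dimensions Name=QueueName,Value=my-task-queue \
  --statistic Average --period 60 --evaluation-periods 2 \
  --threshold 10 --comparison-operator LessThanThreshold \
  --alarm-actions "$SCALE_IN_POLICY_ARN"    # placeholder ARN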

 Best Practices

1.     Use Long Polling: Reduces empty responses and cost.

2.     Tune Visibility Timeout: Prevents premature reprocessing of in-flight messages.

3.     Dead-Letter Queues (DLQ): Capture unprocessable messages (a CLI sketch follows this list).

4.     Throttle Consumers Based on Depth: Helps avoid over-scaling when queue drains quickly.
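
For the DLQ point, a redrive policy on the source queue moves messages to a dead-letter queue after repeated failed receives. A sketch, reusing the placeholder $QUEUE_URL from earlier and assuming the DLQ already exists (its ARN and maxReceiveCount are placeholders):

# bash
# Attach a redrive policy: after 5 failed receives, move the message to the DLQ
aws sqs set-queue-attributes \
  --queue-url "$QUEUE_URL" \
  --attributes '{"RedrivePolicy":"{\"deadLetterTargetArn\":\"arn:aws:sqs:us-east-1:123456789012:my-task-dlq\",\"maxReceiveCount\":\"5\"}"}'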
