Friday, August 8, 2025

Lambda Concurrency & Throttling.

 

Here’s twtech's deep dive on AWS Lambda Concurrency & Throttling: how it works, what limits apply, and how to avoid “Throttled” errors.

1. The concept: Concurrency in Lambda

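Concurrency is the number of requests a function is handling at the same time, which equals the number of function instances (execution environments) in use. It is counted across twtech's AWS account, per region.
Each concurrent execution runs in an isolated environment with its own memory, CPU, and /tmp storage.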

  • Default account concurrency limit: 1,000 concurrent executions per region (a soft limit; an increase can be requested).
  • All functions in a region share this concurrency pool unless you set reserved concurrency for specific functions.
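
As a rough sizing aid, steady-state concurrency can be estimated as request rate multiplied by average duration. The sketch below is only an illustration: the helper name `estimate_concurrency` and the sample workload are assumptions, not figures from twtech's account.

```python
import math

def estimate_concurrency(requests_per_second, avg_duration_seconds):
    """Estimate steady-state concurrency as rate x duration, rounded up."""
    return math.ceil(requests_per_second * avg_duration_seconds)

# Hypothetical workload: 200 requests/second, each taking 0.5 s on average.
needed = estimate_concurrency(200, 0.5)
print(needed)          # 100
print(needed <= 1000)  # True: fits within the default regional limit
```

If the estimate approaches the account limit, that is the signal to look at reserved concurrency, queuing, or a limit increase.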

2. Types of Concurrency

| Type | Purpose | Key Effect |
| --- | --- | --- |
| Unreserved Concurrency | Shared pool for all functions without specific reservations. | If one function uses too much, others may be throttled. |
| Reserved Concurrency | Dedicated concurrency for a function (also caps its max concurrency). | Prevents noisy neighbor problems; guarantees capacity. |
| Provisioned Concurrency | Pre-warmed instances always ready to handle requests. | Removes cold starts for predictable workloads; billed even when idle. |
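
A hedged sketch of how reserved and provisioned concurrency might be set with boto3. The function name `checkout-handler`, the alias `live`, and the numbers are hypothetical. `put_function_concurrency` and `put_provisioned_concurrency_config` are boto3 Lambda client calls; `boto3` is imported lazily so the request-building helpers run without AWS credentials.

```python
def reserved_concurrency_request(function_name, reserved):
    """Build kwargs for the Lambda PutFunctionConcurrency API."""
    if reserved < 0:
        raise ValueError("reserved concurrency cannot be negative")
    return {
        "FunctionName": function_name,
        "ReservedConcurrentExecutions": reserved,
    }

def provisioned_concurrency_request(function_name, qualifier, provisioned):
    """Build kwargs for PutProvisionedConcurrencyConfig.

    The qualifier must be a published version or an alias.
    """
    if provisioned < 1:
        raise ValueError("provisioned concurrency must be at least 1")
    return {
        "FunctionName": function_name,
        "Qualifier": qualifier,
        "ProvisionedConcurrentExecutions": provisioned,
    }

def apply_settings(reserved_req, provisioned_req):
    """Send both requests to AWS (needs boto3 and valid credentials)."""
    import boto3  # imported here so the helpers above work without boto3
    client = boto3.client("lambda")
    client.put_function_concurrency(**reserved_req)
    client.put_provisioned_concurrency_config(**provisioned_req)

# Hypothetical values: reserve 300, keep 50 of those pre-warmed on alias "live".
reserved_req = reserved_concurrency_request("checkout-handler", 300)
provisioned_req = provisioned_concurrency_request("checkout-handler", "live", 50)
print(reserved_req)
print(provisioned_req)
```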

3. How Scaling Works

When requests come in:

  1. Lambda serves them from existing warm execution environments first, then creates new ones.
  2. If requests exceed the current concurrency, each function can scale up by:
    • +1,000 execution environments every 10 seconds, at the same rate in every region.
    • The older model (a regional burst of 500 to 3,000, then +500 per minute) was replaced in late 2023.
  3. When the function's reserved concurrency or the account concurrency limit is reached → extra requests are throttled. Requests that arrive faster than the scaling rate allows are throttled too.
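
The ramp-up can be sketched as a simple step function. This is a simplified model, not Lambda's actual scheduler: the helper names are hypothetical, and the defaults (1,000 to start, +1,000 every 10 seconds) follow the per-function scaling rate AWS documents at the time of writing. They are plain parameters, so other figures can be plugged in.

```python
def capacity_at(seconds, account_limit=1000, initial=1000, step=1000, interval_s=10):
    """Upper bound on concurrency `seconds` after a traffic spike begins."""
    completed_intervals = int(seconds // interval_s)
    return min(account_limit, initial + step * completed_intervals)

def throttled_at(seconds, demand, **scaling):
    """Requests that exceed the available capacity at a given moment."""
    return max(0, demand - capacity_at(seconds, **scaling))

# Hypothetical spike: 3,500 simultaneous requests, account limit raised to 5,000.
for t in (0, 10, 20, 30):
    print(t, capacity_at(t, account_limit=5000), throttled_at(t, 3500, account_limit=5000))
# 0 1000 2500
# 10 2000 1500
# 20 3000 500
# 30 4000 0
```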

4. Throttling

When Lambda can’t process a request because of concurrency limits, it rejects the request with HTTP 429 (TooManyRequestsException).
The behavior differs depending on the invocation type:

  • Synchronous invocation (e.g., API Gateway, SDK calls):
    • Caller immediately gets a 429 error.
    • twtech can retry from the client side.
  • Asynchronous invocation (e.g., S3 events, EventBridge):
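    • Lambda keeps the event in its internal queue and retries throttled invocations automatically with backoff, for up to 6 hours by default (the two-retry limit applies to function errors, not throttles).
    • If the event expires or retries are exhausted → it goes to a Dead Letter Queue (DLQ) or on-failure destination, if configured.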
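
For synchronous callers, a client-side retry might look like the sketch below: exponential backoff with full jitter. `Throttled` is a hypothetical stand-in for the 429 TooManyRequestsException, and `flaky_invoke` simulates a call that is throttled twice; neither is part of any AWS SDK.

```python
import random
import time

class Throttled(Exception):
    """Hypothetical stand-in for a 429 TooManyRequestsException."""

def call_with_backoff(invoke, max_attempts=5, base_delay=0.2, max_delay=5.0,
                      sleep=time.sleep):
    """Call `invoke`, retrying on Throttled with capped exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return invoke()
        except Throttled:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttle to the caller
            # Full jitter: wait a random time up to the capped exponential delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))

# Simulated call that is throttled twice and then succeeds.
attempts = {"count": 0}

def flaky_invoke():
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise Throttled("429")
    return "ok"

# The no-op sleep keeps the demo instant; omit it to use real delays.
result = call_with_backoff(flaky_invoke, sleep=lambda _delay: None)
print(result, attempts["count"])  # ok 3
```

The jitter matters: without it, many throttled clients retry at the same instant and get throttled again.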

5. Key Quotas Affecting Concurrency

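| Limit | Default | Adjustable? |
| --- | --- | --- |
| Account concurrency | 1,000 per region | ✅ Yes (quota increase) |
| Reserved concurrency per function | None (the function uses the unreserved pool); can be set up to the account limit minus 100 | ✅ Yes |
| Provisioned concurrency per function | None; bounded by the function's reserved concurrency or the account limit, and billed while configured | ✅ Yes |
| Scaling rate per function | +1,000 execution environments every 10 seconds | ❌ No |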

6. Avoiding Throttling

  • Use Reserved Concurrency to protect critical functions.
  • Request concurrency limit increases from AWS Support.
  • Use Provisioned Concurrency for low-latency, predictable workloads.
  • Queue or batch requests with services like SQS or Kinesis.
  • Spread load across regions if architecture allows.
  • Monitor metrics in CloudWatch:
    • ConcurrentExecutions
    • Throttles
    • ProvisionedConcurrencyUtilization
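
A minimal sketch of turning those metrics into a throttle rate. Since CloudWatch does not count throttled requests under Invocations, the rate is Throttles / (Invocations + Throttles). The helper names and the function name `checkout-handler` are hypothetical; `get_metric_statistics` is the boto3 CloudWatch client call, and `boto3` is imported lazily so the pure helpers run without AWS access.

```python
from datetime import datetime, timedelta, timezone

def metric_query(function_name, metric_name, minutes=60):
    """Build kwargs for CloudWatch GetMetricStatistics on a Lambda metric."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Lambda",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,
        "Statistics": ["Sum"],
    }

def throttle_rate(invocations, throttles):
    """Share of requests that were throttled (0.0 when there was no traffic)."""
    total = invocations + throttles
    return throttles / total if total else 0.0

def fetch_sum(function_name, metric_name):
    """Sum a metric over the last hour (needs boto3 and valid credentials)."""
    import boto3  # imported here so the helpers above work without boto3
    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(**metric_query(function_name, metric_name))
    return sum(point["Sum"] for point in response["Datapoints"])

# Hypothetical hour: 950 successful invocations and 50 throttles.
print(throttle_rate(950, 50))  # 0.05
```

For ConcurrentExecutions, the Maximum statistic is usually more informative than Sum.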

7. Quick Example

Let’s say:

  • Account concurrency = 1,000
  • Function A has Reserved Concurrency = 300
  • Function B has no reservation.

If Function A gets 350 requests at once:

  • 300 are processed.
  • 50 are throttled (or retried if async).

Meanwhile, Function B and others share the remaining 700 concurrency.
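
The same arithmetic as a few lines of Python. This is a sketch of the example above only; `admit` is a hypothetical helper that ignores retries and request duration.

```python
def admit(requests, reserved):
    """Split simultaneous requests into (processed, throttled) for one function."""
    processed = min(requests, reserved)
    return processed, requests - processed

ACCOUNT_LIMIT = 1000
RESERVED_A = 300

processed, throttled = admit(350, RESERVED_A)
print(processed, throttled)        # 300 50
print(ACCOUNT_LIMIT - RESERVED_A)  # 700 left for Function B and the others
```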
