Lambda Concurrency & Throttling - Overview.
Scope:
- Intro,
- The concept: Concurrency in Lambda,
- Types of Concurrency,
- How Scaling Works,
- Throttling,
- Key Quotas Affecting Concurrency table,
- How to Avoid Throttling,
- Quick Example.
Intro:
- Here’s the twtech overview of AWS Lambda Concurrency & Throttling.
- This includes:
- How it works,
- What limits apply,
- How to avoid “Throttled” errors.
1. The concept: Concurrency in Lambda
- Concurrency is the number of function instances running at the same time in twtech AWS account, per region.
- Each concurrent execution is an isolated environment with its own memory, CPU, and /tmp storage.
- The default account concurrency limit is 1,000 concurrent executions per region (a soft limit; twtech can request an increase).
- All functions in a region share this concurrency pool unless twtech sets reserved concurrency for specific functions (see the sketch after this list for checking these values).
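Here is a minimal boto3 sketch for inspecting the regional concurrency pool. It assumes AWS credentials and a default region are already configured; the region determines which pool is reported.

```python
# Minimal sketch: inspect the regional concurrency pool with boto3.
# Assumes AWS credentials and a default region are already configured.
import boto3

lambda_client = boto3.client("lambda")

limits = lambda_client.get_account_settings()["AccountLimit"]

print("Total account concurrency:  ", limits["ConcurrentExecutions"])
print("Unreserved concurrency pool:", limits["UnreservedConcurrentExecutions"])
```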
2. Types of Concurrency
| Type | Purpose | Key Effect |
|------|---------|------------|
| Unreserved Concurrency | Shared pool for all functions without specific reservations. | If one function uses too much, others may be throttled. |
| Reserved Concurrency | Dedicated concurrency for a function (also caps its max concurrency). | Prevents noisy-neighbor problems; guarantees capacity. |
| Provisioned Concurrency | Pre-warmed instances always ready to handle requests. | Removes cold starts for predictable workloads; billed even when idle. |
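As a rough illustration of how the three types in the table above are configured, here is a hedged boto3 sketch. The function name `orders-api` and the alias `live` are hypothetical, and the numbers are arbitrary.

```python
# Minimal sketch: configure the concurrency types from the table above.
# "orders-api" and the alias "live" are hypothetical names; adjust for your account.
import boto3

lambda_client = boto3.client("lambda")

# Reserved concurrency: dedicates (and caps) capacity for one function.
lambda_client.put_function_concurrency(
    FunctionName="orders-api",
    ReservedConcurrentExecutions=300,
)

# Provisioned concurrency: pre-warms instances on a published version or alias.
lambda_client.put_provisioned_concurrency_config(
    FunctionName="orders-api",
    Qualifier="live",                      # alias or version, not $LATEST
    ProvisionedConcurrentExecutions=50,
)

# Unreserved concurrency needs no configuration: it is simply whatever remains
# of the account pool after reservations are subtracted.
```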
3. How Scaling Works
When requests come in:
- Lambda starts with burst concurrency:
  - Up to 1,000 concurrent executions per region (most AWS regions) in the first burst.
  - Some regions, such as us-east-1, have a higher initial burst (~3,000).
- If requests exceed the current concurrency, Lambda ramps up at:
  - +500 instances every minute (until hitting the twtech account limit).
- When the account concurrency limit is reached → extra requests are throttled.
- Throttling means the service is intentionally slowing or blocking twtech requests because twtech has exceeded its allowed rate (number of requests per time period).
- Throttling occurs to prevent overload and maintain performance for all users.
- The sketch after this list turns the burst-plus-ramp model into a quick calculation.
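To make the burst-plus-ramp model concrete, here is a small back-of-the-envelope sketch using the numbers quoted above (1,000 burst, +500 per minute). Actual values vary by region and account, so treat this as illustration only.

```python
# Back-of-the-envelope sketch of the scaling model described above:
# an initial burst, then +500 concurrent executions per minute,
# capped at the account concurrency limit.
def available_concurrency(minutes_elapsed: int,
                          burst: int = 1_000,
                          ramp_per_minute: int = 500,
                          account_limit: int = 1_000) -> int:
    """Approximate concurrency Lambda can serve after `minutes_elapsed`."""
    return min(burst + ramp_per_minute * minutes_elapsed, account_limit)

# With the default 1,000 account limit, the burst already covers the cap.
# With a raised limit (say 5,000), the ramp takes a few minutes to reach it:
for minute in range(0, 10, 2):
    print(minute, available_concurrency(minute, account_limit=5_000))
```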
4. Throttling
- When Lambda can’t process a request because of concurrency limits, it throws a 429 TooManyRequestsException.
Behavior differs depending on the invocation type:
- Synchronous invocation (e.g., API Gateway, SDK calls):
  - The caller immediately gets a 429 error.
  - twtech can retry from the client side (a retry sketch follows this list).
- Asynchronous invocation (e.g., S3 events, EventBridge):
  - Lambda automatically retries twice (with delays).
  - After the retries fail → the failed event goes to a Dead Letter Queue (DLQ) or an on-failure destination, if configured.
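For the synchronous case, a simple client-side retry with exponential backoff looks roughly like the sketch below. The function name `orders-api` is a hypothetical placeholder.

```python
# Minimal sketch: client-side retry for synchronous invokes that hit
# a 429 TooManyRequestsException. "orders-api" is a hypothetical function name.
import json
import time

import boto3

lambda_client = boto3.client("lambda")


def invoke_with_retry(payload: dict, attempts: int = 5):
    for attempt in range(attempts):
        try:
            response = lambda_client.invoke(
                FunctionName="orders-api",
                InvocationType="RequestResponse",   # synchronous invoke
                Payload=json.dumps(payload),
            )
            return json.loads(response["Payload"].read())
        except lambda_client.exceptions.TooManyRequestsException:
            # Throttled: back off exponentially before retrying.
            time.sleep(2 ** attempt)
    raise RuntimeError("Still throttled after retries")
```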
5. Key Quotas Affecting Concurrency table
| Limit | Default | Adjustable? |
|-------|---------|-------------|
| Account concurrency | 1,000 | ✅ Yes |
| Reserved concurrency per function | 0 → unlimited (bounded by the account limit) | ✅ Yes |
| Provisioned concurrency per function | Billed limit per account | ✅ Yes |
| Burst concurrency | 1,000 (higher in some regions) | ❌ No |
| Ramp-up rate | +500/min | ❌ No |
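One way for twtech to confirm the current values and adjustability flags behind the table above is the Service Quotas API. A minimal sketch (first page of results only; paginate for the full list):

```python
# Minimal sketch: list the default Lambda quotas with the Service Quotas API
# to see their values and whether each one is adjustable.
import boto3

quotas = boto3.client("service-quotas")

# First page only; paginate on NextToken for the complete list.
response = quotas.list_aws_default_service_quotas(ServiceCode="lambda")
for quota in response["Quotas"]:
    print(f'{quota["QuotaName"]}: {quota["Value"]} '
          f'(adjustable: {quota["Adjustable"]})')
```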
6. How to Avoid Throttling
- Use Reserved Concurrency to protect critical functions.
- Request concurrency limit increases from AWS Support.
- Use Provisioned Concurrency for low-latency, predictable workloads.
- Queue or batch requests with services like SQS or Kinesis.
- Spread load across regions if architecture allows.
- Monitor metrics in CloudWatch (an alarm sketch on the Throttles metric follows this list):
- ConcurrentExecutions
- Throttles
- ProvisionedConcurrencyUtilization
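To catch throttling early, twtech can alarm on the Throttles metric for a critical function, roughly as in this sketch. The function name and SNS topic ARN are hypothetical placeholders.

```python
# Minimal sketch: CloudWatch alarm on the per-function Throttles metric.
# "orders-api" and the SNS topic ARN are hypothetical placeholders.
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="orders-api-throttles",
    Namespace="AWS/Lambda",
    MetricName="Throttles",
    Dimensions=[{"Name": "FunctionName", "Value": "orders-api"}],
    Statistic="Sum",
    Period=60,                      # evaluate per-minute sums
    EvaluationPeriods=1,
    Threshold=1,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    TreatMissingData="notBreaching",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:lambda-alerts"],
)
```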
7. Quick Example
Let’s say:
- Account concurrency = 1,000
- Function A has Reserved Concurrency = 300
- Function B has no reservation.
If Function A gets 350 requests at once:
- 300 requests are processed.
- 50 requests are throttled (or retried, if invoked asynchronously).
- Function B and any other functions without reservations share the remaining 700 unreserved concurrency.
The arithmetic is spelled out in the sketch below.
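The same allocation as a tiny calculation (a sketch of the arithmetic only, not actual Lambda behavior code):

```python
# Worked version of the example above: 1,000 account concurrency,
# Function A reserves 300 and receives 350 simultaneous requests.
account_limit = 1_000
reserved_for_a = 300
requests_to_a = 350

processed_by_a = min(requests_to_a, reserved_for_a)   # 300 run immediately
throttled_for_a = requests_to_a - processed_by_a      # 50 throttled / retried
shared_pool = account_limit - reserved_for_a          # 700 left for B and others

print(processed_by_a, throttled_for_a, shared_pool)   # 300 50 700
```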