twtech breakdown of Cold Starts, Warm Starts, and Provisioned Concurrency for AWS Lambda: what they are, when they happen, and how to control them.
1. The concept: A Cold Start
A cold start happens when AWS Lambda needs to create a new execution environment before twtech function can run.
It happens when:
- A function is invoked for the first time.
- AWS has scaled up concurrency (new instances).
- An execution environment has been inactive long enough that AWS reclaims it.
Cold start process:
- Download & load code (from S3 or ECR).
- Create runtime container (Node.js, Python, Java, etc.).
- Initialize environment (init phase: run global/static code, import libraries).
- Invoke handler (twtech actual function logic starts).
Cold starts add extra latency:
- Java, .NET, large packages → 500 ms to multiple seconds.
- Python, Node.js, Go → typically 50–300 ms.
2. The concept: A Warm Start
If an execution environment is already running (within the idle retention window), Lambda simply reuses it.
- No container creation → near-instant start.
- Warm start latency: ~1–10 ms (depends on function logic).
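The cold/warm distinction is easy to observe from inside a function: module-level code runs only once per execution environment, so a module-level flag survives across warm invocations. A minimal sketch (the handler and field names here are illustrative, not a standard API):

```python
import time

# Module-level code runs once per execution environment (the init phase).
# On a warm start, Lambda reuses the environment and skips all of this.
_COLD_START = True
_INIT_TIME = time.time()

def handler(event, context=None):
    """Report whether this invocation hit a cold or a warm environment."""
    global _COLD_START
    was_cold = _COLD_START
    _COLD_START = False  # every later call in this environment is warm
    return {"cold_start": was_cold, "env_age_s": time.time() - _INIT_TIME}
```

Calling the handler twice in the same environment returns `cold_start: True` the first time and `False` afterwards; a new environment (after idle reclamation or scale-out) resets the flag.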
3. Provisioned Concurrency
Provisioned Concurrency (PC) keeps execution environments pre-initialized and ready to respond instantly.
- How Provisioned Concurrency works:
- twtech configures N provisioned concurrency instances for a published version or alias of its function.
- AWS keeps them warm 24/7.
- Requests go to these pre-warmed instances first, falling back to on-demand capacity (which may cold-start) once they are exhausted.
- Benefits:
- Eliminates cold starts for predictable workloads.
- Keeps latency consistent.
- Costs:
- twtech pays for:
- Provisioned concurrency hours (like reserved compute).
- Invocations (normal Lambda cost).
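As a concrete sketch, PC can be enabled with the AWS SDK for Python (boto3). The function name, alias, and instance count below are placeholders; running this requires valid AWS credentials and an existing function:

```python
import boto3  # AWS SDK for Python; requires configured AWS credentials

lambda_client = boto3.client("lambda")

# Keep 10 environments pre-initialized for the "prod" alias of "my-fn".
# "my-fn" and "prod" are placeholders for a real function and alias.
lambda_client.put_provisioned_concurrency_config(
    FunctionName="my-fn",
    Qualifier="prod",  # must be a published version or alias, not $LATEST
    ProvisionedConcurrentExecutions=10,
)

# Allocation is asynchronous: the status moves from IN_PROGRESS to READY
# once the environments have been initialized.
resp = lambda_client.get_provisioned_concurrency_config(
    FunctionName="my-fn", Qualifier="prod"
)
print(resp["Status"])
```

The same configuration can be set through the console, CloudFormation, or the `aws lambda put-provisioned-concurrency-config` CLI command.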
4. When to Use Provisioned Concurrency
✅ API endpoints where latency matters.
✅ Scheduled jobs with tight SLAs.
✅ High-traffic events (Black Friday, product launches).
❌ Sporadic, low-frequency functions (waste of cost).
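To see why always-on capacity is wasteful for sporadic workloads, a rough back-of-envelope calculation helps. The per-GB-second rate below is an assumed placeholder, not a current published price; check the AWS Lambda pricing page for twtech region:

```python
# Back-of-envelope Provisioned Concurrency cost.
# The rate below is an ASSUMED placeholder, not a current AWS price.
PC_RATE_PER_GB_SECOND = 0.0000041667  # assumed provisioned-concurrency rate
MEMORY_GB = 1.0                       # a 1024 MB function
INSTANCES = 10                        # provisioned environments
HOURS = 24 * 30                       # kept warm for a 30-day month

pc_cost = PC_RATE_PER_GB_SECOND * MEMORY_GB * INSTANCES * HOURS * 3600
print(f"Provisioned capacity alone: ${pc_cost:.2f}/month")
# Invocation and duration charges are billed on top of this, so a function
# that fires a few times a day pays this full amount for almost no benefit.
```

Under these assumptions the warm capacity alone costs on the order of $100/month, which only pays off when the function actually serves latency-sensitive traffic around the clock.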
5. Cold Start vs. Provisioned Concurrency Flow
| Feature | Cold Start | Provisioned Concurrency |
| --- | --- | --- |
| Init time | 50 ms–3 s+ | ~0 ms |
| Latency impact | Yes, first request after idle | None |
| Cost | Pay only for execution | Pay for warm capacity + execution |
| Scaling | On-demand | Pre-allocated |
| Use case | Sporadic jobs | Predictable-latency workloads |
6. Optimization Without Provisioned Concurrency
If PC cost isn’t justified:
- Keep functions small (minimize package size).
- Choose fast runtimes (Python/Node.js start faster than Java/.NET).
- Avoid heavy init in global scope (lazy-load inside the handler).
- Use Lambda SnapStart (Java) to snapshot pre-initialized state.
- Send periodic “keep warm” pings (CloudWatch scheduled events).
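The lazy-loading tip above can be sketched as follows: instead of importing a heavy dependency at module level (where it inflates the init phase of every cold start), defer the import to the first request that needs it. Here `json` stands in for a genuinely heavy library such as a reporting or ML package:

```python
# Lazy-loading sketch: defer a heavy import until the first request
# that actually needs it, keeping the init phase (and cold starts) fast.
_heavy = None

def _get_heavy():
    """Import the heavy module on first use and cache it for warm calls."""
    global _heavy
    if _heavy is None:
        import json as heavy_lib  # stand-in for a heavy dependency
        _heavy = heavy_lib
    return _heavy

def handler(event, context=None):
    if event.get("make_report"):
        return _get_heavy().dumps({"report": "ok"})
    return "fast path, heavy module never imported"
```

Requests on the fast path never pay the import cost at all, and requests that do need the library pay it once per execution environment rather than on every cold start of unrelated code paths.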