Serverless Language

5 exercises — master the vocabulary of serverless computing: cold starts, synchronous vs. asynchronous invocation, execution timeouts, stateless constraints, and concurrency management.

Serverless vocabulary quick reference

Cold start — latency of provisioning a new execution environment (runtime init + package download + init code)
Warm invocation — reuses an existing environment; only the handler runs (much faster)
Synchronous invocation — caller waits for function to complete; return value received (API Gateway, ALB)
Asynchronous invocation — caller fires and forgets; Lambda manages retries + DLQ (S3, SNS, EventBridge)
Execution timeout — maximum wall-clock duration per invocation (AWS Lambda: 15 minutes)
Reserved concurrency — caps a function's max simultaneous executions AND reserves that capacity from the account pool
Provisioned concurrency — pre-warms N environments to eliminate cold starts (costs more)
DLQ (Dead Letter Queue) — receives events after all async retry attempts are exhausted

1 / 5

A developer observes: "The first request to this Lambda function after 15 minutes of inactivity takes 800ms; subsequent requests in the following minutes take 12ms."

What is the 800ms overhead called, and what causes it?

Cold start is the cost of spinning up a new execution environment from scratch — subsequent invocations reuse the warm environment and skip these steps.

What happens during a cold start:

Phase	What occurs	Typical cost
Environment init	Cloud provider allocates a microVM / container for the execution environment	50–200ms
Runtime start	JVM / Node.js / Python / .NET runtime initializes	10–500ms (JVM worst case)
Deployment package download	ZIP or container image is fetched and extracted	Varies by package size
Init code execution	Code outside the handler runs: imports, DB connection setup, SDK initialization	Varies by application
Handler invocation	Your function handler runs	12ms (warm: only this step)

Cold start mitigation strategies:
• Provisioned Concurrency (AWS) — pre-initialize a fixed number of execution environments; no cold start on invocation
• Keep-alive pings — schedule an EventBridge event every few minutes to keep environments warm (not reliable for production)
• Compiled languages — Go and Rust functions typically have much shorter cold starts than JVM functions, which can take 2–6s with large frameworks
• Reduce package size — smaller deployment packages initialize faster; use tree-shaking to exclude unused dependencies
• Move heavy init outside handler — DB connections, loaded ML models, and SDK clients initialized at the module level persist across warm invocations
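The last strategy above can be sketched in a few lines of Python. This is a minimal illustration, not a production handler: `_build_client`, `CLIENT`, and the returned fields are hypothetical names, and the short `time.sleep` merely simulates an expensive setup step such as opening a database connection.

```python
import time

# Module-level code runs once per execution environment, during the
# "init code" phase of a cold start. Warm invocations skip it entirely.
_init_started = time.perf_counter()


def _build_client():
    """Hypothetical expensive setup; a real function might open a DB connection."""
    time.sleep(0.01)  # simulate slow initialization
    return {"connected": True}


CLIENT = _build_client()  # paid once, on the cold start
INIT_SECONDS = time.perf_counter() - _init_started
_is_cold = True  # flips to False after the first invocation


def handler(event, context):
    """Reuses CLIENT on every invocation instead of rebuilding it."""
    global _is_cold
    cold, _is_cold = _is_cold, False
    return {
        "cold_start": cold,
        "client_ready": CLIENT["connected"],
        # Only the first (cold) invocation in this environment paid the init cost.
        "init_seconds": INIT_SECONDS if cold else 0.0,
    }
```

Calling `handler` twice in the same process mimics a cold invocation followed by a warm one: the first result reports `cold_start: True`, the second reports `False` while reusing the same `CLIENT`.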

Key vocabulary:
• Cold start — the latency of initializing a new serverless execution environment when no warm instance is available
• Warm invocation — an invocation that reuses an already-initialized execution environment, skipping the cold start phases
• Provisioned Concurrency — a configuration that keeps a specified number of Lambda execution environments initialized and ready at all times
• Execution environment — the isolated runtime container (microVM in AWS Lambda's case) that hosts a function instance

2 / 5

An architect notes: "This order-confirmation Lambda is triggered by an S3 PutObject event when a receipt file is uploaded — not by an API Gateway HTTP request."

Which invocation model does an S3 event trigger use, and what does this mean for how errors and retries are handled?

S3 event notifications use the asynchronous invocation model — S3 does not wait for your function to complete.

Lambda invocation models:

Model	Caller waits?	Return value	Retry on error	Common triggers
Synchronous	Yes	Returned to caller	Caller's responsibility	API Gateway, ALB, SDK `RequestResponse`
Asynchronous	No	Discarded	Up to 2 automatic retries; DLQ on exhaustion	S3 events, SNS, EventBridge, SES
Poll-based (stream)	N/A	Checkpoint position	Retry until success or the record expires (or a configured maximum retry count / record age is reached); bisect-on-error can split failing batches	SQS, Kinesis, DynamoDB Streams, Kafka (MSK)

Asynchronous Lambda error handling:

S3 event fires → Lambda async queue → attempt 1 → fails
                                      → retry after ~1min → attempt 2 → fails
                                      → retry after ~2min → attempt 3 → fails
                                      → send to Dead Letter Queue (SQS or SNS)
                                      → or Destinations: onFailure → SQS/SNS/EventBridge/Lambda

Practical implications for the S3-triggered function:
• The function may execute 1–3 times if it keeps failing — your logic must be idempotent to avoid duplicate processing
• S3 guarantees at-least-once event delivery — same event can arrive multiple times
• Use a DLQ (Dead Letter Queue) to capture failed events for investigation and reprocessing
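The idempotency requirement above could look roughly like the following Python sketch. It assumes the standard S3 event notification shape (`Records[].s3.bucket.name`, `s3.object.key`, `s3.object.sequencer`); the `claim` helper and the in-memory set are hypothetical stand-ins for a durable store, such as a DynamoDB table written with a conditional put.

```python
# In-memory stand-in for a durable idempotency store. In a real deployment this
# would have to live outside the function, because a module-level set is not
# shared across execution environments.
_processed_keys = set()
confirmations_sent = []  # records the side effect so duplicates are visible


def claim(key):
    """Return True only the first time a key is seen (hypothetical helper)."""
    if key in _processed_keys:
        return False
    _processed_keys.add(key)
    return True


def handler(event, context):
    """Sends one confirmation per unique S3 object version, even on redelivery."""
    handled = 0
    for record in event["Records"]:
        s3 = record["s3"]
        # Bucket + key + sequencer identifies one specific write to the object.
        key = f'{s3["bucket"]["name"]}/{s3["object"]["key"]}#{s3["object"]["sequencer"]}'
        if not claim(key):
            continue  # duplicate delivery or retry: skip the side effect
        confirmations_sent.append(s3["object"]["key"])
        handled += 1
    return {"handled": handled}
```

This sketch claims the key before doing the work, which keeps the example short. A more careful version would release or expire the claim if processing fails partway, so that a legitimate retry is not skipped.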

Key vocabulary:
• Synchronous invocation — the caller waits for the function to complete and receives the return value
• Asynchronous invocation — the caller hands off the event and does not wait; Lambda manages retries independently
• Dead Letter Queue (DLQ) — an SQS or SNS destination that receives events after all retry attempts have been exhausted
• Idempotent — a function that produces the same outcome whether invoked once or multiple times with the same input

3 / 5

A Lambda function that processes large PDF documents fails with the error: "Task timed out after 900.00 seconds."

What is this limit called in serverless contexts, and what is the recommended architectural approach for operations that regularly approach this boundary?

AWS Lambda's maximum execution timeout is 15 minutes (900 seconds). A function's timeout setting is configurable up to that value, but the 15-minute ceiling itself is a hard limit that cannot be raised.

Execution timeout limits across FaaS platforms:

Platform	Max timeout	Notes
AWS Lambda	15 minutes	Hard limit; cannot be increased
Google Cloud Functions	60 minutes (gen 2 HTTP functions)	9 min for gen 1 and for gen 2 event-driven functions
Azure Functions	Unlimited (Premium/Dedicated); 10 min (Consumption)	Durable Functions enable long workflows
Cloudflare Workers	30 seconds (CPU time)	Wall time can be longer for I/O operations

Architectural patterns for long-running operations:
• Step Functions (orchestration) — decompose PDF processing into steps: extract text → analyze → generate summary → store. Each step is a separate Lambda invocation with its own 15-minute budget. A Standard workflow execution can run for up to 1 year
• SQS + long-running worker (ECS/Fargate) — Lambda enqueues the job; a container worker processes it without time constraints
• Async + polling — Lambda starts the job and returns a job ID; the client polls a status endpoint until complete
• Lambda SnapStart (Java) — reduces cold start latency for JVM functions but does not extend the timeout, so it does not help with long-running work
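One way to make the decomposition idea concrete is a handler that works in chunks and stops before the timeout, returning a cursor so an orchestrator (for example, a Step Functions loop) can invoke it again. The sketch below relies on the `get_remaining_time_in_millis()` method of the Lambda Python context object; `process_page`, `SAFETY_MARGIN_MS`, and the event fields `next_page` and `total_pages` are assumptions made for illustration.

```python
SAFETY_MARGIN_MS = 10_000  # assumed buffer for exiting cleanly before the timeout


def process_page(page_number):
    """Hypothetical per-page work (text extraction, analysis, and so on)."""
    return f"page-{page_number}-done"


def handler(event, context):
    """Processes pages until finished or until the time budget is nearly spent.

    Returns a cursor (`next_page`) so the caller can resume where this
    invocation stopped instead of hitting the execution timeout.
    """
    page = event.get("next_page", 0)
    total = event["total_pages"]
    results = []
    while page < total:
        # Stop early rather than let the platform kill the invocation mid-page.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            break
        results.append(process_page(page))
        page += 1
    return {
        "next_page": page,
        "total_pages": total,
        "done": page >= total,
        "results": results,
    }
```

The orchestrator would keep re-invoking the function with the returned `next_page` until `done` is true, so no single invocation needs to finish the whole document.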

Key vocabulary:
• Execution timeout — the maximum wall-clock time a serverless function is permitted to run in a single invocation
• Step Functions — AWS managed workflow service for orchestrating multi-step processes with long timeouts and error handling
• FaaS (Function-as-a-Service) — the serverless computing model where individual functions are the deployment unit (Lambda, Cloud Functions, Azure Functions)
• Orchestration vs. choreography — orchestration uses a central coordinator (Step Functions); choreography uses event reactions between independent services

4 / 5

A developer reports: "I tried storing the user's session token in a global variable so the next request from the same user would be faster — but the next request went to a completely different execution environment and couldn't find it."

Which pattern correctly handles user session state in a serverless architecture?

Serverless functions are stateless by design — execution environments can be created, reused, or terminated at any time. State must live outside the function.

Why in-function state is unreliable:

Request 1 → execution env A (global var set: session = "abc123")
Request 2 → execution env B (new env, global var empty: session = undefined)
Request 3 → execution env A (warm, global var still set: session = "abc123")
Request 4 → execution env B (warm, still empty)

External state store options for serverless:

Store	Use case	Latency
DynamoDB	Session tokens, user preferences, idempotency keys	Single-digit ms
ElastiCache (Redis)	High-frequency session data, rate-limit counters, leaderboards	Sub-millisecond
S3	Large artifacts, intermediate processing results, batch outputs	10–50ms
RDS / Aurora	Relational data; use RDS Proxy to manage connection pooling for Lambda	1–10ms with proxy

The /tmp caveat (why local storage is not the answer):
/tmp (512 MB by default, configurable up to 10 GB) persists across warm invocations within the same execution environment, but is not shared across execution environments. It is suitable for caching data that any environment can re-create on its own (e.g., a downloaded ML model), not for user session data, because the next request may be handled by a different environment.

Stateless function + external state = shared-nothing architecture:
Every execution environment is interchangeable — any instance can handle any request. This enables horizontal scaling with no coordination overhead.
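The difference between in-function state and external state can be modeled in plain Python. In this sketch, `SessionStore` is an in-memory stand-in for an external store such as DynamoDB or Redis, and `ExecutionEnvironment` is a hypothetical class that models one Lambda environment with its own private globals; neither name comes from an AWS API.

```python
class SessionStore:
    """In-memory stand-in for an external store such as DynamoDB or Redis."""

    def __init__(self):
        self._items = {}

    def put(self, session_id, data):
        self._items[session_id] = dict(data)

    def get(self, session_id):
        return self._items.get(session_id)


class ExecutionEnvironment:
    """Models one environment: private globals plus a shared external store."""

    def __init__(self, name, store):
        self.name = name
        self.local_globals = {}  # analogous to module-level variables
        self.store = store

    def login(self, session_id, user):
        self.local_globals[session_id] = user        # visible only in this environment
        self.store.put(session_id, {"user": user})   # visible to every environment

    def whoami_from_global(self, session_id):
        return self.local_globals.get(session_id)

    def whoami_from_store(self, session_id):
        item = self.store.get(session_id)
        return item["user"] if item else None


store = SessionStore()
env_a = ExecutionEnvironment("A", store)
env_b = ExecutionEnvironment("B", store)

env_a.login("abc123", "dana")  # request 1 lands on environment A
lookup_via_global = env_b.whoami_from_global("abc123")  # request 2 on B: not found
lookup_via_store = env_b.whoami_from_store("abc123")    # request 2 on B: found
```

Environment B cannot see A's global variable, but both read the same record from the shared store, which is what makes the environments interchangeable.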

Key vocabulary:
• Stateless function — a function that carries no state between invocations; all state is read from and written to external stores
• Shared-nothing architecture — a design where each execution unit is independent and fully interchangeable, with no shared in-process state
• RDS Proxy — an AWS-managed connection pool for Lambda functions accessing RDS, preventing connection exhaustion during Lambda concurrency spikes

5 / 5

A platform engineer configures reserved concurrency of 50 on a Lambda function that processes database writes for a legacy MySQL database with a connection limit of 60.

What does reserved concurrency do, and why is a limit of 50 intentional in this scenario?

Reserved concurrency serves two purposes: it caps a function's concurrency ceiling AND guarantees a floor of available concurrency within the account.

Concurrency mechanics in AWS Lambda:

Concurrency type	What it does	Effect
Account limit	Default 1,000 concurrent executions per region (adjustable)	Ceiling shared by all functions in the account and region; raised only through a quota increase
Reserved concurrency	Sets per-function max AND reserves that capacity from the account pool	Other functions cannot use the reserved portion; this function cannot exceed it
Provisioned concurrency	Pre-initializes N execution environments (eliminates cold start)	Higher cost; useful for latency-sensitive functions
Unreserved concurrency	Shared pool for functions without a reservation; scales freely up to remaining account limit	One noisy function can starve others

The connection storm problem without reserved concurrency:

Traffic spike → Lambda scales to 400 concurrent executions
400 functions × 1 connection each = 400 MySQL connections
MySQL max_connections = 60 → connection refused, errors cascade

With reserved concurrency = 50:

Traffic spike → Lambda attempts to scale → capped at 50 concurrent executions
50 functions × 1 connection = 50 MySQL connections
MySQL max_connections = 60 → 10 connections spare, system stable
Excess requests → throttled with 429 TooManyRequests (better than DB crash)
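The arithmetic in the two scenarios above can be captured in a small helper. This is a sketch under the same assumption the diagrams make, namely one database connection per execution environment; the function names are hypothetical.

```python
def connection_headroom(max_connections, concurrency, connections_per_env=1):
    """Spare DB connections at peak concurrency; negative means the DB is oversubscribed."""
    return max_connections - concurrency * connections_per_env


def max_safe_concurrency(max_connections, spare, connections_per_env=1):
    """Largest reserved concurrency that still leaves `spare` connections free."""
    return max(0, (max_connections - spare) // connections_per_env)


uncapped = connection_headroom(max_connections=60, concurrency=400)  # spike without a cap
capped = connection_headroom(max_connections=60, concurrency=50)     # reserved concurrency = 50
suggested = max_safe_concurrency(max_connections=60, spare=10)
```

With the numbers from the scenario, `uncapped` is -340 (the database is oversubscribed by 340 connections), `capped` is 10, and `suggested` is 50, matching the configured reservation.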

Key vocabulary:
• Reserved concurrency — a per-function configuration that simultaneously caps maximum concurrency and reserves that capacity from the account pool
• Provisioned concurrency — pre-initializes execution environments to eliminate cold starts (different from reserved concurrency)
• Connection storm — a scenario where a sudden surge in concurrent function executions exhausts the connection limit of a downstream database
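• Throttling (429) — Lambda returns TooManyRequestsException (HTTP 429) to synchronous callers when the reserved concurrency limit is reached; the caller must retry, typically with exponential backoff. For asynchronous invocations, Lambda queues and retries throttled events itself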
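A caller-side retry with exponential backoff might be sketched as follows. The `Throttled` exception is a hypothetical stand-in for a 429 response, `invoke` is any callable supplied by the caller, and the delay parameters are illustrative defaults rather than recommended values. The sketch uses "full jitter", where each wait is a random fraction of the exponentially growing ceiling.

```python
import random
import time


class Throttled(Exception):
    """Stand-in for a 429 TooManyRequests response (hypothetical name)."""


def call_with_backoff(invoke, max_attempts=5, base_delay=0.1, max_delay=5.0,
                      sleep=time.sleep, rng=random.random):
    """Calls `invoke`, retrying on Throttled with exponential backoff and full jitter.

    `sleep` and `rng` are injectable so the behavior can be tested without waiting.
    """
    for attempt in range(max_attempts):
        try:
            return invoke()
        except Throttled:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttle to the caller
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(rng() * ceiling)  # wait a random fraction of the ceiling
```

Randomizing the wait spreads retries from many throttled callers over time, so they are less likely to hit the concurrency cap again in the same instant.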