Design patterns: stateless function, fan-out, orchestration vs choreography, Lambda Destinations, dead letter queue
1 / 5
A backend engineer explains a latency complaint to their manager: "Our Lambda function processes image uploads from S3. During business hours it's fast — 50ms. But first thing in the morning, the first few requests take 2-3 seconds. After those, it's fine again. This is a cold start problem. AWS has to allocate a new execution environment: download the deployment package, initialize the runtime, run our initialization code. Once the environment exists and is warm, subsequent invocations reuse it instantly. The fix options are provisioned concurrency — which keeps environments pre-warmed — or reducing the package size and init code." What is a cold start in serverless computing?
Cold start anatomy: when Lambda needs a new execution environment, it (1) provisions a Firecracker micro-VM, (2) downloads and extracts the deployment package or container image, (3) initializes the language runtime (JVM startup is notorious here), and (4) runs any code outside the handler function (global initialization, database connection setup, SDK client initialization). Only then does the handler run; this is the invoke phase. Subsequent invocations on the same environment skip steps 1-4 (warm start).
Cold start factors: Runtime: interpreted languages (Node.js, Python) typically start faster than JVM languages (Java, Kotlin). Package size: larger packages take longer to download and extract. VPC configuration: historically added 10+ seconds for ENI attachment; the Hyperplane ENI rollout that began in 2019 removed most of that penalty. Memory allocation: more memory = more CPU, which can reduce init time.
Mitigations: Provisioned concurrency: pre-initializes N execution environments, eliminating cold starts for up to N concurrent requests, at extra cost. Reduce package size: use esbuild/webpack to tree-shake and ship only what the function needs; Lambda layers help share dependencies, but they still count toward the unzipped size limit. Keep heavy init out of the handler: create DB connections and SDK clients once at module scope so warm invocations reuse them, and keep that init code lean because it runs on every cold start.
In conversation: 'For user-facing APIs, cold starts are usually unacceptable. Budget for provisioned concurrency on your latency-sensitive functions.'
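The "initialize once, reuse while warm" idea can be sketched in a few lines of Python. This is a minimal illustration, not a production handler: `create_client` is a hypothetical stand-in for an SDK client or database connection, and the `_is_cold` flag is just one simple way to tell the first invocation on an environment apart from later ones.

```python
import time


def create_client():
    """Hypothetical stand-in for an expensive SDK client or DB connection."""
    time.sleep(0.05)  # simulate slow setup work
    return {"connected": True}


# Module scope runs once per execution environment, during the init phase.
CLIENT = create_client()  # paid once per cold start, reused while warm
_is_cold = True           # flips to False after the first invocation


def handler(event, context):
    """Report whether this invocation landed on a fresh environment."""
    global _is_cold
    cold, _is_cold = _is_cold, False
    return {"cold_start": cold, "client_ready": CLIENT["connected"]}
```

Calling `handler` twice in the same process mimics two invocations on one environment: the first reports `cold_start` as true, the second as false, and neither call pays for `create_client` again.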
2 / 5
A solutions architect explains a production incident to new engineers: "During our Black Friday sale, we suddenly got a 429 TooManyRequestsException from Lambda. Our function hit the account-level concurrency limit. By default, AWS Lambda allows 1,000 concurrent executions per region across all functions. We had one function consuming 950 of them, starving everything else. We fixed it by setting reserved concurrency on the high-volume function — capping it at 500 — so other functions always have headroom. For the critical payment function, we set reserved concurrency to 100, guaranteeing it always has capacity." What is the difference between reserved concurrency and provisioned concurrency?
Concurrency: the number of function instances handling requests simultaneously. Each concurrent invocation requires its own execution environment.
Reserved concurrency: carves out N units from the account pool exclusively for this function. Dual effect: guarantees that up to N concurrent executions are always available to the function (it can't be starved); caps the function at N (requests beyond that are throttled with 429). Setting reserved = 0 effectively disables a function. No extra cost.
Provisioned concurrency: pre-initializes N environments so they are ready before any invocation. Eliminates cold starts for up to N concurrent requests. Has a cost: you pay for the pre-initialized environments even when idle. Useful for latency-sensitive APIs. Can be combined with reserved: reserve 100 for a function, provision 20 of them warm.
Lambda scaling vocabulary: Burst limit: historically an initial burst of 500-3,000 concurrent executions (varies by region), then scaling at +500/minute; since late 2023 each function can instead scale by up to 1,000 concurrent executions every 10 seconds. Account concurrency limit: default 1,000 per region (can be increased via Service Quotas). Throttle: Lambda returns 429 (TooManyRequestsException) when a concurrency limit is exceeded. For synchronous invocations (for example, via API Gateway), the caller receives the error. For asynchronous invocations, Lambda keeps retrying throttled events (by default for up to 6 hours) before discarding them or sending them to the DLQ. Lambda Destinations: route async invocation results (success or failure) to SQS, SNS, EventBridge, or another Lambda; a more flexible alternative to a DLQ for async flows.
In conversation: 'Reserve concurrency for your most important functions first, not as an afterthought after an incident.'
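As a rough sketch of how the two settings are applied together, the helper below calls the two Lambda API operations involved. It assumes `lambda_client` is a `boto3.client("lambda")` (passed in so the code can be exercised without AWS credentials); the helper name, function name, and alias are hypothetical.

```python
def configure_concurrency(lambda_client, function_name, alias, reserved, provisioned):
    """Apply a reserved cap and a provisioned warm pool to one function.

    lambda_client is assumed to be boto3.client("lambda"); injecting it
    keeps this sketch testable with a fake client.
    """
    if provisioned > reserved:
        # Provisioned environments are drawn from the function's reserved pool.
        raise ValueError("provisioned concurrency cannot exceed reserved concurrency")

    # Reserved concurrency: carve-out plus hard cap, no extra charge.
    lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=reserved,
    )
    # Provisioned concurrency: pre-initialized environments, billed while idle.
    # It attaches to a published version or alias, not to $LATEST.
    lambda_client.put_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=alias,
        ProvisionedConcurrentExecutions=provisioned,
    )
```

For the "reserve 100, provision 20" case above, a call might look like `configure_concurrency(boto3.client("lambda"), "payments", "live", 100, 20)`, where `payments` and `live` are placeholder names.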
3 / 5
A developer explains a design principle to a junior engineer joining the serverless team: "Serverless functions must be stateless. An execution environment may be reused for later invocations, but you cannot rely on it. Lambda may spin up a new environment at any time: after an update, after a period of inactivity, after scaling. Anything you store in memory between invocations might be gone. State belongs in external storage: DynamoDB for application state, S3 for files, ElastiCache for caching, SQS for work queues. The /tmp directory persists within the same environment but is not shared between instances." Why must serverless functions be stateless?
Stateless design: each invocation should be self-contained, reading any needed state from external sources and writing results back to external stores. This lets the platform scale horizontally (many concurrent instances) and replace environments freely (after code updates, timeouts, or idle recycling).
Execution environment lifecycle: Init: environment created, global code runs once. Invocations 1-N: handler called repeatedly, environment reused. Shutdown: environment recycled after an idle period (undocumented; commonly observed to be on the order of minutes) or whenever the platform decides. In-memory data from invocation N is available in invocation N+1 on the same environment (useful for connection reuse), but it can vanish at any time and is never shared across concurrent instances.
External state stores for serverless: DynamoDB: key-value / document store, serverless-native (on-demand billing, per-request capacity). S3: object storage for files, large data. ElastiCache / DAX: in-memory caching (requires VPC). SQS: durable work queue with at-least-once delivery. Step Functions: managed workflow state across multiple Lambda invocations. /tmp: 512MB-10GB ephemeral storage local to the execution environment; good for caching downloads within a single environment's lifetime, but not shared between environments and gone when the environment is recycled.
In conversation: 'If you find yourself setting a module-level variable to cache state between calls, ask: what happens when Lambda scales to 100 concurrent instances? Each has its own copy. Shared state needs an external store such as DynamoDB.'
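The "each instance has its own copy" problem is easy to see in a toy model. The sketch below is not Lambda code: `Environment` is a hypothetical class standing in for one execution environment, and a plain dict stands in for an external store such as DynamoDB (a real store would need an atomic update rather than this read-then-write).

```python
class Environment:
    """Toy model of one execution environment with its own memory."""

    def __init__(self, shared_store):
        self.local_count = 0              # in-memory state, private to this environment
        self.shared_store = shared_store  # stand-in for an external store

    def handler(self, event):
        self.local_count += 1
        # Read-modify-write is fine in this single-threaded toy;
        # a real store would use an atomic counter update.
        self.shared_store["count"] = self.shared_store.get("count", 0) + 1
        return {"local": self.local_count, "shared": self.shared_store["count"]}


store = {}
env_a, env_b = Environment(store), Environment(store)

# The platform may route each request to any environment.
results = [env.handler({}) for env in (env_a, env_b, env_a, env_b)]
```

After four requests spread over two environments, each in-memory counter has only seen two of them, while the shared store has seen all four.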
4 / 5
A cloud architect presents an event-driven pipeline design: "When a user uploads a document, API Gateway calls our upload Lambda and waits for the response. That Lambda fans out: it sends a message about the document to SQS queues consumed by three processing functions: one for text extraction, one for virus scanning, one for metadata indexing. These three run in parallel. None waits for the others. The upload Lambda is invoked synchronously; the processing Lambdas are decoupled from it and run whenever Lambda pulls messages from their queues. The difference matters for error handling: synchronous failures return to the caller; messages that keep failing in the background end up in the Dead Letter Queue." What is the difference between synchronous and asynchronous Lambda invocation?
Invocation models: Lambda supports three invocation models.
Synchronous (push): caller waits for function to return. Triggers: API Gateway, ALB, Lambda URLs, SDK direct invoke. Error handling: exception propagates to caller. The caller must implement retries.
Asynchronous (event): Lambda queues the event internally and returns 202 immediately. Triggers: S3, SNS, EventBridge, CloudWatch Logs. Error handling: Lambda retries twice on failure by default (configurable from 0 to 2). After retries, it sends the event to a DLQ (SQS or SNS) or a Lambda Destination, if one is configured.
Polling (stream/queue): Lambda polls the source on your behalf. Triggers: SQS, Kinesis, DynamoDB Streams, MSK. Error handling: on failure, Lambda retries the batch. For SQS, messages that keep failing move to a DLQ configured on the source queue (its redrive policy). For streams, processing of that shard stops at the failed batch until it succeeds or the records expire, unless you configure retry limits or an on-failure destination.
Fan-out pattern: one function triggers multiple downstream functions or services, which enables parallel processing. Implement via SNS (broadcast to multiple SQS queues or Lambdas), EventBridge (content-based routing), or Step Functions (Parallel state for fixed branches, Map state for one branch per item).
Dead Letter Queue vocabulary: DLQ: messages/events that failed all retry attempts land here. Enables manual investigation and reprocessing. Separate from Lambda Destinations (which capture both success and failure outcomes).
In conversation: 'Always configure a DLQ for async Lambda triggers. Silent failures, events that disappear after retries, are the hardest bugs to diagnose weeks later.'
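A minimal sketch of SNS fan-out plus an SQS-triggered consumer might look like the following. Several things here are assumptions rather than part of the original design: `sns_client` is expected to be a `boto3.client("sns")` passed in by the caller, raw message delivery is assumed to be enabled on the SNS-to-SQS subscriptions (so the SQS body is the published JSON itself), `process` is a hypothetical placeholder for real work, and the event source mapping is assumed to have `ReportBatchItemFailures` turned on so the returned `batchItemFailures` list is honored.

```python
import json


def fan_out(sns_client, topic_arn, document_key):
    """Publish one message; SNS copies it to every subscribed queue."""
    sns_client.publish(TopicArn=topic_arn, Message=json.dumps({"key": document_key}))


def process(body):
    """Hypothetical per-message work; raises on bad input."""
    if "key" not in body:
        raise ValueError("missing document key")


def handler(event, context):
    """SQS consumer that reports only the failed messages back to Lambda."""
    failures = []
    for record in event["Records"]:
        try:
            # Assumes raw message delivery, so the body is the published JSON.
            process(json.loads(record["body"]))
        except Exception:
            # Only this message is retried; after enough failed receives,
            # the queue's redrive policy moves it to the DLQ.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Reporting failures per message means one bad record does not force the whole batch to be retried, which keeps good messages from being reprocessed while the bad one works its way to the DLQ.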
5 / 5
A platform engineer compares serverless and container-based deployments: "The key trade-off is control versus operational overhead. With Lambda, you write your handler, set memory and timeout, configure triggers — AWS handles everything else. No EC2 instances, no OS patching, no container orchestration. Billing is per-invocation and per-millisecond of execution. The downside: you're limited to 15-minute execution time, 10GB of memory, and the cold start penalty. For long-running jobs — video transcoding, ML training — containers or spot instances are better. Lambda excels at event-driven glue code, API backends with spiky traffic, and scheduled tasks." What does the term Function-as-a-Service (FaaS) mean and how does it differ from PaaS?
Cloud service model spectrum: IaaS (EC2: you manage OS, runtime, application) → PaaS (Heroku, App Engine, Elastic Beanstalk: you deploy application, provider manages runtime) → FaaS (Lambda, Cloud Functions, Azure Functions: you deploy individual functions, provider manages everything else) → SaaS (you just use the service).
FaaS characteristics: granular deployment unit (function, not application), event-driven execution model, scale-to-zero (no idle cost), short-lived execution (milliseconds to 15 minutes max), stateless by design, billing based on request count plus duration × memory (GB-seconds).
FaaS vs PaaS: PaaS hosts long-running processes (a Node.js server, a Python Django app) on managed infrastructure; you still have a running server, you just don't manage it. FaaS has no persistent running process you can count on: the function is only guaranteed to exist while it executes.
FaaS vendor examples: AWS Lambda, Google Cloud Functions (1st/2nd gen), Azure Functions, Cloudflare Workers (V8 isolates rather than micro-VMs, with cold starts of a few milliseconds or less).
Lambda constraints: 15-minute max timeout, 10GB max memory, 250MB unzipped deployment package (10GB for container image), 512MB-10GB /tmp, 1,000 default concurrent executions per region.
In conversation: 'FaaS is not a universal replacement for servers. It shines for event processing, API backends with unpredictable traffic, and scheduled jobs. For WebSocket servers, long-running computations, or stateful workflows, look elsewhere.'
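The billing model can be made concrete with a small estimator. This is a back-of-the-envelope sketch: the function name is hypothetical, the default rates are illustrative assumptions (check current pricing for your region and architecture), and free-tier allowances and provisioned concurrency charges are ignored.

```python
def estimate_monthly_cost(invocations, avg_duration_ms, memory_mb,
                          price_per_million_requests=0.20,    # assumed rate, USD
                          price_per_gb_second=0.0000166667):  # assumed rate, USD
    """Estimate cost as a request charge plus a GB-second compute charge."""
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    request_cost = invocations / 1_000_000 * price_per_million_requests
    compute_cost = gb_seconds * price_per_gb_second
    return {
        "gb_seconds": gb_seconds,
        "request_cost": request_cost,
        "compute_cost": compute_cost,
        "total": request_cost + compute_cost,
    }


# One million invocations at 100 ms each with 512 MB of memory.
estimate = estimate_monthly_cost(1_000_000, 100, 512)
```

Under these assumed rates, the example works out to 50,000 GB-seconds and a total of roughly one US dollar, which illustrates why spiky, short-lived workloads tend to be cheap on FaaS while long, memory-heavy ones are not.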