5 exercises — practice structuring strong English answers for gaming infrastructure engineering interviews: authoritative servers, lag compensation, matchmaking, transport protocols, and session management.
How to structure gaming infrastructure interview answers
Authoritative server questions: server-side simulation → client-side prediction → server reconciliation → anti-cheat
Lag compensation questions: client vs. server time → rewind-and-replay → hit detection fairness trade-off
Matchmaking questions: Elo/Glicko → skill bands → queue time vs. quality trade-off → toxicity signals
Transport protocol questions: UDP advantages → head-of-line blocking → QUIC → when TCP is acceptable
1 / 5
The interviewer asks: "Explain the authoritative server model. Why do games use it, and how does client-side prediction work?" Which answer is most complete?
Option B is strongest. The anti-cheat framing opens by explaining WHY the authoritative model exists, which is the correct motivation. The "without client-side prediction" section quantifies the problem (100ms RTT = 100ms input delay), making the need for CSP concrete. The CSP mechanism is explained at the correct level: the client simulates immediately, the server processes the input with full game state (the key difference: the server knows about all other entities), and the server's result is authoritative. Server reconciliation is explained precisely: the client snaps to the server position on mismatch (a position correction, not a smooth interpolation). The entity interpolation section correctly explains why OTHER players use interpolation rather than prediction (the client cannot predict other players' inputs). The input sequence number detail explains how deterministic replay prevents position jitter, which is a specific game networking technique.
Game networking vocabulary:
Authoritative server: a server that owns the true game state and validates all client inputs.
Client-side prediction (CSP): simulating the local player's own movement before server confirmation.
Server reconciliation: correcting the client's predicted state when the server's authoritative result differs.
Entity interpolation: smoothing other entities' movements by interpolating between buffered server snapshots.
Deterministic replay: re-simulating unacknowledged inputs on top of a corrected server state.
Options C and D are accurate but lack the anti-cheat motivation and the entity interpolation rationale (why other players use interpolation, not prediction).
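The prediction and reconciliation loop described in this explanation can be sketched in a few lines of Python. This is a minimal one-dimensional illustration under stated assumptions, not any engine's actual implementation: `SPEED`, `Input`, `apply_input`, and `PredictingClient` are hypothetical names, and movement is assumed to be fully deterministic so that client and server compute the same result from the same input.

```python
from dataclasses import dataclass, field

SPEED = 5.0  # hypothetical: distance moved per input step


@dataclass
class Input:
    seq: int     # input sequence number, echoed back by the server
    move: float  # -1.0, 0.0 or 1.0 along a single axis


def apply_input(x: float, inp: Input) -> float:
    """Deterministic movement step shared by client and server."""
    return x + inp.move * SPEED


@dataclass
class PredictingClient:
    x: float = 0.0
    pending: list = field(default_factory=list)  # inputs the server has not acknowledged

    def local_input(self, inp: Input) -> None:
        # Client-side prediction: apply the input now instead of waiting one RTT.
        self.x = apply_input(self.x, inp)
        self.pending.append(inp)

    def on_server_state(self, server_x: float, last_acked_seq: int) -> None:
        # Server reconciliation: discard acknowledged inputs, snap to the
        # authoritative position, then replay the inputs still in flight.
        self.pending = [i for i in self.pending if i.seq > last_acked_seq]
        x = server_x
        for inp in self.pending:
            x = apply_input(x, inp)
        self.x = x


client = PredictingClient()
for seq in (1, 2, 3):
    client.local_input(Input(seq, 1.0))
predicted_x = client.x

# The server processed input 1 but moved the player only 3.0 units
# (for example, because of a collision the client could not see).
client.on_server_state(server_x=3.0, last_acked_seq=1)
reconciled_x = client.x
```

Replaying inputs 2 and 3 on top of the corrected position is what keeps the local player from visibly jumping back every time a server update arrives.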
2 / 5
The interviewer asks: "How does lag compensation work in a first-person shooter, and what is the hit-detection fairness trade-off?" Which answer is most precise?
Option B is strongest. The opening problem statement precisely quantifies the disconnect: the player sees the world as it was N ms ago, while the server is already N ms ahead. This is the correct technical framing. The mechanism section introduces the history buffer at tick rate (64Hz = 64 snapshots/second, so a 1-second history holds 64 frames), a specific implementation detail. The fairness trade-off is explained with a concrete scenario (a player with 150ms latency hits a victim who is already behind cover on their own screen) that makes the shooter-victim asymmetry tangible. The three mitigations are all used in production games: a maximum rewind cap (Valve uses 300ms in CS:GO), hitscan-only lag compensation (not projectile physics), and partial compensation (used in some competitive games). The partial compensation detail is the most nuanced: compensating 50-70% of the latency means shooter and victim share the latency penalty, which is the correct competitive fairness mechanism.
Lag compensation vocabulary:
Rewind-and-replay: rewinding the server world to a past state for hit detection.
Tick rate: the frequency at which the server simulates the game world (64Hz = 64 ticks per second).
Hitscan: instant-hit weapons whose hits are resolved at fire time (vs. projectile weapons with travel time).
Maximum rewind cap: a server-side limit on how far back lag compensation will rewind.
Partial compensation: reducing lag compensation to split the latency penalty between shooter and victim.
Options C and D are accurate but lack the quantified disconnect explanation and the partial compensation fairness rationale.
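The history buffer, rewind cap, and partial compensation described here can be illustrated with a small Python sketch. It is a simplified model under stated assumptions rather than any shipped game's code: positions are one-dimensional, the server picks the nearest stored snapshot instead of interpolating between two, and `SnapshotHistory`, `hitscan_hits`, and the `shooter_delay_ms` parameter (how far in the past the shooter's view is) are hypothetical names.

```python
from collections import deque

TICK_RATE = 64          # server ticks per second, as in the explanation
HISTORY_SECONDS = 1.0   # one second of history = 64 snapshots
MAX_REWIND_MS = 300.0   # maximum rewind cap (value quoted in the explanation)


class SnapshotHistory:
    """Ring buffer of (time_ms, {entity_id: x_position}) snapshots."""

    def __init__(self) -> None:
        self.snapshots = deque(maxlen=int(TICK_RATE * HISTORY_SECONDS))

    def record(self, time_ms: float, positions: dict) -> None:
        self.snapshots.append((time_ms, dict(positions)))

    def rewind(self, now_ms: float, shooter_delay_ms: float,
               compensation: float = 1.0) -> tuple:
        # compensation below 1.0 models partial compensation (e.g. 0.5 to 0.7).
        rewind_ms = min(shooter_delay_ms * compensation, MAX_REWIND_MS)
        target_time = now_ms - rewind_ms
        # Use the stored snapshot closest to the moment the shooter saw.
        return min(self.snapshots, key=lambda snap: abs(snap[0] - target_time))


def hitscan_hits(history: SnapshotHistory, now_ms: float, shooter_delay_ms: float,
                 target_id: str, aim_x: float, radius: float = 0.5,
                 compensation: float = 1.0) -> bool:
    """Test a hitscan shot against the rewound position of the target."""
    _, positions = history.rewind(now_ms, shooter_delay_ms, compensation)
    return abs(positions[target_id] - aim_x) <= radius


# A victim moving one unit per tick for one second of server time.
history = SnapshotHistory()
for tick in range(64):
    history.record(tick * 1000 / TICK_RATE, {"victim": float(tick)})
now_ms = 63 * 1000 / TICK_RATE

# The shooter, 150 ms behind, aimed where the victim was at tick 53.
hit_with_rewind = hitscan_hits(history, now_ms, 150, "victim", aim_x=53.0)
hit_without_rewind = hitscan_hits(history, now_ms, 0, "victim", aim_x=53.0)
```

With rewind the shot lands because the server tests it against the world the shooter actually saw; without rewind the same shot misses by ten units. Passing `compensation=0.5` rewinds only 75 ms, which is the shared-penalty behaviour the explanation calls partial compensation.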
3 / 5
The interviewer asks: "Design a matchmaking system for a competitive online game. What factors determine match quality?" Which answer is most architectural?
Option B is strongest. The Glicko-2 section explains the key advantage over Elo at the mathematical level: Rating Deviation (RD) captures uncertainty, enabling faster calibration for new players (high RD → faster rating change) and stable ratings for veterans (low RD → slower change). This is the correct reason Glicko-2 is preferred. TrueSkill is correctly introduced for team games. The match quality function is a weighted composite with four named factors, which is how production matchmaking systems work (they do not rely on skill difference alone). The expanding search is parameterised with specific values (±25 → ±50 → ±100 → ±200, widening every 30 seconds) that show operational knowledge. The anti-abuse section introduces behaviour score as a matchmaking input (not just post-match reporting), which is a specific systems detail that senior interviewers at Riot, Valve, or Activision will recognise.
Matchmaking vocabulary:
Glicko-2: an extension of Elo that adds a rating deviation measuring uncertainty, plus a volatility term.
Rating Deviation (RD): the uncertainty in a player's skill rating; it decreases as more games are played.
TrueSkill: Microsoft's Bayesian skill rating system, designed to handle team games.
Expanding search: widening the acceptable skill range as queue time increases.
Behaviour score: a metric measuring player conduct (AFK rate, toxicity, report rate) used in matchmaking.
Options C and D are accurate but lack the RD mechanism explanation and the expanding search parameterisation.
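The expanding search schedule quoted in this explanation (±25 → ±50 → ±100 → ±200, widening every 30 seconds) can be written as a short Python function. This is a sketch under assumptions: `skill_window` and `can_match` are hypothetical names, and requiring the rating gap to fit inside both players' windows is one possible policy, not something the explanation specifies.

```python
def skill_window(wait_seconds: float, base: int = 25,
                 step_seconds: int = 30, max_window: int = 200) -> int:
    """Acceptable rating gap after waiting in queue for wait_seconds."""
    steps = int(wait_seconds // step_seconds)
    # The window doubles every step and is capped at max_window.
    return min(base * 2 ** steps, max_window)


def can_match(rating_a: float, wait_a: float,
              rating_b: float, wait_b: float) -> bool:
    # Assumed policy: the gap must be acceptable to BOTH players,
    # so the narrower of the two windows decides.
    allowed = min(skill_window(wait_a), skill_window(wait_b))
    return abs(rating_a - rating_b) <= allowed


windows = [skill_window(t) for t in (0, 29, 30, 60, 90, 600)]
```

Using the narrower window protects a player who has just joined the queue from being pulled into a loose match by someone who has waited a long time. A production system would combine this gap with the other quality factors rather than use it alone.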
4 / 5
The interviewer asks: "Why do games use UDP instead of TCP, and when would you consider WebSockets or QUIC?" Which answer is most precise?
Option B is strongest. The HoL blocking section provides a concrete quantification: at 60Hz the game sends a packet roughly every 16.7ms, so with an RTT of 50-150ms a single retransmission delays 3-9 newer packets that contain more current data. This transforms an abstract concept into a measurable game problem. The "selective reliability" concept (retransmitting only important messages while dropping stale position updates) is the correct game networking insight that explains WHY UDP is used, not just that it is faster. The congestion control section explains the mechanism: on packet loss TCP shrinks its congestion window (classic Reno halves it, and a timeout restarts slow start from a minimal window), which cuts the send rate and directly causes missed game ticks. The WebSocket section correctly identifies browser games as the primary use case, and adds the hybrid architecture (WebSocket for chat alongside UDP for game state). The QUIC section correctly identifies multiplexed streams as the key feature for games: multiple independent data channels that do not block each other.
Game networking vocabulary:
Head-of-line blocking: TCP stalling newer packets while waiting for a lost packet to be retransmitted.
Selective reliability: retransmitting only important messages while dropping stale positional updates.
TCP congestion control: TCP's reaction to packet loss, which shrinks the congestion window and so reduces the send rate. Slow start is the ramp-up phase that grows the window again from a small initial value.
QUIC: a UDP-based transport with stream multiplexing and TLS 1.3.
Tick rate: the frequency of game server simulation steps and state broadcasts.
Options C and D are accurate but lack the HoL blocking quantification and the selective reliability concept.
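Both the head-of-line arithmetic and the selective reliability idea from this explanation fit in a short Python sketch. It models only the bookkeeping, with no real sockets, and `packets_stalled` and `SelectiveSender` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


def packets_stalled(rtt_ms: int, tick_hz: int = 60) -> int:
    """Newer packets sent while one lost packet is retransmitted (about one RTT)."""
    return rtt_ms * tick_hz // 1000


@dataclass
class SelectiveSender:
    """Tracks only reliable messages; unreliable ones are fire-and-forget."""
    next_seq: int = 0
    unacked: dict = field(default_factory=dict)  # seq -> payload, reliable only

    def send(self, payload: str, reliable: bool) -> int:
        seq = self.next_seq
        self.next_seq += 1
        if reliable:
            # Keep a copy until the peer acknowledges it.
            self.unacked[seq] = payload
        # A real sender would write the datagram to a UDP socket here.
        return seq

    def on_ack(self, seq: int) -> None:
        self.unacked.pop(seq, None)

    def to_retransmit(self) -> list:
        # Stale position updates never appear here, so they are never resent.
        return sorted(self.unacked.items())


stalled_low = packets_stalled(50)
stalled_high = packets_stalled(150)

sender = SelectiveSender()
sender.send("position x=10", reliable=False)
kill_seq = sender.send("player 7 eliminated", reliable=True)
sender.send("position x=11", reliable=False)
resend_queue = sender.to_retransmit()
```

The integer arithmetic reproduces the 3 to 9 packet range from the explanation. The resend queue holds only the elimination event: if either position update is lost, the next tick's update supersedes it, so resending it would waste bandwidth on data that is already out of date.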
5 / 5
The interviewer asks: "How do you design session management for a multiplayer game, including connection recovery and graceful disconnection?" Which answer is most complete?
Option B is strongest. The five-phase structure maps to the complete lifecycle of a game session, which is the correct architectural framing. The session creation phase introduces Agones on Kubernetes, a widely adopted open-source choice for game server orchestration, which shows production knowledge. JWT/HMAC session token validation is the correct security mechanism (it prevents session token forgery). The state persistence phase explains WHY Redis is used (periodic state flushing enables reconnect without a full game restart). The reconnect phase is the most technically detailed: the reconnect timer (30-60 seconds, specific to competitive games), the AI takeover mechanism, and the full state sync after reconnect. The graceful disconnection phase includes the complete cleanup chain: Kafka event → PostgreSQL write → Elo update → token invalidation → pool return. This shows understanding of the event-driven architecture around game sessions.
Game session vocabulary:
Agones: an open-source game server management platform built on Kubernetes.
Session token: a signed credential that authorises a player to join a specific game server instance.
Reconnect window: the time a server holds a disconnected player's state before removing them.
Full state sync: sending the complete current game state to a reconnecting client to catch up on missed snapshots.
Hot-standby instance: a pre-allocated game server ready to receive players immediately.
Options C and D list the phases correctly but lack the Agones detail and the reconnect state sync mechanism.
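The HMAC-signed session token and the reconnect window mentioned in this explanation can be sketched with Python's standard library. This is an illustration under assumptions, not a production design: the token layout (`player|server|expiry|signature`), the `SECRET` value, and the function names are hypothetical, and player and server IDs are assumed not to contain the `|` separator. A real deployment would more likely use a JWT library and a secret loaded from a secrets store.

```python
import hashlib
import hmac

SECRET = b"demo-secret"   # hypothetical; load from a secrets store in practice
RECONNECT_WINDOW_S = 60   # upper end of the 30-60 second range in the explanation


def _signature(payload: str, secret: bytes) -> str:
    return hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()


def sign_token(player_id: str, server_id: str, expires_at: int,
               secret: bytes = SECRET) -> str:
    payload = f"{player_id}|{server_id}|{expires_at}"
    return f"{payload}|{_signature(payload, secret)}"


def verify_token(token: str, server_id: str, now: int,
                 secret: bytes = SECRET) -> bool:
    try:
        player_id, token_server, expires_at, sig = token.split("|")
        expires = int(expires_at)
    except ValueError:
        return False  # malformed token
    expected = _signature(f"{player_id}|{token_server}|{expires_at}", secret)
    # compare_digest avoids leaking information through comparison timing.
    return (hmac.compare_digest(sig.encode(), expected.encode())
            and token_server == server_id
            and now < expires)


def can_reconnect(disconnected_at: float, now: float,
                  window_s: float = RECONNECT_WINDOW_S) -> bool:
    """True while the server should still hold the player's slot."""
    return now - disconnected_at <= window_s


token = sign_token("p1", "gs-7", expires_at=1000)
```

Binding the server ID into the signed payload means a token issued for one game server instance is rejected by every other instance, which matches the vocabulary definition of a session token.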