Vocabulary for Load Balancing and Traffic Routing

Learn the essential English vocabulary for discussing load balancing algorithms, health checks, and traffic routing strategies.

Load balancing conversations get imprecise fast — “it’s routing traffic weirdly” doesn’t tell a teammate whether the problem is a health check, a routing algorithm, or session affinity. This vocabulary shows up constantly in incident calls, capacity planning discussions, and architecture reviews, so being specific about which mechanism is involved matters for getting to a fix quickly.

Foundational Concepts

1. Load balancer

A component that distributes incoming requests across multiple backend servers, aiming to spread load evenly and avoid overwhelming any single instance.

Usage: “The load balancer is sending disproportionate traffic to one instance — that’s usually a sign its health check or routing algorithm needs a second look.”

2. Health check

A periodic probe the load balancer sends to each backend to determine whether it’s healthy enough to receive traffic. Backends that fail the probe are removed from rotation automatically.

Usage: “This instance kept receiving traffic during its slow restart because the health check endpoint returned success before the app had actually finished initializing.”
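
To make the in-rotation and out-of-rotation logic concrete, here is a minimal sketch of the bookkeeping a load balancer might do per backend. The class name HealthTracker and both thresholds are assumptions for illustration, not any particular product's API; real load balancers expose similar settings under their own names.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    """Tracks probe results for one backend (hypothetical helper)."""
    healthy_threshold: int = 2    # consecutive successes needed to enter rotation
    unhealthy_threshold: int = 3  # consecutive failures needed to leave rotation
    healthy: bool = False         # a new backend starts out of rotation
    _successes: int = 0
    _failures: int = 0

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result and return the current health state."""
        if probe_ok:
            self._successes += 1
            self._failures = 0
            if self._successes >= self.healthy_threshold:
                self.healthy = True
        else:
            self._failures += 1
            self._successes = 0
            if self._failures >= self.unhealthy_threshold:
                self.healthy = False
        return self.healthy


tracker = HealthTracker()
# Two successes bring the backend in; three failures in a row take it out.
states = [tracker.record(ok) for ok in (True, True, False, False, False, True)]
```

Requiring several consecutive results in each direction keeps a single slow probe from flapping a backend in and out of rotation. It does not help if the endpoint itself reports success too early, which is the situation in the usage example.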

3. Backend (upstream)

An individual server or service instance that a load balancer forwards requests to, sometimes called an upstream in configuration contexts like Nginx.

Usage: “We added a new backend to the pool, but forgot to update the health check path, so the load balancer never marked it as ready.”

4. Round robin

A load balancing algorithm that distributes requests sequentially and evenly across all available backends, regardless of each backend’s current load.

Usage: “Round robin works fine here since all our instances are identical in capacity, but it wouldn’t handle a mixed fleet with different instance sizes well.”
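
As a rough illustration, round robin can be sketched with a repeating iterator over the pool. The backend names are made up for the example.

```python
from itertools import cycle

backends = ["app-1", "app-2", "app-3"]  # hypothetical backend names
next_backend = cycle(backends)          # repeats the pool in order, forever

# Each request takes the next backend in sequence, ignoring current load.
assignments = [next(next_backend) for _ in range(7)]
```

After seven requests, app-1 has received three and the others two each. The algorithm never looks at how busy a backend is, which is why it fits identical instances best.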

5. Least connections

A load balancing algorithm that routes each new request to the backend currently handling the fewest active connections, useful when request processing time varies significantly.

Usage: “We switched to least connections because some of our requests take much longer than others, and round robin was sending new requests to already-busy instances.”
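
A minimal sketch of the idea, assuming the balancer can count active connections per backend. The class and method names are illustrative only.

```python
class LeastConnectionsBalancer:
    """Picks the backend with the fewest active connections (illustrative)."""

    def __init__(self, backends):
        self.active = {name: 0 for name in backends}

    def acquire(self):
        # Ties go to the backend that was listed first.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Call this when the request or connection finishes.
        self.active[backend] -= 1


lb = LeastConnectionsBalancer(["app-1", "app-2"])
first = lb.acquire()    # both idle, so the first listed backend wins
second = lb.acquire()   # app-1 is busy, so app-2 is chosen
lb.release(second)      # app-2 finishes quickly
third = lb.acquire()    # app-1 is still busy, so app-2 is chosen again
```

Round robin would have sent the third request to app-1 even though it was still working on the first one.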

Routing and Distribution

6. Session affinity (sticky sessions)

A configuration where a load balancer routes all requests from the same client to the same backend instance, typically to preserve in-memory session state.

Usage: “We rely on sticky sessions here since session data lives in each instance’s memory — without affinity, a user could get bounced to an instance that doesn’t have their session.”
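
One common way to get affinity is to hash a stable client identifier onto the pool. The sketch below assumes a string client ID such as a session cookie value; the function name and backend names are hypothetical, and cookie-based affinity is an equally common alternative.

```python
import hashlib


def pick_backend(client_id: str, backends: list[str]) -> str:
    """Map a client to a backend deterministically by hashing its ID."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


pool = ["app-1", "app-2", "app-3"]  # hypothetical backend names
chosen = pick_backend("user-42", pool)
```

The same client ID always lands on the same backend as long as the pool is unchanged. With this simple modulo approach, adding or removing a backend remaps many clients, so in-memory sessions can still be lost during scaling events.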

7. Failover

The automatic redirection of traffic away from a failed or unhealthy component to a working one, ideally with minimal or no disruption to the end user.

Usage: “Failover kicked in within a few seconds of the primary instance failing its health check, and traffic shifted cleanly to the standby without dropping requests.”
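
At its simplest, failover is a priority list combined with health state. The sketch below assumes health results are already available as a dictionary; the names are illustrative.

```python
def choose_target(priority_order, is_healthy):
    """Return the first healthy target in priority order, or None."""
    for name in priority_order:
        if is_healthy.get(name, False):
            return name
    return None


health = {"primary": False, "standby": True}  # primary just failed its check
target = choose_target(["primary", "standby"], health)
```

Here the standby receives traffic because the primary is marked unhealthy. How quickly that happens in practice depends on the health check interval and failure threshold.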

8. Circuit breaker

A pattern that stops sending requests to a failing dependency after a threshold of failures, allowing it time to recover instead of continuing to overwhelm it with traffic.

Usage: “The circuit breaker tripped after this dependency’s error rate crossed our threshold, which actually protected the rest of the system from cascading failures.”
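
The pattern can be sketched as a small state holder that callers consult before each request. This version counts consecutive failures and allows a trial request once a timeout has passed, a state often called half-open. The thresholds and the injectable clock are assumptions made to keep the example short and testable.

```python
import time


class CircuitBreaker:
    """Minimal circuit breaker sketch based on consecutive failures."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout  # seconds to wait before a trial request
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow_request(self):
        if self.opened_at is None:
            return True
        # Once the timeout has elapsed, let a trial request through.
        return self.clock() - self.opened_at >= self.reset_timeout

    def record_success(self):
        self.failures = 0
        self.opened_at = None  # close the circuit again

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()  # open, or re-open after a failed trial
```

A caller checks allow_request() before contacting the dependency, then reports the outcome with record_success() or record_failure(). Production libraries often trip on an error rate over a window rather than a consecutive count, as in the usage example above.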

9. Rate limiting

The practice of capping the number of requests a client can make in a given time window, protecting a service from being overwhelmed by a single source.

Usage: “We’re rate limiting this endpoint per API key now, after one misbehaving client’s retry loop nearly saturated the whole cluster.”
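
A token bucket is one common way to implement a rate limit. The sketch below is a minimal version with an injectable clock; the class name and parameters are illustrative, and a per-key limiter would keep one bucket per API key.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `refill_per_second`."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated_at = clock()

    def allow(self):
        now = self.clock()
        elapsed = now - self.updated_at
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # over the limit; callers typically respond with HTTP 429
```

With a capacity of 2 and a refill rate of 1 per second, a client can send two requests back to back and then one per second after that.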

10. Traffic shaping

The deliberate control of how much and what kind of traffic flows to a system, such as gradually ramping up load during a rollout rather than sending it all at once.

Usage: “We’re using traffic shaping to send only 5% of requests to the new version initially, ramping up gradually as we confirm it’s behaving correctly.”

Deployment Patterns

11. Canary deployment

A rollout strategy where a new version receives a small percentage of production traffic first, allowing issues to be caught before a full rollout.

Usage: “We caught this regression during the canary stage, when it was affecting only 5% of traffic, well before it reached everyone.”
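
One way to choose which requests go to the canary is to hash a stable user ID into one of 100 buckets. This is a sketch under that assumption; the function name is hypothetical, and many platforms do this split at the load balancer or service mesh instead.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Return True if this user falls inside the canary percentage."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket, 0 to 99
    return bucket < percent


users = [f"user-{i}" for i in range(10000)]  # made-up user IDs
canary_share = sum(in_canary(u, 5) for u in users) / len(users)
```

Because each user's bucket is fixed, a user who sees the canary at 5% keeps seeing it when the rollout is raised to 10%, which keeps their experience consistent during the ramp.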

12. Blue-green deployment

A deployment strategy that maintains two identical environments (blue and green) and switches traffic entirely from one to the other, so a rollback only requires switching traffic back to the previous environment.

Usage: “With blue-green deployment, rolling back this bad release was just a matter of switching the load balancer back to the previous environment — no redeploy required.”

13. Weighted routing

A traffic distribution strategy where different backends or versions receive different proportions of traffic based on assigned weights, rather than an even split.

Usage: “We’re using weighted routing to send 90% of traffic to the stable version and 10% to the experimental one while we validate it.”
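
The 90/10 split from the usage example can be sketched with a weighted random choice. The version names and weights mirror that example; the function name is illustrative.

```python
import random

weights = {"stable": 90, "experimental": 10}  # relative weights, not required to sum to 100


def pick_version(rng=random):
    """Choose a version at random in proportion to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


rng = random.Random(0)  # seeded so the example is repeatable
picks = [pick_version(rng) for _ in range(10000)]
experimental_share = picks.count("experimental") / len(picks)
```

Over many requests the experimental share settles near 10%. Any individual request is still random, so the same user may see both versions unless this is combined with session affinity.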

14. Draining (connection draining)

The process of allowing existing connections to a backend to finish naturally before removing it from rotation, rather than terminating in-flight requests abruptly.

Usage: “We enabled connection draining before taking this instance out of rotation, so in-flight requests complete instead of getting dropped mid-response.”
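
Draining comes down to two rules: stop assigning new requests, and wait for the in-flight count to reach zero. A minimal sketch, with illustrative names:

```python
class DrainingBackend:
    """Tracks in-flight requests so a backend can be removed safely."""

    def __init__(self, name):
        self.name = name
        self.in_flight = 0
        self.draining = False

    def try_start_request(self):
        if self.draining:
            return False  # the balancer should pick another backend
        self.in_flight += 1
        return True

    def finish_request(self):
        self.in_flight -= 1

    def start_draining(self):
        self.draining = True

    def safe_to_remove(self):
        return self.draining and self.in_flight == 0
```

Real load balancers usually add a drain timeout as well, so one stuck connection cannot block removal indefinitely.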

15. Latency-based routing

A routing strategy that directs traffic to the backend or region expected to serve the request with the lowest latency, often used across geographically distributed deployments.

Usage: “Latency-based routing sends European users to our Frankfurt region automatically, without needing any client-side logic to pick the closest endpoint.”
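
Conceptually, the router keeps a latency estimate per region and picks the lowest. The sketch below uses made-up measurements for a European client and a simple moving average for smoothing; the function names and the smoothing factor are assumptions.

```python
def update_latency(previous_ms, sample_ms, alpha=0.2):
    """Blend a new sample into the running estimate (exponential moving average)."""
    return (1 - alpha) * previous_ms + alpha * sample_ms


def pick_region(latency_ms):
    """Return the region with the lowest estimated latency."""
    return min(latency_ms, key=latency_ms.get)


# Made-up latency estimates in milliseconds, as seen from a European client.
measured = {"frankfurt": 18.0, "virginia": 95.0, "singapore": 170.0}
best = pick_region(measured)
```

Smoothing the estimates keeps one slow sample from moving traffic between regions. Managed DNS services that offer latency-based routing typically maintain these measurements themselves.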

Key Takeaways

  • Name the specific mechanism (health check, routing algorithm, session affinity) rather than describing traffic behavior vaguely as “weird” or “broken.”
  • Choose between round robin and least connections deliberately, based on whether processing time varies significantly from one request to the next.
  • Use canary and blue-green deployment vocabulary precisely when proposing a rollout strategy, since they represent genuinely different risk and rollback tradeoffs.
  • Mention connection draining explicitly when discussing how instances are removed from rotation, to avoid abruptly dropping in-flight requests.
  • Reach for circuit breakers and rate limiting as distinct, complementary protective patterns — one protects a failing dependency, the other protects against an overwhelming client.