5 exercises — Practice service mesh vocabulary in English: data plane, control plane, sidecar proxy, mTLS, traffic policies, Istio, Envoy, circuit breaking, and traffic shifting.
Core service mesh vocabulary clusters
Architecture: data plane (sidecar proxies), control plane (istiod/Pilot), sidecar injection, Envoy proxy
Observability: telemetry (metrics, logs, traces), Kiali, Jaeger integration, service graph
1 / 5
A platform engineer explains service mesh architecture to a developer team adopting Istio: "A service mesh has two planes. The data plane is the sidecar proxies — Envoy containers injected into each pod. All traffic to and from your service goes through Envoy. Envoy handles retries, circuit breaking, mTLS, tracing. Your application knows nothing about any of this — it talks to localhost. The control plane — istiod in Istio — distributes configuration to all those Envoy proxies. You write a VirtualService CR in Kubernetes, istiod translates it to Envoy configuration, and pushes it to all proxies. Configuration change, no restarts needed." What is the relationship between the data plane and control plane in a service mesh?
Data plane: the Envoy sidecar proxies running alongside each application container. Every packet flows: [app] → [Envoy sidecar] → [network] → [Envoy sidecar] → [app]. Envoy performs L7 routing, load balancing, retries, timeouts, circuit breaking, mTLS origination and termination, and telemetry, all transparent to the app.
Control plane (istiod): historically three components: Pilot (service discovery, traffic config), Citadel (certificate authority for mTLS), and Galley (config validation). Since Istio 1.5 these are merged into the single istiod binary. Istiod uses the xDS API (Envoy's discovery service API) to push configuration to the proxies.
Sidecar injection: a Kubernetes mutating admission webhook automatically adds the Envoy container and an init container (which sets up the iptables rules that intercept traffic) to pods in labeled namespaces.
Envoy proxy vocabulary:
Listener: the Envoy component that binds to a port and receives connections.
Filter chain: the processing pipeline for each connection (HTTP connection manager, TLS inspector, etc.).
Cluster: Envoy's representation of an upstream service.
Endpoint: an individual backend (pod IP + port) within a cluster.
Route: the rules that map a URL to a cluster.
In conversation: 'The beauty of the sidecar model: you get observability, security, and traffic control for every service without changing a line of application code. The mesh handles it in the proxy.'
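The Envoy vocabulary above (listener, route, cluster, endpoint) can be illustrated with a toy model. This is only a sketch of how the terms relate, not Envoy's actual data structures or API: the class names, the port number, the pod IPs, and the first-match prefix routing are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Dict, Iterator, List, Optional, Tuple


@dataclass
class Cluster:
    """Toy version of an Envoy cluster: a named group of endpoints."""
    name: str
    endpoints: List[str]  # each endpoint is "pod-ip:port"; must be non-empty
    _rr: Iterator[str] = field(init=False, repr=False, compare=False)

    def __post_init__(self) -> None:
        # Round-robin load balancing over the cluster's endpoints.
        self._rr = cycle(self.endpoints)

    def pick_endpoint(self) -> str:
        return next(self._rr)


@dataclass
class Listener:
    """Toy listener: bound to a port, holding routes (URL prefix -> cluster)."""
    port: int
    routes: List[Tuple[str, str]]  # (URL prefix, cluster name); first match wins

    def match(self, path: str) -> Optional[str]:
        for prefix, cluster_name in self.routes:
            if path.startswith(prefix):
                return cluster_name
        return None


def resolve(listener: Listener, clusters: Dict[str, Cluster], path: str) -> Optional[str]:
    """Follow listener -> route -> cluster -> endpoint for one request path."""
    cluster_name = listener.match(path)
    if cluster_name is None:
        return None  # no route matched
    return clusters[cluster_name].pick_endpoint()


# Hypothetical configuration, standing in for what a control plane would push.
listener = Listener(port=8080, routes=[("/api/orders", "orders"), ("/", "frontend")])
clusters = {
    "orders": Cluster("orders", ["10.0.0.1:8080", "10.0.0.2:8080"]),
    "frontend": Cluster("frontend", ["10.0.1.1:8080"]),
}
```

Calling resolve(listener, clusters, "/api/orders/42") repeatedly alternates between the two hypothetical orders endpoints, which is the round-robin behavior the toy cluster implements.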
2 / 5
A security engineer explains zero-trust networking in a Kubernetes cluster: "By default in Kubernetes, any pod can talk to any other pod. That's a flat, implicitly trusted network. With Istio's mTLS, every connection between services is mutually authenticated and encrypted. The client proxy presents its certificate — its SPIFFE identity, derived from the Kubernetes service account. The server proxy validates it. Both sides verify each other. This happens in the data plane, transparent to the app. We also apply AuthorizationPolicy: 'only the payments service can call the orders service on path /api/orders'. Everything else is denied." What is mTLS and how does it differ from standard TLS?
TLS (one-way): the client verifies the server's certificate, so only the server proves its identity. Any client identity is established at the application level (HTTP headers, JWT). This is the standard for HTTPS websites.
mTLS (mutual TLS): both parties present certificates. The server verifies the client's identity cryptographically, not just at the application level.
SPIFFE (Secure Production Identity Framework For Everyone): a standard for service identity. SPIFFE ID format in Istio: spiffe://trust-domain/ns/namespace/sa/service-account. Istio's certificate authority (Citadel, now part of istiod) issues SVID (SPIFFE Verifiable Identity Document) certificates for each service account.
Benefits:
Cryptographic service identity: no password or token; the certificate is the identity.
Encryption in transit: all inter-service traffic is encrypted.
Zero-trust micro-segmentation: combined with AuthorizationPolicy, you can enforce "service A may call service B" at the proxy level.
Istio PeerAuthentication: configures the mTLS mode per namespace or workload.
STRICT: only mTLS connections are accepted.
PERMISSIVE: both plaintext and mTLS connections are accepted (migration mode).
AuthorizationPolicy vocabulary:
Principal: the identity of the caller (SPIFFE ID).
Source: where the request comes from (IP range, namespace, service account).
Operation: what the request does (HTTP method, path, host).
Condition: additional attributes such as request header values.
In conversation: 'Once you enable STRICT mTLS across the cluster, you can write AuthorizationPolicies with confidence: you know exactly who is calling you, cryptographically.'
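To make the SPIFFE ID format and the deny-by-default idea behind AuthorizationPolicy concrete, here is a minimal sketch. It is not Istio's policy engine: SpiffeId, AllowRule, parse_spiffe_id, and is_allowed are hypothetical names, and the matching logic (exact namespace and service account, method list, path prefix) is a simplification assumed for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class SpiffeId:
    trust_domain: str
    namespace: str
    service_account: str


def parse_spiffe_id(uri: str) -> SpiffeId:
    """Parse spiffe://trust-domain/ns/namespace/sa/service-account."""
    prefix = "spiffe://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a SPIFFE ID: {uri!r}")
    parts = uri[len(prefix):].split("/")
    # Expected shape: [trust-domain, "ns", namespace, "sa", service-account]
    if len(parts) != 5 or parts[1] != "ns" or parts[3] != "sa" or not all(parts):
        raise ValueError(f"unexpected SPIFFE ID layout: {uri!r}")
    return SpiffeId(parts[0], parts[2], parts[4])


@dataclass(frozen=True)
class AllowRule:
    """Hypothetical allow rule: who (principal) may do what (operation)."""
    namespace: str
    service_account: str
    methods: Tuple[str, ...]
    path_prefix: str


def is_allowed(caller: SpiffeId, method: str, path: str, rules: Iterable[AllowRule]) -> bool:
    # Deny by default: a request passes only if at least one rule matches.
    return any(
        rule.namespace == caller.namespace
        and rule.service_account == caller.service_account
        and method in rule.methods
        and path.startswith(rule.path_prefix)
        for rule in rules
    )


# "Only the payments service can call the orders service on path /api/orders."
# The namespace "shop" and trust domain "cluster.local" are example values.
orders_rules = [AllowRule("shop", "payments", ("GET", "POST"), "/api/orders")]
payments = parse_spiffe_id("spiffe://cluster.local/ns/shop/sa/payments")
frontend = parse_spiffe_id("spiffe://cluster.local/ns/shop/sa/frontend")
```

With these rules, is_allowed(payments, "GET", "/api/orders/7", orders_rules) is true, while the same call from frontend is denied because no rule names its identity.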
3 / 5
A senior engineer explains an Istio traffic shifting deployment strategy: "We want to roll out v2 of the product service without a full deployment switch. Using Istio VirtualService, we route 90% of traffic to v1 and 10% to v2. We monitor error rates and latency in Kiali. After an hour with no regressions, we shift to 50/50, then 100% v2. This is a canary release implemented entirely in the service mesh — no changes to Kubernetes Deployments, no additional load balancers. The DestinationRule defines v1 and v2 as subsets (label selector: version=v1 vs version=v2). The VirtualService controls the weight distribution." In Istio, what does a VirtualService do and how does it work with a DestinationRule?
VirtualService: defines how requests to a service are routed. It attaches to the service's hostname.
Match conditions: URI prefix, headers, source labels, method.
Actions: weighted routing (canary), redirect, rewrite, retry policy, timeout, fault injection.
DestinationRule: defines subsets (groups of pods selected by label) and applies policies per subset. Policies include the load balancing algorithm (round robin, least connections, consistent hash), circuit breaker (outlier detection), and TLS settings.
Together: VirtualService says "send 10% of traffic to subset v2"; DestinationRule says "v2 is the pods with label version=v2; use consistent hash load balancing".
Traffic management vocabulary:
Traffic shifting: weight-based routing between versions.
Header-based routing: route users with a specific header (e.g., X-Canary: true) to v2.
Mirroring: copy traffic to a second version and discard its responses, so v2 is tested with real traffic and no user impact.
Fault injection: deliberately inject delays or errors to test resilience.
Retry policy: automatically retry on 5xx; configure the number of attempts and the retry-on conditions.
Circuit breaker (outlier detection): eject unhealthy endpoints from the load balancing pool after too many 5xx errors.
Istio Gateway vocabulary:
Gateway: configures Envoy at the edge (ingress/egress), managing ports, protocol, and TLS. It is a different resource from Kubernetes Ingress.
In conversation: 'With VirtualService weights, a canary rollout is a one-line YAML change. No new Deployments, no load balancer reconfig. The mesh handles it.'
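The division of labor between the two resources can be sketched in a few lines: subsets play the DestinationRule role (name to label selector) and weights play the VirtualService role (subset to percentage). This is a simulation of weight-based routing under assumed names (SUBSETS, CANARY_WEIGHTS, pick_subset, simulate), not how Envoy implements it, and the rule that weights must sum to 100 is a choice of this sketch that matches the percentages in the scenario.

```python
import random
from typing import Dict, List, Tuple

# DestinationRule role: subset name -> pod label selector.
SUBSETS: Dict[str, Dict[str, str]] = {
    "v1": {"version": "v1"},
    "v2": {"version": "v2"},
}

# VirtualService role: (subset, weight in percent). First canary step: 90/10.
CANARY_WEIGHTS: List[Tuple[str, int]] = [("v1", 90), ("v2", 10)]


def validate_weights(weights: List[Tuple[str, int]], subsets: Dict[str, Dict[str, str]]) -> None:
    """Check that every weighted subset is defined and the weights total 100."""
    for subset, _ in weights:
        if subset not in subsets:
            raise ValueError(f"undefined subset: {subset}")
    total = sum(weight for _, weight in weights)
    if total != 100:
        raise ValueError(f"weights sum to {total}, expected 100")


def pick_subset(weights: List[Tuple[str, int]], roll: int) -> str:
    """Map a roll in [0, 100) to a subset using cumulative weight ranges."""
    upper = 0
    for subset, weight in weights:
        upper += weight
        if roll < upper:
            return subset
    raise ValueError(f"roll out of range: {roll}")


def simulate(weights: List[Tuple[str, int]], requests: int, seed: int = 0) -> Dict[str, int]:
    """Route a number of simulated requests and count hits per subset."""
    validate_weights(weights, SUBSETS)
    rng = random.Random(seed)
    counts = {subset: 0 for subset, _ in weights}
    for _ in range(requests):
        counts[pick_subset(weights, rng.randrange(100))] += 1
    return counts
```

Shifting the canary from 90/10 to 50/50 means changing only the weights list; the subset definitions stay untouched, which mirrors why the rollout needs no Deployment changes.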
4 / 5
An SRE explains how they use Istio for resilience during an incident review: "The payment gateway was intermittently returning 503s — about 5% of requests. Without the mesh, those errors hit our checkout service directly, causing checkout failures. With Istio, we have a retry policy on the VirtualService: retry 503s up to 3 times with a 25ms delay. The circuit breaker in the DestinationRule kicks in if more than 10% of requests from a single proxy to payment fail in 1 second — it ejects that endpoint from the pool for 30 seconds. The checkout service saw near-zero errors during the payment degradation because the mesh absorbed the failures." In a service mesh, what does outlier detection (circuit breaking) do?
Outlier detection (Envoy circuit breaker): Istio's implementation of the circuit breaker pattern at the load balancing level. Configured in DestinationRule:
consecutiveGatewayErrors: 5 (eject an endpoint after 5 consecutive gateway errors: 502, 503, or 504).
interval: 30s (time between ejection analysis sweeps).
baseEjectionTime: 30s (minimum ejection duration).
maxEjectionPercent: 50 (never eject more than 50% of endpoints, which prevents complete unavailability).
How it differs from an application circuit breaker (Resilience4j, Hystrix): Envoy outlier detection works per endpoint within a cluster (individual pod IP), not per service. It is fine-grained.
Retry policy vs circuit breaker:
Retry: retries a failed request against any available endpoint.
Outlier detection: removes a specific bad endpoint from rotation.
Both are needed: retries for transient failures, outlier detection for persistently failing instances.
Istio observability vocabulary:
Kiali: service graph UI for Istio. Shows traffic flow, health, and mTLS status.
Prometheus integration: Envoy emits metrics (request rate, latency, error rate) that Prometheus scrapes.
Distributed tracing: Envoy generates and attaches B3/W3C trace headers, but applications must forward them on their outgoing calls for end-to-end traces. Integrates with Jaeger and Zipkin.
Access log: Envoy can log every request with rich fields (upstream cluster, response code, bytes, duration).
In conversation: 'The mesh circuit breaker saved us during the payment incident. Without it, the 5% error rate would have cascaded into 100% checkout failures as all threads blocked on the failing endpoint.'
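The per-endpoint ejection logic described above can be sketched as a small simulation. This is a deliberately simplified model, not Envoy's algorithm: the OutlierDetector class and its method names are hypothetical, ejection happens immediately on the Nth consecutive gateway error, the ejection time is fixed rather than growing with repeated ejections, and the analysis interval is not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List

GATEWAY_ERRORS = (502, 503, 504)


@dataclass
class OutlierDetector:
    """Toy per-endpoint outlier detection with an ejection cap."""
    endpoints: List[str]
    consecutive_gateway_errors: int = 5
    base_ejection_time: float = 30.0  # seconds
    max_ejection_percent: int = 50
    _streak: Dict[str, int] = field(default_factory=dict)
    _ejected_until: Dict[str, float] = field(default_factory=dict)

    def healthy(self, now: float) -> List[str]:
        """Endpoints currently in the load balancing pool."""
        return [e for e in self.endpoints if self._ejected_until.get(e, 0.0) <= now]

    def record(self, endpoint: str, status: int, now: float) -> None:
        """Record one response from an endpoint and eject it if warranted."""
        if endpoint not in self.healthy(now):
            return  # already ejected; nothing to do
        if status not in GATEWAY_ERRORS:
            self._streak[endpoint] = 0  # any other response resets the streak
            return
        self._streak[endpoint] = self._streak.get(endpoint, 0) + 1
        if self._streak[endpoint] < self.consecutive_gateway_errors:
            return
        ejected = len(self.endpoints) - len(self.healthy(now))
        # Respect the cap: never eject more than max_ejection_percent of the pool.
        if (ejected + 1) * 100 <= self.max_ejection_percent * len(self.endpoints):
            self._ejected_until[endpoint] = now + self.base_ejection_time
            self._streak[endpoint] = 0
```

With four endpoints and the default 50% cap, at most two can be out of rotation at once; a third failing endpoint stays in the pool, which is the "prevents complete unavailability" behavior that maxEjectionPercent is meant to provide.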
5 / 5
A platform engineer discusses service mesh alternatives at an architecture review: "Istio is the full-featured option but it adds operational complexity — istiod, sidecar injector, the CRDs, the debug overhead. For teams that just need mTLS and basic observability, Linkerd is simpler: a Rust-based micro-proxy (not Envoy), lower resource overhead, easier to operate. Cilium with eBPF goes further — it implements the mesh at the kernel level without sidecars, so no added latency from proxy hops. The trade-off: Cilium requires newer kernels and is harder to debug. Choose based on your requirements: features vs simplicity vs performance." What is the main trade-off of using a sidecar-less (eBPF-based) service mesh like Cilium compared to a sidecar mesh like Istio?
Sidecar mesh costs:
Memory: each pod gets an Envoy sidecar (typically 50-100 MB RAM).
Latency: each service-to-service hop adds roughly 0.5-1 ms, because the request passes through two proxies (client-side sidecar → network → server-side sidecar). For microservices with many internal calls, this adds up.
Complexity: sidecar injection has to be managed, and debugging is harder with two containers per pod.
Sidecar-less / eBPF mesh (Cilium, Calico with eBPF): network policies and observability are enforced in the Linux kernel using eBPF programs. No sidecar container. Near-zero overhead.
Cilium Service Mesh (Cilium + Hubble): provides L3/L4 network policy, L7 policy (HTTP, Kafka, DNS), flow visibility (Hubble UI), and mTLS via SPIRE. Ideally requires Linux kernel 5.2+.
Service mesh landscape:
Istio: full-featured, mature, large community. Envoy data plane. Complex to operate.
Linkerd: simpler, Rust micro-proxy (lighter than Envoy), better UX. Less feature-rich.
Consul Connect: HashiCorp, cross-platform (VMs + K8s).
AWS App Mesh: managed, Envoy-based, AWS-native integrations.
Cilium: eBPF-based CNI + mesh.
Istio ambient mode (available in Istio releases from 2023 onward): moves Istio to a sidecar-less "ambient" model using a per-node proxy (ztunnel) for L4 and optional waypoint proxies for L7. It aims to reduce sidecar overhead while keeping Istio's features.
In conversation: 'We chose Linkerd over Istio because our team can actually understand it. The simpler model means fewer 3am incidents trying to debug why a VirtualService isn't matching.'
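The sidecar cost figures above (50-100 MB per sidecar, roughly 0.5-1 ms per hop) are easier to judge when multiplied out. The following back-of-the-envelope sketch does that arithmetic; the function names are hypothetical, the example sizes (8 sequential calls, 200 pods) are made up for illustration, and real overhead depends on workload and configuration.

```python
from typing import Tuple

# Rough ranges quoted in the text above; treat them as estimates, not measurements.
PER_HOP_LATENCY_MS: Tuple[float, float] = (0.5, 1.0)
PER_SIDECAR_MEMORY_MB: Tuple[int, int] = (50, 100)


def added_latency_ms(sequential_calls: int) -> Tuple[float, float]:
    """Extra latency (low, high) when a request makes N sequential mesh hops."""
    low, high = PER_HOP_LATENCY_MS
    return (sequential_calls * low, sequential_calls * high)


def sidecar_memory_mb(pods: int) -> Tuple[int, int]:
    """Total sidecar memory (low, high) for a given number of meshed pods."""
    low, high = PER_SIDECAR_MEMORY_MB
    return (pods * low, pods * high)


# Hypothetical example: a request that fans through 8 sequential internal
# calls, in a cluster running 200 meshed pods.
latency_range = added_latency_ms(8)     # (4.0, 8.0) ms of added latency
memory_range = sidecar_memory_mb(200)   # (10000, 20000) MB of sidecar RAM
```

Numbers like these are what the features vs simplicity vs performance trade-off comes down to: a few milliseconds and some gigabytes may be negligible for one team and decisive for another.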