Learn vocabulary for Chaos Monkey, Gremlin, AWS FIS, LitmusChaos, ChaosBlade, and runbook-as-code concepts.
1 / 5
What is Chaos Monkey in chaos engineering vocabulary?
Chaos Monkey (Netflix, 2010) is one of the original chaos engineering tools. It randomly terminates EC2 instances during business hours, when engineers are on hand to respond, so that teams build services that tolerate instance failure. It was part of the Simian Army (a collection of chaos tools). Key vocabulary: 'opt-in chaos,' 'random termination,' 'resilience validation.' Chaos Monkey is intentionally simple: it does one thing (terminate instances) well.
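The selection logic behind 'random termination' and 'opt-in chaos' can be sketched in a few lines. This is an illustrative model only, not Chaos Monkey's actual code: the `InstanceGroup` type, the `chaos_enabled` flag, and the 09:00-15:00 weekday window are assumptions made for the example, and nothing is really terminated.

```python
import random
from dataclasses import dataclass
from datetime import datetime


@dataclass
class InstanceGroup:
    # Hypothetical model of a group of instances behind one service.
    name: str
    instance_ids: list
    chaos_enabled: bool = False  # opt-in: groups are skipped unless enabled


def in_business_hours(now: datetime) -> bool:
    # Assumed window: Monday to Friday, 09:00 to 15:00 local time.
    return now.weekday() < 5 and 9 <= now.hour < 15


def pick_victims(groups, now, rng):
    """Return at most one random instance ID per opted-in group.

    Returns an empty list outside the assumed business-hours window.
    """
    if not in_business_hours(now):
        return []
    victims = []
    for group in groups:
        if group.chaos_enabled and group.instance_ids:
            victims.append(rng.choice(group.instance_ids))
    return victims


groups = [
    InstanceGroup("checkout", ["i-aaa", "i-bbb"], chaos_enabled=True),
    InstanceGroup("billing", ["i-ccc"]),  # not opted in, never targeted
]

# A real tool would call the cloud provider's terminate API for each victim;
# this sketch only prints the selection.
print(pick_victims(groups, datetime(2024, 3, 6, 10, 30), random.Random(7)))
```

Passing the clock and the random generator in as arguments keeps the selection deterministic under test, which matters for a tool whose whole job is to be random in production.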
2 / 5
What distinguishes Gremlin from open-source chaos tools in the vocabulary?
Gremlin vocabulary: 'attack' (a specific chaos fault), 'attack catalog' (library of fault types), 'target' (which hosts/containers to attack), 'scenario' (composed multi-step experiment). Gremlin's differentiators: broad fault library, halt button (stop all attacks instantly), RBAC, compliance reporting, and integrations with PagerDuty/Datadog. Compared to DIY tools, Gremlin reduces the engineering overhead of building and operating chaos infrastructure.
3 / 5
What is AWS Fault Injection Service (FIS, formerly Fault Injection Simulator) in chaos engineering vocabulary?
AWS FIS vocabulary: 'experiment template' (defines targets, actions, and stop conditions), 'action' (a specific fault, e.g. aws:ec2:terminate-instances or aws:rds:failover-db-cluster), 'stop condition' (a CloudWatch alarm that halts the experiment when it goes into the alarm state), 'target' (resources selected by tag, ARN, or filter, optionally narrowed to a count or percentage). FIS is tightly integrated with AWS: most AWS resource faults need no agents. Pairs well with CloudWatch dashboards for observing experiment impact.
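To make the template vocabulary concrete, here is a minimal sketch that assembles the targets, actions, and stop conditions into one dictionary. The helper `build_experiment_template`, the target name `taggedInstances`, and both ARNs are placeholders invented for illustration. The field names follow the FIS experiment template shape as I understand it, so check them against the current API reference before relying on them.

```python
import json


def build_experiment_template(role_arn, alarm_arn, tag_key, tag_value, percent):
    """Build a dict shaped like an FIS experiment template (hypothetical helper)."""
    if not 1 <= percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    return {
        "description": "Terminate a share of tagged EC2 instances",
        "roleArn": role_arn,
        # 'target': which resources the experiment may touch.
        "targets": {
            "taggedInstances": {
                "resourceType": "aws:ec2:instance",
                "resourceTags": {tag_key: tag_value},
                "selectionMode": f"PERCENT({percent})",
            }
        },
        # 'action': the fault to inject, pointed at the target above.
        "actions": {
            "terminate": {
                "actionId": "aws:ec2:terminate-instances",
                "targets": {"Instances": "taggedInstances"},
            }
        },
        # 'stop condition': a CloudWatch alarm that halts the experiment.
        "stopConditions": [
            {"source": "aws:cloudwatch:alarm", "value": alarm_arn}
        ],
    }


template = build_experiment_template(
    role_arn="arn:aws:iam::123456789012:role/example-fis-role",  # placeholder
    alarm_arn="arn:aws:cloudwatch:us-east-1:123456789012:alarm:example-error-rate",  # placeholder
    tag_key="chaos",
    tag_value="enabled",
    percent=25,
)
print(json.dumps(template, indent=2))
```

The sketch stops at building the document, which makes it reviewable and testable without AWS credentials. Submitting it (for example through the `create_experiment_template` call on boto3's `fis` client) would be a separate step.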
4 / 5
What is LitmusChaos in the Kubernetes chaos engineering vocabulary?
LitmusChaos vocabulary: ChaosHub (catalog of pre-built experiments), ChaosEngine (links an application to an experiment), ChaosExperiment (defines the fault), ChaosResult (records the outcome). Kubernetes-native: experiments are declared as YAML custom resources (backed by CRDs) and executed by the chaos operator. Supports pod-level faults (kill, CPU hog, memory hog), node faults (drain, taint), and network faults (packet loss, latency). CNCF incubating project (it entered the CNCF as a sandbox project in 2020) with an active community.
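As a rough illustration of how a ChaosEngine ties an application to an experiment, the sketch below builds a manifest as a Python dictionary and prints it as JSON, which `kubectl` accepts alongside YAML. The function name, the service account name, and the example values are hypothetical. The field names reflect the `litmuschaos.io/v1alpha1` ChaosEngine schema as I recall it and may differ between Litmus versions.

```python
import json


def chaos_engine(name, namespace, app_label, experiment, duration_s):
    """Build a ChaosEngine-style manifest (hypothetical helper, schema assumed)."""
    return {
        "apiVersion": "litmuschaos.io/v1alpha1",
        "kind": "ChaosEngine",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "engineState": "active",
            # Which application the experiment is linked to.
            "appinfo": {
                "appns": namespace,
                "applabel": app_label,
                "appkind": "deployment",
            },
            "chaosServiceAccount": "example-chaos-sa",  # placeholder name
            # Which ChaosExperiment to run, with its tunables as env vars.
            "experiments": [
                {
                    "name": experiment,
                    "spec": {
                        "components": {
                            "env": [
                                {
                                    "name": "TOTAL_CHAOS_DURATION",
                                    "value": str(duration_s),
                                }
                            ]
                        }
                    },
                }
            ],
        },
    }


engine = chaos_engine("checkout-chaos", "shop", "app=checkout", "pod-delete", 30)
print(json.dumps(engine, indent=2))
```

Generating the manifest from code is only one option. Many teams keep the YAML by hand in the repository and let the operator pick it up when it is applied.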
5 / 5
What is 'runbook-as-code' in chaos engineering vocabulary?
Runbook-as-code extends infrastructure-as-code principles to operational procedures: the experiment definition, hypothesis, target scope, stop conditions, rollback steps, and expected outcome are all stored in version control alongside application code. Tools such as Gremlin scenarios, AWS FIS templates, or Litmus ChaosEngine manifests implement this. Benefits: peer review via pull requests, repeatability, an audit trail, and CI/CD integration for automated resilience testing.
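One way to picture the CI/CD side is a small check that refuses to merge an incomplete runbook. The sketch below is an assumption-laden example rather than any tool's real format: the `ExperimentRunbook` fields mirror the list in the answer above, and the `lint` function and its messages are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ExperimentRunbook:
    # Hypothetical schema mirroring the fields named in the card above.
    name: str
    hypothesis: str
    target_scope: str
    stop_conditions: list = field(default_factory=list)
    rollback_steps: list = field(default_factory=list)
    expected_outcome: str = ""


def lint(runbook):
    """Return a list of problems; an empty list means the runbook is complete."""
    problems = []
    if not runbook.hypothesis.strip():
        problems.append("missing hypothesis")
    if not runbook.target_scope.strip():
        problems.append("missing target scope")
    if not runbook.stop_conditions:
        problems.append("no stop conditions")
    if not runbook.rollback_steps:
        problems.append("no rollback steps")
    if not runbook.expected_outcome.strip():
        problems.append("missing expected outcome")
    return problems


draft = ExperimentRunbook(
    name="checkout-instance-loss",
    hypothesis="Checkout stays available when one instance is terminated",
    target_scope="25% of instances tagged chaos=enabled",
)
print(lint(draft))
```

A CI job could run such a check on every pull request and fail the build when the list is non-empty, so that missing stop conditions or rollback steps are caught in review rather than during an experiment.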