Cloud Computing Vocabulary: 30 Essential Terms Explained
The 30 most important cloud computing terms every developer needs to know: regions, availability zones, serverless, IaC, IAM, auto-scaling, managed services, SLA, and more — with real examples.
Cloud computing has its own vocabulary — and it is one of the most important vocabulary sets for any modern developer to master. Whether you are reading AWS documentation, joining a DevOps team, preparing for a cloud certification, or discussing infrastructure in a technical interview, these terms will come up constantly.
This reference covers the 30 most essential cloud computing vocabulary terms, grouped by theme, with clear definitions and real-world example sentences.
Infrastructure Fundamentals
Region
A region is a geographic area where a cloud provider operates one or more data centres. AWS regions include us-east-1 (Virginia), eu-west-1 (Ireland), and ap-southeast-1 (Singapore). Google Cloud and Azure have equivalent regional divisions.
Each region is independent — resources in one region do not automatically replicate to another. Your choice of region affects latency (distance to your users), compliance (data residency laws), and service availability (not all services are available in all regions).
“We migrated our European users’ data from us-east-1 to eu-west-1 to comply with GDPR data residency requirements.”
Availability Zone (AZ)
An availability zone is an isolated data centre (or cluster of data centres) within a region. A region typically has 3 or more AZs. They are physically separated but connected by high-speed low-latency links.
Deploying across multiple AZs provides high availability — if a power failure or natural disaster affects one AZ, your application continues serving traffic from the others.
“We run three instances of our API — one per availability zone — so a single AZ outage doesn’t cause downtime.”
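The arithmetic behind multi-AZ deployments is worth seeing once. A quick sketch, assuming AZ failures are independent and using a purely hypothetical 0.1% per-AZ outage probability (real failure rates vary and failures are not always independent):

```python
# Hypothetical figure: assume a 0.1% chance a given AZ is down at any moment,
# and that AZ failures are independent.
p_az_down = 0.001

# Single-AZ deployment: down whenever that one AZ is down.
p_single_down = p_az_down

# Three-AZ deployment: down only if all three AZs fail at the same time.
p_triple_down = p_az_down ** 3

print(f"Single-AZ unavailability: {p_single_down:.3%}")  # 0.100%
print(f"Three-AZ unavailability:  {p_triple_down:.9%}")
```

Under these assumptions, spreading across three AZs turns a 1-in-1,000 outage probability into roughly 1-in-a-billion, which is why multi-AZ is the default recommendation for production workloads.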
Virtual Private Cloud (VPC)
A VPC is a logically isolated network within a cloud provider’s infrastructure. You define your own IP address range (CIDR block), create subnets (public and private), set up route tables, and control traffic with security groups and network ACLs.
Resources in a VPC are not reachable from the internet by default — you must explicitly configure access. This is the foundation of cloud network security.
“All our database instances are in private subnets with no internet route — only the application servers in the public subnet can reach them.”
Subnet (Public vs Private)
A subnet is a range of IP addresses within a VPC. A public subnet has a route to the internet gateway — resources here can send and receive internet traffic. A private subnet has no direct internet route — resources here are not directly reachable from the internet.
Best practice: put web servers and load balancers in public subnets; put databases, caches, and internal services in private subnets.
“The RDS database instance is in a private subnet — it can only be reached from the application servers within the same VPC.”
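CIDR maths is easy to explore with Python's standard-library `ipaddress` module. A sketch that carves a hypothetical /24 VPC range into /26 subnets (the addresses and the public/private split are illustrative):

```python
import ipaddress

# Hypothetical VPC CIDR block, split into four /26 subnets of 64 addresses each.
vpc = ipaddress.ip_network("10.0.0.0/24")
subnets = list(vpc.subnets(new_prefix=26))

public, private = subnets[0], subnets[1]
print(public)   # 10.0.0.0/26
print(private)  # 10.0.0.64/26

# Check which subnet a given instance IP belongs to.
db_ip = ipaddress.ip_address("10.0.0.70")
print(db_ip in private)  # True
```

Whether a subnet is "public" or "private" is not a property of the CIDR range itself; it is determined by whether the subnet's route table has a route to an internet gateway.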
Compute
Virtual Machine (VM) / Instance
A virtual machine (called an instance in cloud context) is a virtualised server running on shared physical hardware. You choose the instance type (CPU, memory, storage), operating system, and network configuration.
Cloud instances are billed per hour or per second and can be started, stopped, and terminated on demand.
“We upgraded the application server from a t3.medium to a c6i.large instance — the compute-optimised instance type reduced API response times by 35%.”
Serverless / Function as a Service (FaaS)
Serverless is an execution model where you deploy code without managing servers. The cloud provider allocates compute on demand, scales automatically (including to zero), and charges only for execution time. Examples: AWS Lambda, Google Cloud Functions, Azure Functions.
A cold start is the latency overhead when a function is invoked for the first time after being idle — the provider must spin up a new execution environment.
“Our image processing runs as a Lambda function — it only executes when a user uploads a file, so we pay nothing during quiet hours.”
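A serverless function is, at its core, just a handler you hand to the provider. A minimal Lambda-style sketch (the event shape below is a simplified version of an S3 upload notification, and the bucket/key values are made up):

```python
import json

def handler(event, context=None):
    """Minimal Lambda-style handler sketch for a simplified S3 upload event."""
    # Pull the bucket and key out of the (assumed) S3 event structure.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]
    # Real image processing would happen here; we just report what arrived.
    return {
        "statusCode": 200,
        "body": json.dumps({"processed": f"s3://{bucket}/{key}"}),
    }

# Invoke locally with a fake event -- no servers involved until deployed.
event = {"Records": [{"s3": {"bucket": {"name": "uploads"},
                             "object": {"key": "cat.png"}}}]}
print(handler(event)["body"])
```

Because the handler is a plain function, it can be unit-tested locally with fabricated events long before it is ever deployed.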
Container
A container is a lightweight, portable unit of software that packages code, runtime, dependencies, and configuration. Containers share the host OS kernel (unlike VMs, which virtualise the entire OS). Docker is the most common container runtime; Kubernetes is the most common orchestrator.
“Every service is containerised — a developer can run the entire stack locally with docker-compose and the production environment is identical.”
Spot Instance / Preemptible VM
A spot instance (AWS) or preemptible VM (GCP) is a heavily discounted cloud instance that can be interrupted (terminated) by the provider with short notice (typically 2 minutes) when they need the capacity back. Usually 60–90% cheaper than on-demand instances.
Suitable for: batch jobs, ML training, CI/CD, stateless workers — anything that can handle interruption and restart.
“Our ML training pipeline uses Spot Instances exclusively — each job checkpoints every 10 minutes, so an interruption only loses a few minutes of work, saving 75% on compute costs.”
Auto-scaling
Auto-scaling automatically adjusts the number of running instances based on demand. Scale out: add instances when load increases. Scale in: remove instances when load drops. Triggered by metrics like CPU utilisation, request rate, or queue depth.
“The auto-scaling group maintains a minimum of 2 instances and scales out to 20 during peak traffic. We saved 40% on EC2 costs by moving from static over-provisioning to auto-scaling.”
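The decision logic behind target-tracking auto-scaling can be sketched in a few lines. This is a simplification of what providers actually do (the 60% CPU target and the 2–20 bounds are illustrative, matching the example above):

```python
import math

def desired_capacity(current, cpu_percent, target=60.0, minimum=2, maximum=20):
    """Target-tracking sketch: size the fleet so average CPU approaches target."""
    # Scale the fleet proportionally to how far the metric is from the target,
    # then clamp to the configured min/max bounds.
    desired = math.ceil(current * cpu_percent / target)
    return max(minimum, min(maximum, desired))

print(desired_capacity(current=4, cpu_percent=90))  # 6 -> scale out
print(desired_capacity(current=4, cpu_percent=20))  # 2 -> scale in, floor at min
```

Real implementations add cooldown periods and smoothing so the fleet does not oscillate on every metric blip.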
Storage
Object Storage
Object storage stores data as objects (files) with a unique key and metadata, accessed via HTTP API. There is no directory hierarchy — objects are just keys in a flat namespace. Infinitely scalable, highly durable (S3 offers 11 nines — 99.999999999% durability), and cheap at scale.
Examples: Amazon S3, Google Cloud Storage (GCS), Azure Blob Storage.
“All user-uploaded media is stored in S3 — files are served via CloudFront CDN, so we never pay egress on cached assets.”
Block Storage
Block storage provides raw disk storage for virtual machines — equivalent to a hard drive attached to a server. Used for operating systems, databases, and application data that requires a filesystem. Examples: AWS EBS (Elastic Block Store), GCP Persistent Disk.
“The PostgreSQL database runs on an instance with a 500GB EBS gp3 volume — we chose gp3 over gp2 because the throughput can be tuned independently of size.”
Egress (and Egress Cost)
Egress is data flowing out of a cloud provider’s network (to the internet or to another cloud). Cloud providers charge for egress; ingress (data in) is typically free. Egress costs can be significant for data-intensive applications.
“Moving 1TB of data out of AWS to our office costs $90 in egress fees — this is a significant budget consideration when planning data migrations or multi-cloud architectures.”
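The back-of-envelope arithmetic is simple. A sketch assuming a flat $0.09/GB internet egress rate (real AWS pricing is tiered, includes a free allowance, and varies by region):

```python
# Assumed flat rate for illustration; actual cloud egress pricing is tiered.
RATE_PER_GB = 0.09

def egress_cost(gigabytes):
    """Approximate dollar cost of moving data out to the internet."""
    return gigabytes * RATE_PER_GB

print(f"${egress_cost(1024):,.2f} to move 1 TB out")  # $92.16 to move 1 TB out
```

At this rate, routinely shipping terabytes between clouds or to on-premises systems adds up quickly, which is why architectures often keep compute close to the data.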
Networking
Load Balancer
A load balancer distributes incoming traffic across multiple backend instances. It performs health checks and routes traffic only to healthy instances. Types: ALB (Application Load Balancer — HTTP/HTTPS, path-based routing), NLB (Network Load Balancer — TCP/UDP, ultra-low latency).
“The ALB routes /api/* requests to the backend service and everything else to the frontend service. If any instance fails 3 consecutive health checks, it is removed from the target group automatically.”
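The core behaviour — round-robin distribution that skips unhealthy targets — fits in a toy class. A sketch only (the IPs are made up, and real load balancers run the health-check probes themselves):

```python
import itertools

class LoadBalancer:
    """Toy round-robin balancer that skips targets marked unhealthy."""

    def __init__(self, targets):
        self.healthy = {t: True for t in targets}
        self._ring = itertools.cycle(targets)

    def mark(self, target, healthy):
        # In a real ALB this state comes from periodic health-check probes.
        self.healthy[target] = healthy

    def route(self):
        # Walk the ring until a healthy target turns up.
        for _ in range(len(self.healthy)):
            t = next(self._ring)
            if self.healthy[t]:
                return t
        raise RuntimeError("no healthy targets")

lb = LoadBalancer(["10.0.1.5", "10.0.2.5", "10.0.3.5"])
lb.mark("10.0.2.5", healthy=False)  # failed its health checks
print([lb.route() for _ in range(4)])
# ['10.0.1.5', '10.0.3.5', '10.0.1.5', '10.0.3.5']
```

Traffic flows only to the two healthy instances; once the failed instance passes its checks again, marking it healthy puts it back into rotation.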
CDN (Content Delivery Network)
A CDN is a globally distributed network of edge servers that cache and serve content close to end users, reducing latency for static assets (images, CSS, JavaScript). Examples: CloudFront, Cloudflare, Fastly, Akamai.
“After configuring CloudFront, time-to-first-byte for European users dropped from 450ms to 35ms, as assets are now served from Frankfurt instead of our US origin.”
DNS (Domain Name System)
DNS translates human-readable domain names (api.example.com) into IP addresses. Cloud providers offer managed DNS with features like health checks, failover routing, geolocation routing, and latency-based routing. AWS Route 53 is a common example.
“We use Route 53 with weighted routing to split traffic between two regions — 90% to us-east-1 and 10% to eu-west-1 as a gradual rollout.”
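Weighted routing is just weighted random selection at resolution time. A sketch mirroring the 90/10 split above (a simplification — Route 53 also layers in health checks and TTL caching):

```python
import random

def pick_region(weights, rng=random):
    """Weighted-routing sketch: choose a region in proportion to its weight."""
    regions = list(weights)
    return rng.choices(regions, weights=[weights[r] for r in regions])[0]

weights = {"us-east-1": 90, "eu-west-1": 10}
sample = [pick_region(weights) for _ in range(10_000)]
print(f"{sample.count('us-east-1') / len(sample):.1%} routed to us-east-1")
```

Dialling the weights from 90/10 towards 0/100 over several days is a common way to de-risk a region migration.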
Managed Services and Databases
Managed Service
A managed service is a cloud provider-operated service where the provider handles all infrastructure management: hardware provisioning, OS patching, backups, failover, and scaling. You consume the service without managing the underlying servers.
Examples: Amazon RDS (managed relational database), Amazon ElastiCache (managed Redis/Memcached), Amazon SQS (managed message queue).
“We replaced our self-managed PostgreSQL cluster with Amazon RDS Multi-AZ — automated failover, daily backups, and minor version upgrades are now handled by AWS. Our team focuses on schema design and query optimisation.”
RDS (Relational Database Service)
RDS is AWS’s managed relational database service. It supports PostgreSQL, MySQL, MariaDB, Oracle, and SQL Server. Key features: automated backups, point-in-time restore, read replicas, Multi-AZ deployment for high availability, and performance monitoring.
“RDS Multi-AZ provides automatic failover — if the primary instance fails, AWS promotes the standby to primary and updates the DNS endpoint within 60–120 seconds.”
Message Queue / SQS
A message queue decouples components by allowing a producer to write messages to a queue that a consumer reads at its own pace. This absorbs traffic spikes and prevents overloading downstream services. AWS SQS (Simple Queue Service) is the most common managed queue.
“The order processing service publishes events to SQS — if the fulfilment service is temporarily down, orders queue up safely and are processed when it recovers. No orders are lost.”
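The decoupling pattern can be demonstrated in-process with Python's standard-library `queue` — a stand-in for SQS, not a substitute for it (SQS adds durability, visibility timeouts, and retries):

```python
import queue
import threading

# In-process stand-in for a message queue: the producer enqueues a burst of
# orders, and the consumer drains them at its own pace.
orders = queue.Queue()
processed = []

def fulfilment_worker():
    while True:
        order = orders.get()
        if order is None:          # sentinel: shut the worker down
            break
        processed.append(order)    # stand-in for real fulfilment work
        orders.task_done()

worker = threading.Thread(target=fulfilment_worker)
worker.start()

for i in range(100):               # traffic spike: 100 orders at once
    orders.put(f"order-{i}")

orders.join()                      # block until every order is processed
orders.put(None)
worker.join()
print(len(processed))  # 100
```

The producer never waits on the consumer; if the worker were slow or briefly down, the orders would simply sit in the queue until it caught up.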
Security and Access
IAM (Identity and Access Management)
IAM is the cloud permission system — it controls who (users, roles, services) can perform which actions on which resources. It follows the principle of least privilege: every service should have the minimum permissions needed for its function.
A common pattern: an EC2 instance has an IAM role attached that allows it to read from S3, but nothing else.
“The Lambda function uses an IAM role that allows only s3:GetObject on one specific bucket. If the function is compromised, the attacker cannot access any other AWS resource.”
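The policy in that example would look roughly like this as an IAM policy document (the bucket name is hypothetical):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-uploads-bucket/*"
    }
  ]
}
```

Everything not explicitly allowed is denied by default, which is what makes a narrow policy like this an effective blast-radius limiter.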
Security Group
A security group is a virtual firewall for EC2 instances and other resources. It controls inbound and outbound traffic at the instance level through allow-rules on port, protocol, and source/destination. Security groups are stateful — if you allow inbound traffic, the return traffic is automatically allowed.
“The database security group allows inbound traffic on port 5432 only from the application servers’ security group — not from the internet or from any other service.”
Secret / Secrets Manager
Cloud-based secrets managers (AWS Secrets Manager, GCP Secret Manager, HashiCorp Vault) securely store and control access to credentials, API keys, database passwords, and certificates. They enable automatic secret rotation without redeploying applications.
“Database passwords are stored in AWS Secrets Manager and rotated automatically every 30 days — the application retrieves the current password at startup, so there are no hardcoded credentials in the codebase.”
Infrastructure as Code and Deployment
Infrastructure as Code (IaC)
IaC is the practice of managing cloud infrastructure through code files (YAML, JSON, or domain-specific language) rather than clicking in the console. Changes are version-controlled, peer-reviewed, and applied through automation. Examples: Terraform, AWS CloudFormation, Pulumi, CDK.
“Nobody clicks in the console — all infrastructure is defined in Terraform. Pull requests for infra changes require two approvals. If someone accidentally deletes a resource, terraform apply restores it within minutes.”
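As a flavour of what IaC looks like in practice, here is a minimal Terraform sketch (the bucket name and tags are hypothetical):

```hcl
# Hypothetical example: an S3 bucket defined as code rather than in the console.
resource "aws_s3_bucket" "uploads" {
  bucket = "example-uploads-bucket"

  tags = {
    Environment = "production"
    ManagedBy   = "terraform"
  }
}
```

Because this file lives in version control, the bucket's existence and configuration are reviewable, diffable, and reproducible — `terraform plan` shows exactly what would change before anything is applied.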
CI/CD Pipeline
A CI/CD pipeline is an automated workflow that builds, tests, and deploys code changes. CI (Continuous Integration): automatically build and test on every push. CD (Continuous Delivery/Deployment): automatically deliver tested code to staging or production.
“Every pull request triggers the CI pipeline — unit tests, integration tests, and a security scan. If all checks pass and the PR is approved, merging to main automatically deploys to production via the CD pipeline.”
Blue/Green Deployment
Blue/green deployment maintains two identical production environments. The current production is “blue”; the new version is deployed to “green”. Traffic is switched from blue to green instantly (or gradually). If the new version has issues, rollback is instant — switch back to blue.
“We use blue/green deployments for zero-downtime releases — the green environment runs the new version and passes health checks before we shift traffic. Rollback takes 30 seconds.”
Canary Release
A canary release gradually rolls out a new version to a small percentage of users first (the “canary”) before expanding to everyone. This limits the blast radius of a bad deployment. If the canary shows elevated error rates or degraded performance, the rollout is halted.
“We deployed the new recommendation model as a 5% canary — it ran alongside the existing model for 48 hours. With no degradation in metrics, we expanded to 100%.”
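Canary assignment is typically "sticky": the same user always lands in the same bucket so their experience is stable mid-rollout. A sketch using a stable hash (function and user-id format are illustrative):

```python
import zlib

def assign_version(user_id: str, canary_percent: int) -> str:
    """Sticky canary bucketing sketch: hash each user into one of 100
    buckets so they consistently see the same version during a rollout."""
    # crc32 is stable across processes, unlike Python's built-in hash().
    bucket = zlib.crc32(user_id.encode()) % 100
    return "canary" if bucket < canary_percent else "stable"

users = [f"user-{i}" for i in range(1000)]
share = sum(assign_version(u, 5) == "canary" for u in users) / len(users)
print(f"{share:.1%} of users on the canary")  # roughly 5%
```

Expanding the rollout is then just raising `canary_percent` — users already on the canary stay on it, and new buckets join as the threshold rises.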
Monitoring and Operations
SLA / SLO / SLI
- SLA (Service Level Agreement): a contractual commitment to customers (e.g., 99.9% monthly uptime)
- SLO (Service Level Objective): an internal target (e.g., 99.95% uptime, 99th percentile latency < 200ms)
- SLI (Service Level Indicator): the metric you actually measure (e.g., measured uptime, measured P99 latency)
“Our SLA promises 99.9% availability. Our internal SLO is 99.95% — we give ourselves a buffer above the customer commitment. SLIs from our monitoring confirm we exceeded the SLO last month.”
Observability (Logs, Metrics, Traces)
Observability is the ability to understand a system’s internal state by examining its outputs. The three pillars:
- Logs: text records of events
- Metrics: numerical measurements over time (CPU %, request rate, error rate)
- Traces: records of a request’s journey across services
“We instrument every service with OpenTelemetry — logs go to CloudWatch, metrics to Prometheus/Grafana, and distributed traces to Jaeger. When an alert fires, we can see the exact trace that caused the error.”
Error Budget
An error budget is the maximum allowable downtime or error rate within a period, derived from the SLO. If your SLO is 99.9% availability per month, your error budget is 0.1% × 43,200 minutes = ~43 minutes of downtime. Once the budget is exhausted, new deployments are paused until reliability is restored.
“We burned through 60% of our error budget in one incident last week — the eng team is now in freeze mode for new features until we ship fixes for the reliability issues behind the incident.”
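The error budget arithmetic from the definition above, as a small calculator (assuming a 30-day month):

```python
def error_budget_minutes(slo_percent, days=30):
    """Minutes of allowed downtime per window for a given availability SLO."""
    total_minutes = days * 24 * 60  # 43,200 minutes in a 30-day month
    return total_minutes * (1 - slo_percent / 100)

print(round(error_budget_minutes(99.9)))   # 43  -> ~43 minutes/month
print(round(error_budget_minutes(99.95)))  # 22
print(round(error_budget_minutes(99.99)))  # 4
```

Each extra "nine" shrinks the budget by an order of magnitude, which is why 99.99% availability is dramatically harder (and more expensive) to operate than 99.9%.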
Health Check
A health check is a periodic probe sent to a service to verify it is responding correctly. Load balancers and orchestrators (Kubernetes) use health checks to route traffic only to healthy instances. Two types:
- Liveness probe: is the process alive?
- Readiness probe: is the instance ready to accept traffic?
“The readiness probe calls /health/ready — it returns 503 until the database connection pool is warmed up. This prevents the load balancer from sending traffic to an instance that is still initialising.”
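The readiness pattern described above boils down to a gate that flips once initialisation completes. A framework-free sketch (the endpoint name and status messages are illustrative):

```python
class App:
    """Sketch of a readiness gate: report 503 until the app can serve traffic."""

    def __init__(self):
        self.pool_warmed = False  # flips to True once DB connections exist

    def health_ready(self):
        # A load balancer only routes traffic here once this returns 200.
        if self.pool_warmed:
            return (200, "ready")
        return (503, "warming up")

app = App()
print(app.health_ready())  # (503, 'warming up')
app.pool_warmed = True     # e.g. after the connection pool initialises
print(app.health_ready())  # (200, 'ready')
```

Liveness would be a separate, simpler check: it answers "is the process alive?" and a failure triggers a restart, whereas a failing readiness check only withholds traffic.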
Quick reference: cloud vocabulary at a glance
| Term | Short definition |
|---|---|
| Region | Geographic area with one or more data centres |
| Availability Zone | Isolated data centre within a region |
| VPC | Your private network in the cloud |
| Load Balancer | Distributes traffic across instances |
| Auto-scaling | Automatically adjusts instance count to match demand |
| Serverless | Run code without managing servers; pay per execution |
| Object Storage | S3-style flat file storage accessed via HTTP |
| Managed Service | Cloud-operated service: no infrastructure to manage |
| IAM | Cloud permission system — who can do what |
| IaC | Define infrastructure as code; version-controlled |
| SLO / SLA / SLI | Service reliability targets, contracts, and measurements |
| Canary release | Gradual rollout to a small % before full deployment |
| Error budget | Allowed amount of downtime within an SLO period |
| Egress | Outbound data leaving the cloud provider network |
| Health check | Periodic probe to verify a service is responding correctly |
Learning cloud vocabulary pays compound interest — every AWS certification, architecture review, DevOps role, and technical interview draws on these same core terms.