Intermediate Vocabulary #storage #cloud #distributed-systems #infrastructure

Storage Systems Vocabulary

5 exercises — Practice the English vocabulary used when discussing storage architecture: block, object, and file storage; SAN and NFS; distributed file systems; performance metrics; and data durability strategies.

Core Storage Systems vocabulary clusters

Storage types: block storage (raw volumes, iSCSI), object storage (S3-compatible, metadata, flat namespace), file storage (NFS, SMB, POSIX hierarchy), SAN (Storage Area Network), NAS (Network Attached Storage)
Distributed file systems: HDFS (Hadoop — namenode/datanode), Ceph (RADOS, RBD, CephFS), GlusterFS — horizontal scale, fault tolerance
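Performance metrics: IOPS (input/output operations per second), throughput (data transferred per second, e.g. MB/s), latency (delay per operation, e.g. time to first byte), bandwidth (maximum data rate of a link or device)
Durability and redundancy: replication factor, erasure coding (parity shards), availability zone, durability (e.g. 99.999999999%, "eleven nines"), WAL (write-ahead log)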
Data lifecycle: tiered storage (hot/warm/cold), object lifecycle policy, S3-compatible API, presigned URL, multipart upload
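
To make "erasure coding (parity shards)" concrete, here is a minimal sketch of its simplest case: one XOR parity shard over three data shards. This is an illustration only. Production systems typically use Reed-Solomon codes with several parity shards, and the function names below are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(shards: List[bytes]) -> bytes:
    """Compute one parity shard over equal-length data shards."""
    return reduce(xor_bytes, shards)


def rebuild(shards: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one missing shard (marked None) from the survivors and the parity."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost shard")
    repaired = list(shards)
    if missing:
        survivors = [s for s in shards if s is not None]
        # XOR of the parity with every surviving shard yields the lost shard
        repaired[missing[0]] = reduce(xor_bytes, survivors, parity)
    return repaired


data = [b"blok", b"objt", b"file"]     # three equal-length data shards
parity = make_parity(data)             # one parity shard
restored = rebuild([data[0], None, data[2]], parity)

# Raw capacity needed per byte of user data
replication_overhead = 3 / 1           # replication factor 3: three full copies
erasure_overhead = (3 + 1) / 3         # 3 data shards + 1 parity shard
```

In this sketch both schemes survive the loss of one shard or copy, but the parity layout needs about 1.33 bytes of raw capacity per byte stored, against 3 bytes for triple replication.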

1 / 5

A solutions architect explains storage choices to a new team member:
"For the virtual machines we provision raw block storage volumes; each VM sees its volume as a local disk. For our media pipeline, we use object storage: files are stored as objects with metadata, addressed by a unique key, and accessed over HTTP. For the shared analytics workspace where data scientists need a standard directory tree, we mount an NFS volume so they can use normal file paths."
Which storage type is best suited for storing billions of images accessed via an HTTP API?
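
The architect's phrase "objects with metadata, addressed by a unique key" can be sketched with a small in-memory model. This is not a real storage client: `ObjectStore` and its methods are hypothetical names, and a real service would expose the same operations over HTTP.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict = field(default_factory=dict)


class ObjectStore:
    """In-memory stand-in for an object store: one flat map from key to object."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, **metadata):
        self._objects[key] = StoredObject(data, dict(metadata))

    def get(self, key):
        return self._objects[key]

    def list_keys(self, prefix=""):
        # There are no real directories: "folders" are only a prefix match on keys
        return sorted(k for k in self._objects if k.startswith(prefix))


store = ObjectStore()
store.put("images/2024/cat.jpg", b"\xff\xd8", content_type="image/jpeg")
store.put("images/2024/dog.jpg", b"\xff\xd8", content_type="image/jpeg")
store.put("video/intro.mp4", b"\x00\x00", content_type="video/mp4")

image_keys = store.list_keys("images/")
```

The slashes in the keys look like paths, but the namespace is flat: the store never creates or traverses a directory.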

2 / 5

A storage engineer is tuning a database server's disk configuration:
"The bottleneck isn't throughput: we're not saturating the 2 GB/s bandwidth. The bottleneck is __________: the database is doing 45,000 small random reads per second and the SSD is rated for only 40,000. Each read is tiny, so bandwidth is fine, but the drive can't keep up with the sheer number of operations."
Which metric does the blank refer to?
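
The engineer's reasoning can be checked with quick arithmetic. The passage does not give a read size, so the 4 KiB figure below is an assumption chosen only to represent a "tiny" read.

```python
READ_SIZE_BYTES = 4 * 1024        # assumption: one small random read is 4 KiB
REQUESTED_OPS = 45_000            # reads per second issued by the database
RATED_OPS = 40_000                # reads per second the SSD is rated for
BANDWIDTH_LIMIT = 2 * 10**9       # 2 GB/s, in bytes per second

bytes_per_second = REQUESTED_OPS * READ_SIZE_BYTES     # data actually moved
bandwidth_used = bytes_per_second / BANDWIDTH_LIMIT    # fraction of 2 GB/s in use
ops_used = REQUESTED_OPS / RATED_OPS                   # fraction of rated operations in use

print(f"bandwidth in use: {bandwidth_used:.1%}")
print(f"operation budget in use: {ops_used:.1%}")
```

Under that assumption the drive moves roughly 184 MB/s, under a tenth of the available bandwidth, while the number of operations requested exceeds the rated figure by 12.5%.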

3 / 5

Match the term to its correct definition.

Term: erasure coding

4 / 5

A cloud architect describes a cost-optimisation strategy to the engineering team:
"We've configured tiered storage with a lifecycle policy on our S3 bucket. Objects newer than 30 days stay in the hot tier: standard storage class, immediate access, most expensive per gigabyte. Objects between 30 and 90 days old move to the warm tier: infrequent-access class, cheaper to store. Anything older than 90 days drops to the cold tier: Glacier, very cheap, but retrieval takes minutes to hours."
What is the primary motivation for implementing tiered storage?
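
The thresholds in the architect's description amount to a simple rule on object age. The sketch below models only that rule. `tier_for_age` is a hypothetical helper, not part of any S3 API, and the boundary handling (exactly 30 or 90 days) is an assumption.

```python
def tier_for_age(age_days: int) -> str:
    """Return the storage tier for an object of the given age in days."""
    if age_days < 0:
        raise ValueError("age cannot be negative")
    if age_days < 30:
        return "hot"      # standard storage class
    if age_days <= 90:
        return "warm"     # infrequent-access class
    return "cold"         # archive class, slow retrieval


ages = [3, 45, 400]
tiers = [tier_for_age(a) for a in ages]
```

In a real bucket the storage service applies such rules automatically once the lifecycle policy is configured, so no application code has to move the objects.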

5 / 5

An infrastructure engineer explains a distributed file system deployment:
"We run Ceph with a replication factor of 3: every object is written to 3 different OSDs on 3 different hosts. If one host fails, the cluster re-replicates automatically from the remaining two copies. We also spread our OSDs across two __________ so that a single data-centre power outage can never take out all three replicas of an object."
What term completes the blank?
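
The placement idea in this passage can be sketched as follows. This is a simple rotation, not Ceph's actual CRUSH algorithm, and the grouping is called `site` here purely as a placeholder. All host and site names are made up.

```python
from itertools import cycle


def place_replicas(hosts_by_site: dict, replicas: int = 3) -> list:
    """Pick `replicas` distinct hosts, rotating across sites.

    Returns a list of (site, host) pairs.
    """
    queues = {site: list(hosts) for site, hosts in hosts_by_site.items()}
    if replicas > sum(len(q) for q in queues.values()):
        raise ValueError("not enough hosts for the requested replica count")
    placement = []
    for site in cycle(sorted(queues)):
        if len(placement) == replicas:
            break
        if queues[site]:
            # Each host is used at most once, so replicas never share a host
            placement.append((site, queues[site].pop(0)))
    return placement


def survivors_after_site_loss(placement: list, failed_site: str) -> list:
    """Hosts that still hold a replica when one whole site goes dark."""
    return [host for site, host in placement if site != failed_site]


hosts_by_site = {"site-a": ["host1", "host2"], "site-b": ["host3", "host4"]}
placement = place_replicas(hosts_by_site)
```

With three replicas over two sites, one site necessarily holds two copies, so losing that site leaves a single surviving replica from which the cluster can re-replicate.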