Content-Addressable Storage Vocabulary

Build fluency in the vocabulary of retrieving data by a hash of its own contents instead of an assigned path.

1 / 5

At standup, a dev mentions a storage system where a piece of data is retrieved using a hash computed from its own contents, rather than a file path or an assigned identifier chosen ahead of time. What is this storage approach called?

2 / 5

During a design review, the team relies on content-addressable storage specifically so two identical pieces of data, uploaded independently by different users, are automatically stored only once instead of duplicated. Which capability does this provide?
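
The behavior in this scenario can be sketched in a few lines. This is a minimal illustration, assuming an in-memory dictionary as the backing store and SHA-256 as the hash; the `ContentStore` class and its `put`/`get` methods are hypothetical names, not any particular system's API.

```python
import hashlib


class ContentStore:
    """Toy in-memory store keyed by the SHA-256 digest of each blob (illustrative only)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # The key is computed from the bytes themselves, not chosen by the caller.
        key = hashlib.sha256(data).hexdigest()
        # setdefault leaves the existing entry alone if this key is already stored.
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentStore()
key_a = store.put(b"quarterly-report.pdf bytes")  # uploaded by one user
key_b = store.put(b"quarterly-report.pdf bytes")  # same bytes, uploaded by another user
```

Because both uploads hash to the same key, the second `put` finds the entry already present and the store still holds a single blob.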

3 / 5

In a code review, a dev notices a file-upload feature assigns every uploaded file a freshly generated, random identifier with no relationship to the file's actual contents, so uploading the exact same file twice stores two full, separate copies. What does this represent?

4 / 5

An incident report shows that a file-storage service's disk usage grew far faster than expected. Uploaded files were assigned freshly generated random identifiers unrelated to their contents, so the same popular file, re-uploaded by many different users, was stored as a full, separate copy every time. What practice would prevent this?

5 / 5

During a PR review, a teammate asks why the team derives every stored file's identifier from a hash of its contents instead of just letting the uploading client choose whatever identifier it wants. What is the reasoning?
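
One way to make the trade-off in this question concrete is to look at what a reader can check. When the identifier is derived from the contents, anyone holding the bytes can recompute the hash and compare it to the identifier; an arbitrary client-chosen name gives no such check. The sketch below assumes SHA-256, and the helper names `content_id` and `verify` are hypothetical.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive an identifier from the bytes themselves (SHA-256 is an assumption here)."""
    return hashlib.sha256(data).hexdigest()


def verify(identifier: str, data: bytes) -> bool:
    """Return True only if the data still hashes to the identifier it is stored under."""
    return content_id(data) == identifier


original = b"release-1.4.2.tar.gz bytes"
identifier = content_id(original)

# Any change to the bytes changes the hash, so a mismatch is detectable.
tampered = original + b"!"
```

Here `verify(identifier, original)` holds while `verify(identifier, tampered)` does not, so corrupted or substituted data is detectable on read.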