Practise vocabulary for data lineage, metadata, business glossary, data stewardship, and PII tagging.
1 / 5
A data catalog is described as the 'Google search for your data' because:
Data catalogs (Alation, Atlan, DataHub, OpenMetadata) solve data discoverability: without a catalog, analysts don't know what data exists, where it is, what it means, or whether it's trustworthy. A catalog indexes assets and makes them searchable with rich context.
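To make "indexes assets and makes them searchable" concrete, here is a minimal sketch of a keyword search over asset metadata. It is not how Alation, Atlan, DataHub, or OpenMetadata work internally; the `Asset` fields, the `search` function, and the sample entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """One catalog entry. All field names here are illustrative assumptions."""
    name: str
    location: str
    description: str
    owner: str
    tags: tuple = ()


def search(assets, query):
    """Return assets whose name, description, or tags contain every query term."""
    terms = query.lower().split()
    hits = []
    for asset in assets:
        haystack = " ".join([asset.name, asset.description, *asset.tags]).lower()
        if all(term in haystack for term in terms):
            hits.append(asset)
    return hits


# Hypothetical catalog contents.
CATALOG = [
    Asset("fct_orders", "warehouse.sales.fct_orders",
          "One row per customer order", "sales-data@example.com",
          ("sales", "verified")),
    Asset("dim_users", "warehouse.core.dim_users",
          "One row per registered user", "growth-data@example.com",
          ("users", "pii")),
]

for hit in search(CATALOG, "customer order"):
    print(hit.name, hit.location, hit.owner)
```

Each hit carries its location, description, and owner, which is the "rich context" that lets an analyst judge what the data means and whom to ask about it.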
2 / 5
A business glossary in a data catalog maps:
Business glossary example: 'Active User' may be defined as 'a user who has logged in at least once in the last 30 days and has completed at least one core action', with a link to the fact table and the exact query that implements the definition. Without a shared entry like this, different teams end up using different definitions of the same term.
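A glossary entry of this kind could be sketched as a small record that keeps the wording, the source table, and the implementing logic together. Everything named below is an assumption for illustration: the `GlossaryTerm` class, the table `analytics.fct_user_activity`, its column names, and the SQL dialect. The sketch also assumes the 30-day window applies only to the login, since the definition does not say when the core action must occur.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class GlossaryTerm:
    """A business term linked to its technical implementation (hypothetical shape)."""
    name: str
    definition: str
    source_table: str
    query: str


ACTIVE_USER = GlossaryTerm(
    name="Active User",
    definition=("A user who has logged in at least once in the last 30 days "
                "and has completed at least one core action."),
    source_table="analytics.fct_user_activity",  # hypothetical table
    query=(
        "SELECT user_id FROM analytics.fct_user_activity "
        "WHERE last_login_date >= CURRENT_DATE - INTERVAL '30' DAY "
        "AND core_action_count >= 1"
    ),  # illustrative SQL; column names and dialect are assumptions
)


def is_active_user(last_login, core_action_count, today):
    """Apply the glossary definition to one user's activity."""
    logged_in_recently = timedelta(0) <= (today - last_login) <= timedelta(days=30)
    return logged_in_recently and core_action_count >= 1


today = date(2024, 3, 31)
print(is_active_user(date(2024, 3, 15), 2, today))  # recent login, has core action
print(is_active_user(date(2024, 1, 1), 5, today))   # login too old
```

Keeping the query next to the definition means a team that reports on active users can reuse the same logic instead of rewriting it from memory.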
3 / 5
Data stewardship refers to:
Data stewards are typically business domain experts (not engineers) who ensure their domain's data is accurate and well-defined. They answer: 'What does this field mean?', 'Who is responsible for its quality?', 'When should this data be deleted?' — bridging business and technical teams.
4 / 5
PII tagging in a data catalog means:
PII tagging drives automated governance. Once a column is tagged, policies can restrict access to authorised users, trigger masking in non-production environments, and include the column in automated deletion workflows for GDPR erasure requests. Engineers no longer have to remember which columns contain personal data.
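The idea that tags, not hand-maintained column lists, drive masking and erasure can be sketched as follows. The `COLUMN_TAGS` mapping, the table and column names, and the choice to null out PII columns (rather than delete whole rows) are all assumptions made for illustration; a real catalog would expose tags through its own API.

```python
# Hypothetical tag store: fully qualified column name -> set of tags.
COLUMN_TAGS = {
    "users.email": {"pii"},
    "users.full_name": {"pii"},
    "users.signup_date": set(),
    "users.plan": set(),
}

MASK = "***MASKED***"


def pii_columns(table):
    """List the columns of `table` that carry the 'pii' tag."""
    prefix = table + "."
    return sorted(
        column[len(prefix):]
        for column, tags in COLUMN_TAGS.items()
        if column.startswith(prefix) and "pii" in tags
    )


def mask_row(table, row):
    """Replace PII values with a fixed mask, e.g. for non-production copies."""
    pii = set(pii_columns(table))
    return {key: (MASK if key in pii else value) for key, value in row.items()}


def erasure_statement(table, key_column):
    """Build a parameterised UPDATE that nulls every PII column for one subject.

    Nulling columns is only one possible erasure strategy (an assumption here).
    """
    assignments = ", ".join(f"{column} = NULL" for column in pii_columns(table))
    return f"UPDATE {table} SET {assignments} WHERE {key_column} = %s"


print(mask_row("users", {"email": "ana@example.com", "plan": "pro"}))
print(erasure_statement("users", "user_id"))
```

If a new personal-data column is added later, tagging it is enough for both functions to pick it up, which is the point of driving governance from tags.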
5 / 5
Data discoverability in the context of a data catalog means:
Poor discoverability creates shadow analytics: analysts build their own spreadsheets because they can't find the official data. A catalog with good metadata, descriptions, popularity signals, and verified owner contact information makes the official data the path of least resistance.
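One way to picture how these signals make official data "the path of least resistance" is a ranking that puts verified, documented, widely used assets first. This is a sketch under assumed field names (`owner_verified`, `description`, `queries_last_30d`) and an assumed ordering of priorities, not the ranking any particular catalog uses.

```python
def rank_assets(assets):
    """Order assets so trustworthy, documented, popular ones come first."""
    def score(asset):
        return (
            asset["owner_verified"],      # verified ownership outranks everything
            bool(asset["description"]),   # then: is the asset documented at all?
            asset["queries_last_30d"],    # then: popularity signal
        )
    return sorted(assets, key=score, reverse=True)


# Hypothetical search results for "orders".
results = [
    {"name": "tmp_orders_copy", "description": "",
     "owner_verified": False, "queries_last_30d": 40},
    {"name": "fct_orders", "description": "One row per order",
     "owner_verified": True, "queries_last_30d": 25},
    {"name": "fct_orders_v1", "description": "Deprecated order table",
     "owner_verified": True, "queries_last_30d": 3},
]

for asset in rank_assets(results):
    print(asset["name"])
```

Here the undocumented copy is queried most often but still ranks last, so an analyst searching for orders meets the official table before the shadow one.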