Database Schema Design Language Exercises
Entity-relationship modelling, normalisation, schema patterns, index design, and schema review vocabulary for database design conversations.
- Entity-Relationship Modelling Vocabulary
- Normalisation & Denormalisation Language
- Schema Design Patterns Vocabulary
- Schema Review Language
- Indexing Strategy Discussion Language
- Data Modelling Interview Language
Frequently Asked Questions
What is database normalisation and why does it matter in schema design?
Normalisation is the process of organising a relational database to reduce data redundancy and improve data integrity. It proceeds through normal forms (1NF, 2NF, 3NF, BCNF), each removing a further class of problematic structure: repeating groups, partial dependencies, transitive dependencies, and determinants that are not candidate keys. Removing these prevents insertion, update, and deletion anomalies. A well-normalised schema stores every fact in exactly one place, which keeps updates consistent and queries predictable. Engineers discuss normalisation trade-offs when balancing write consistency against read performance.
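As a rough illustration of "every fact is stored in exactly one place", the sketch below uses Python's built-in sqlite3 module with an in-memory database. The customers and orders tables and their columns are hypothetical names chosen for the example, not part of any particular schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical normalised schema: each customer's email is stored once.
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        email       TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        total_cents INTEGER NOT NULL
    );
    INSERT INTO customers VALUES (1, 'ana@example.com');
    INSERT INTO orders VALUES (10, 1, 2500), (11, 1, 4000);
""")

# A single UPDATE changes the fact for every order that refers to it.
conn.execute(
    "UPDATE customers SET email = ? WHERE customer_id = ?",
    ("ana@new.example", 1),
)

# Collect the email seen through each order; there should be only one value.
emails_seen = {
    row[0]
    for row in conn.execute(
        "SELECT c.email FROM orders o JOIN customers c USING (customer_id)"
    )
}
print(emails_seen)
```

Had the email been repeated on every order row, the same change would need one update per row, and a missed row would leave the data inconsistent (an update anomaly).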
What vocabulary do engineers use when discussing Entity-Relationship (ER) diagrams?
Common ER vocabulary includes entities (the objects being modelled), attributes (their properties), and relationships (associations between entities). Cardinality terms — one-to-one, one-to-many, many-to-many — describe how instances relate. Engineers also use "primary key", "foreign key", "composite key", "identifying relationship", and "weak entity". In conversations, phrases like "the order entity has a many-to-one relationship with customer" or "we model this as an associative entity to capture the join attributes" are standard.
How do engineers talk about indexing strategy in schema design reviews?
Indexing discussions centre on selectivity, cardinality, and query access patterns. Engineers ask "what columns appear in WHERE clauses and JOIN conditions?" and consider B-tree indexes for range queries, hash indexes for equality lookups, and composite indexes for multi-column predicates. Key terms include covering index, partial index, index scan vs. sequential scan, and write amplification. A common phrase is "we need a composite index on (tenant_id, created_at DESC) to support this time-range query efficiently."
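One way to check whether a composite index like the one quoted above is actually used is to ask the database for its query plan. The sketch below assumes SQLite via Python's sqlite3 module; the events table and the index name idx_events_tenant_created are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical multi-tenant table.
    CREATE TABLE events (
        id         INTEGER PRIMARY KEY,
        tenant_id  INTEGER NOT NULL,
        created_at TEXT    NOT NULL,
        payload    TEXT
    );
    -- Equality column first, then the range/sort column.
    CREATE INDEX idx_events_tenant_created
        ON events (tenant_id, created_at DESC);
""")

query = """
    SELECT id, created_at
    FROM events
    WHERE tenant_id = ? AND created_at >= ?
    ORDER BY created_at DESC
"""

# EXPLAIN QUERY PLAN returns one row per plan step; the last field is the description.
plan_rows = conn.execute("EXPLAIN QUERY PLAN " + query, (42, "2024-01-01")).fetchall()
plan_text = " | ".join(row[-1] for row in plan_rows)
print(plan_text)
```

The exact wording of the plan differs between SQLite versions, so the useful signal is whether the index name appears in it rather than a full table scan. Other engines expose the same idea through their own EXPLAIN commands.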
What is denormalisation and when is it appropriate?
Denormalisation intentionally introduces redundancy into a schema to improve read performance. It is appropriate when normalised queries require expensive joins across large tables and read load greatly exceeds write load. Techniques include storing pre-aggregated counts, duplicating frequently joined columns, or materialising views. Engineers signal the trade-off with phrases like "we're denormalising the user_email into the orders table to avoid a join on every checkout page load" and document the reasoning in schema review notes.
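The cost of the user_email example above is that the copy has to be kept in sync. A minimal sketch, assuming SQLite via Python's sqlite3 module and hypothetical users and orders tables; change_email is an illustrative helper, not an established API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        email   TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL REFERENCES users(user_id),
        user_email TEXT NOT NULL  -- denormalised copy of users.email
    );
    INSERT INTO users  VALUES (1, 'ana@example.com');
    INSERT INTO orders VALUES (100, 1, 'ana@example.com');
""")

def change_email(conn, user_id, new_email):
    """Update the source of truth and the redundant copy in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "UPDATE users SET email = ? WHERE user_id = ?", (new_email, user_id)
        )
        conn.execute(
            "UPDATE orders SET user_email = ? WHERE user_id = ?", (new_email, user_id)
        )

change_email(conn, 1, "ana@new.example")

# Read path: the checkout query needs no join.
checkout_email = conn.execute(
    "SELECT user_email FROM orders WHERE order_id = ?", (100,)
).fetchone()[0]
print(checkout_email)
```

Whether the copy should follow later changes at all is a design decision: some teams deliberately keep the email as it was when the order was placed, in which case the second UPDATE is omitted and the column is documented as a snapshot.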
What language is used for schema migration conversations?
Schema migration vocabulary covers migration scripts, up/down migrations, zero-downtime migrations, and the expand-contract pattern. Engineers discuss "additive migrations" (adding a nullable column) versus "breaking migrations" (renaming or dropping a column). Phrases like "we'll run this behind a feature flag", "backfill the data in batches", and "coordinate the deploy with the migration rollout" are standard. Tools referenced include Flyway, Liquibase, Alembic, and Rails migrations.
How do you explain the difference between relational and NoSQL schema design?
Relational schemas enforce structure upfront via DDL, normalise data into tables, and use SQL for querying with strong consistency guarantees. NoSQL schema design is often schema-on-read: document, key-value, and wide-column stores allow heterogeneous records, and the application interprets their structure when it reads them. Engineers explain: "in a document store we embed the order lines directly in the order document to optimise for the read path, whereas in a relational model we normalise them into a separate table." The choice depends on access patterns, consistency requirements, and scale.
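The embedded-versus-normalised contrast in that quote can be sketched side by side. Here a plain Python dict stands in for a document-store record and SQLite (via the sqlite3 module) stands in for the relational model; all table, field, and SKU names are hypothetical.

```python
import sqlite3

# Document style: order lines are embedded, so one read returns the whole order.
order_doc = {
    "order_id": 100,
    "customer": "ana@example.com",
    "lines": [
        {"sku": "A1", "qty": 2, "unit_cents": 500},
        {"sku": "B7", "qty": 1, "unit_cents": 1500},
    ],
}
doc_total = sum(line["qty"] * line["unit_cents"] for line in order_doc["lines"])

# Relational style: lines live in their own table, declared upfront via DDL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT NOT NULL);
    CREATE TABLE order_lines (
        order_id   INTEGER NOT NULL REFERENCES orders(order_id),
        sku        TEXT    NOT NULL,
        qty        INTEGER NOT NULL,
        unit_cents INTEGER NOT NULL,
        PRIMARY KEY (order_id, sku)
    );
""")
with conn:
    conn.execute(
        "INSERT INTO orders VALUES (?, ?)",
        (order_doc["order_id"], order_doc["customer"]),
    )
    conn.executemany(
        "INSERT INTO order_lines VALUES (?, ?, ?, ?)",
        [
            (order_doc["order_id"], line["sku"], line["qty"], line["unit_cents"])
            for line in order_doc["lines"]
        ],
    )

# The same total, computed by the database over the normalised rows.
sql_total = conn.execute(
    "SELECT SUM(qty * unit_cents) FROM order_lines WHERE order_id = ?", (100,)
).fetchone()[0]
print(doc_total, sql_total)
```

Both shapes hold the same facts. The document form favours reading a whole order at once, while the relational form makes cross-order questions (for example, total quantity sold per SKU) a single query.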
What does "cardinality" mean in database indexing vs. ER modelling?
In ER modelling, cardinality describes the count relationship between entity instances: one-to-one, one-to-many, or many-to-many. In indexing, cardinality refers to the number of distinct values in an indexed column: a high-cardinality column (e.g. user_id) has many unique values and is a good index candidate, while a low-cardinality column (e.g. a boolean status flag) provides little selectivity and may not benefit from a standard B-tree index.
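Index cardinality in this sense can be measured directly. The sketch below, assuming SQLite via Python's sqlite3 module, compares the ratio of distinct values to rows for two columns of a hypothetical logins table; distinct_ratio is an illustrative helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logins (user_id INTEGER NOT NULL, is_active INTEGER NOT NULL)"
)
# 1000 rows: user_id is unique per row, is_active alternates between 0 and 1.
conn.executemany(
    "INSERT INTO logins VALUES (?, ?)",
    [(user_id, user_id % 2) for user_id in range(1, 1001)],
)

def distinct_ratio(conn, table, column):
    """Distinct values divided by row count; 1.0 means every value is unique."""
    # Identifiers cannot be bound as parameters, so only pass trusted names here.
    distinct, total = conn.execute(
        f"SELECT COUNT(DISTINCT {column}), COUNT(*) FROM {table}"
    ).fetchone()
    return distinct / total

user_id_ratio = distinct_ratio(conn, "logins", "user_id")      # 1000 distinct of 1000 rows
is_active_ratio = distinct_ratio(conn, "logins", "is_active")  # 2 distinct of 1000 rows
print(user_id_ratio, is_active_ratio)
```

A ratio near 1.0 suggests an index on that column will narrow a lookup to very few rows; a ratio near 0 suggests each value matches a large share of the table.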
What is a composite primary key and when should you use one?
A composite primary key uses two or more columns together to uniquely identify a row. It is natural for junction tables in many-to-many relationships — for example, a user_roles table with (user_id, role_id) as the composite key. Engineers use composite keys when no single column is a sufficient natural identifier and adding a surrogate key would be redundant. The trade-off is that foreign keys referencing this table must include all component columns, increasing join complexity.
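The user_roles example can be sketched as follows, assuming SQLite via Python's sqlite3 module. The users and roles tables and the sample rows are hypothetical additions to make the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE roles (role_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE user_roles (
        user_id INTEGER NOT NULL REFERENCES users(user_id),
        role_id INTEGER NOT NULL REFERENCES roles(role_id),
        PRIMARY KEY (user_id, role_id)  -- the pair identifies the row
    );
    INSERT INTO users VALUES (1, 'ana');
    INSERT INTO roles VALUES (1, 'admin'), (2, 'editor');
    INSERT INTO user_roles VALUES (1, 1), (1, 2);
""")

# Assigning the same role to the same user twice violates the composite key.
try:
    conn.execute("INSERT INTO user_roles VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

role_count = conn.execute(
    "SELECT COUNT(*) FROM user_roles WHERE user_id = 1"
).fetchone()[0]
print(duplicate_rejected, role_count)
```

The same user may hold several roles and the same role may belong to several users, but each (user_id, role_id) pair can appear only once, which is exactly the rule a many-to-many junction table needs.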
How do engineers describe referential integrity and constraint enforcement?
Referential integrity means every non-null foreign key value in a child table must match an existing primary key (or other unique key) value in the parent table. Engineers enforce it with FOREIGN KEY constraints and discuss ON DELETE CASCADE, ON DELETE RESTRICT, and ON DELETE SET NULL behaviours. In schema reviews, phrases like "we rely on the database to enforce referential integrity rather than handling it in application code" signal a preference for constraint-based correctness over application-level checks.
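To see these behaviours in action, the sketch below uses SQLite via Python's sqlite3 module. The customers, orders, and order_notes tables are hypothetical, and rejected is an illustrative helper. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON; most other engines enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
            REFERENCES customers(customer_id) ON DELETE RESTRICT
    );
    CREATE TABLE order_notes (
        note_id  INTEGER PRIMARY KEY,
        order_id INTEGER NOT NULL
            REFERENCES orders(order_id) ON DELETE CASCADE
    );
    INSERT INTO customers VALUES (1);
    INSERT INTO orders VALUES (10, 1);
    INSERT INTO order_notes VALUES (100, 10);
""")

def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        with conn:  # rolls back automatically when the statement fails
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# No customer 999 exists, so the child row cannot be created.
orphan_insert_rejected = rejected("INSERT INTO orders VALUES (11, 999)")
# RESTRICT: the customer still has an order, so the parent cannot be deleted.
parent_delete_rejected = rejected("DELETE FROM customers WHERE customer_id = 1")

# CASCADE: deleting the order removes its notes as well.
with conn:
    conn.execute("DELETE FROM orders WHERE order_id = 10")
notes_left = conn.execute("SELECT COUNT(*) FROM order_notes").fetchone()[0]
print(orphan_insert_rejected, parent_delete_rejected, notes_left)
```

The application code contains no checks of its own here; every rule is stated once in the schema and applies to any client that writes to the database.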
What is the expand-contract pattern in zero-downtime schema migrations?
The expand-contract (also called parallel-change) pattern splits a breaking schema change into three phases. Expand: add the new column or table alongside the old one and update application code to write to both. Migrate: backfill existing data into the new structure. Contract: remove the old column or table once all code reads only from the new structure. Because each step is compatible with the code running at that moment, the pattern allows continuous deployment without downtime or long-held table locks, and it is standard language in engineering discussions about large-scale schema evolution.
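The three phases can be walked through on a toy example: renaming a column by adding its replacement, backfilling in batches, and then dropping the original. This is a sketch assuming SQLite via Python's sqlite3 module; the users table, the full_name and display_name columns, and the create_user and backfill helpers are all hypothetical. In a real rollout each phase would ship as a separate deploy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, full_name TEXT);
    INSERT INTO users (full_name) VALUES ('Ana'), ('Ben'), ('Caz'), ('Dev'), ('Eli');
""")

# Expand: an additive change, a nullable column alongside the old one.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")

def create_user(conn, name):
    """During the transition, new writes populate both columns."""
    with conn:
        conn.execute(
            "INSERT INTO users (full_name, display_name) VALUES (?, ?)", (name, name)
        )

def backfill(conn, batch_size=2):
    """Migrate: copy old values across in small batches; returns rows updated."""
    total = 0
    while True:
        with conn:  # one short transaction per batch
            cur = conn.execute(
                """
                UPDATE users SET display_name = full_name
                WHERE user_id IN (
                    SELECT user_id FROM users
                    WHERE display_name IS NULL AND full_name IS NOT NULL
                    LIMIT ?
                )
                """,
                (batch_size,),
            )
        if cur.rowcount == 0:
            return total
        total += cur.rowcount

create_user(conn, "Fay")
backfilled = backfill(conn)

remaining = conn.execute(
    "SELECT COUNT(*) FROM users WHERE display_name IS NULL AND full_name IS NOT NULL"
).fetchone()[0]

# Contract: only once every reader uses display_name and nothing is left to copy.
# ALTER TABLE ... DROP COLUMN needs SQLite 3.35 or newer, hence the version guard.
if remaining == 0 and sqlite3.sqlite_version_info >= (3, 35, 0):
    conn.execute("ALTER TABLE users DROP COLUMN full_name")
print(backfilled, remaining)
```

Small batches keep each transaction short, so other writers are not blocked for long, and the dual write in create_user means rows created during the backfill never need to be revisited.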