Database architecture comparison

Relational vs Document Database

SQL or NoSQL? Normalise or embed? Choosing between a relational and a document database is one of the earliest — and most consequential — decisions in any new project. This guide explains the architectural trade-offs in plain English, with the vocabulary and phrases you will hear in real engineering discussions.

TL;DR

  • Relational databases (PostgreSQL, MySQL, SQLite) store data in tables with a fixed schema. Relationships are expressed with foreign keys and queried with JOINs. Full ACID transactions are guaranteed.
  • Document databases (MongoDB, Firestore, CouchDB) store self-contained documents (JSON/BSON) inside collections. Schema is flexible. Related data is often embedded rather than joined. Distributed clusters follow BASE principles by default.
  • Neither is universally superior. Relational databases win on complex queries and data integrity; document databases win on schema flexibility and horizontal scaling for certain access patterns.

Side-by-side comparison

AspectRelational (SQL)Document (NoSQL)
Data unitRow in a tableDocument in a collection
SchemaFixed, enforced at write time (DDL)Flexible — optional validation only
RelationshipsForeign keys + JOINsEmbedding or application-level references
Query languageSQL — declarative, standardisedDB-specific API (MQL, GQL, etc.)
TransactionsFull ACID across many tablesACID on single document; multi-doc ACID available (opt-in)
Consistency modelStrong (ACID)Eventual by default in distributed clusters (BASE)
NormalisationEncouraged — reduce duplication via normal formsDenormalisation common — duplicate data for read speed
Horizontal scalingPossible but complex (sharding, partitioning)Built-in — designed for sharded clusters from the start
Query flexibilityVery high — ad-hoc JOINs across any tablesBest when queries match the document shape
Typical use casesFinance, ERP, e-commerce, reportingContent management, user profiles, catalogues, real-time apps

What is a relational database?

A relational database organises data into tables (also called relations). Each table has named columns with defined data types, and every row must conform to that structure. Relationships between tables are expressed through foreign keys — a column whose value matches the primary key in another table.

The query language is SQL (Structured Query Language), which allows you to combine data from multiple tables using JOINs, filter with WHERE clauses, and aggregate with GROUP BY. Because the schema is known in advance, the database engine can optimise queries, enforce constraints (NOT NULL, UNIQUE, CHECK), and guarantee ACID properties across all writes — even those touching multiple tables in a single transaction.

Normalisation is the practice of splitting data across multiple tables to remove duplication and keep each piece of information in one place. A fully normalised schema has no redundancy but requires JOINs to reassemble related data at query time.

What is a document database?

A document database stores records as documents — typically JSON or BSON objects — inside named collections. Unlike a table row, a document can contain nested objects and arrays, so related data can live together in a single record rather than being split across tables.

This embedding strategy is the key architectural difference. Instead of a users table and a separate addresses table linked by a foreign key, a document database might store the address directly inside the user document. A single read retrieves everything you need with no JOIN required.

The trade-off is denormalisation: the same piece of data may appear in many documents. If a product name changes, you may need to update it in thousands of documents rather than once in a single row. Schema validation is optional, which speeds up development early on but requires discipline as the application grows.

ACID vs BASE — the consistency trade-off

ACID (Atomicity, Consistency, Isolation, Durability) describes the guarantees that relational databases provide. A financial transfer that debits one account and credits another either completes fully or not at all — there is no intermediate state where money is missing from both accounts.

BASE (Basically Available, Soft state, Eventually consistent) describes the approach taken by many distributed document databases. In a cluster spread across multiple data centres, a write to one node may not immediately be visible on another. Most reads will return the correct data most of the time, but there is a brief window where a secondary replica may serve stale data.

This is often acceptable for social feeds or product catalogues, but dangerous for financial ledgers where you need a precise, consistent balance at all times. Modern document databases have narrowed this gap — MongoDB has supported multi-document ACID transactions since version 4.0 — but the default configuration still favours availability and performance over strict consistency.

Schema design side-by-side

Storing a blog post with its author and comments:

Relational — normalised tables

{`-- Three separate tables, linked by foreign keys
CREATE TABLE users (
  id   SERIAL PRIMARY KEY,
  name TEXT NOT NULL
);

CREATE TABLE posts (
  id        SERIAL PRIMARY KEY,
  author_id INT REFERENCES users(id),
  title     TEXT,
  body      TEXT
);

CREATE TABLE comments (
  id      SERIAL PRIMARY KEY,
  post_id INT REFERENCES posts(id),
  text    TEXT
);

-- Reassemble with JOINs at query time
SELECT p.title, u.name, c.text
FROM posts p
JOIN users    u ON u.id     = p.author_id
JOIN comments c ON c.post_id = p.id
WHERE p.id = 42;`}

Document — embedded structure

{`// One document in the "posts" collection
{
  "_id": "post_42",
  "title": "Intro to indexing",
  "body": "...",
  "author": {
    "id": "user_7",
    "name": "Alice"
  },
  "comments": [
    { "text": "Great article!", "by": "Bob"   },
    { "text": "Very helpful.",  "by": "Carol" }
  ]
}

// Single read — no JOIN needed
db.posts.findOne({ _id: "post_42" })`}

How engineers talk about relational vs document databases

These phrases come up in architecture reviews, stand-ups, and technical interviews. Understanding them will help you follow — and contribute to — database discussions in English.

Relational / SQL phrases

  • "We need to run a migration to add the new column — ALTER TABLE blocks writes briefly."
  • "This query is doing a full table scan — we should add an index on created_at."
  • "We normalise the schema to avoid data anomalies when we update a record."
  • "We wrap both inserts in a transaction so they succeed or fail atomically."
  • "The foreign key constraint prevents orphaned orders if a customer is deleted."

Document / NoSQL phrases

  • "We embed the address in the user document — it's always fetched together anyway."
  • "The collection has no fixed schema, so adding new fields doesn't require a migration."
  • "We're hitting eventual consistency — reads from the secondary can be slightly stale."
  • "The document is denormalised — we duplicate the product name for faster reads."
  • "We use a $lookup aggregation stage — it's basically a JOIN, but it's slower."
  • "The query flexibility of SQL is hard to beat when requirements change — you can JOIN any two tables without schema changes."
  • "Our access pattern is always 'fetch the whole order with its items' — embedding makes more sense than normalising here."
  • "We need horizontal scaling — Mongo shards across nodes more easily than Postgres out of the box."

Decision guide

  • Data has clear, stable relationships (orders, invoices, inventory) → Relational
  • Complex ad-hoc reporting with many JOINs → Relational
  • Financial, medical, or legal data requiring strict ACID guarantees → Relational
  • Schema changes frequently or varies per record (product catalogue, CMS) → Document
  • Data naturally lives together and is always read as a unit (user profile + preferences) → Document
  • Write-heavy workload that must scale horizontally across many nodes → Document
  • Rapid prototyping where the schema is still evolving → Document
  • Team already knows SQL and the dataset fits comfortably on one server → Relational
  • Mixed requirements → Both — many production systems use PostgreSQL for transactional data and MongoDB (or Firestore) for flexible, denormalised content

Frequently asked questions

What is the fundamental difference between a relational and a document database?

A relational database stores data in tables with fixed columns and enforces relationships through foreign keys and JOIN operations. A document database stores self-contained JSON (or BSON) documents inside collections, with no fixed schema required. The relational model prioritises integrity and consistency; the document model prioritises flexibility and developer speed.

What does ACID mean, and which databases guarantee it?

ACID stands for Atomicity (a transaction is all-or-nothing), Consistency (the database moves from one valid state to another), Isolation (concurrent transactions do not interfere), and Durability (committed data survives a crash). Traditional relational databases such as PostgreSQL and MySQL provide full ACID guarantees. Most document databases — MongoDB included — offer ACID on single documents by default and multi-document ACID transactions as an opt-in feature added in later versions.

When should I choose a relational database over a document database?

Choose a relational database when your data has clear, stable relationships (orders, order items, products, customers), when you need complex multi-table queries with JOINs, when data integrity and referential constraints are non-negotiable (finance, healthcare, e-commerce inventory), or when your team is comfortable writing SQL.