English for Yjs CRDT Developers
Learn the English vocabulary for Yjs: conflict-free replicated data types, shared documents, and explaining real-time collaboration to a team.
Yjs powers real-time collaborative editing, such as shared documents where several people type at once. Its vocabulary comes from CRDT theory, a field most engineers have not worked with before, so it pays to explain these terms precisely to teammates.
Key Vocabulary
CRDT (Conflict-free Replicated Data Type): a data structure designed so that concurrent edits from different clients always merge automatically into the same result on every replica, with no central authority needed to arbitrate conflicts. “We don’t need a locking mechanism here. A CRDT guarantees that even if two people edit the same paragraph simultaneously, both sets of changes merge deterministically.”
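To make "merge deterministically" concrete, here is a minimal sketch of one of the simplest CRDTs, a grow-only counter. It is a toy, not how Yjs represents text internally, and the names `GCounter`, `increment`, `merge`, and `value` are invented for this illustration. The point is that merging gives the same answer in either order, so no lock or arbiter is needed.

```typescript
// Toy grow-only counter: each client only ever increases its own slot.
// Illustrative only; Yjs shared types use a more involved sequence CRDT.
type GCounter = Record<string, number>;

function increment(counter: GCounter, clientId: string): GCounter {
  return { ...counter, [clientId]: (counter[clientId] ?? 0) + 1 };
}

// Merge keeps the larger count per client, so the result does not depend
// on merge order or on how many times the same state is merged.
function merge(a: GCounter, b: GCounter): GCounter {
  const merged: GCounter = { ...a };
  for (const clientId of Object.keys(b)) {
    merged[clientId] = Math.max(merged[clientId] ?? 0, b[clientId] ?? 0);
  }
  return merged;
}

function value(counter: GCounter): number {
  return Object.keys(counter).reduce((sum, id) => sum + (counter[id] ?? 0), 0);
}

// Two replicas change concurrently, then exchange state.
const alice = increment(increment({}, "alice"), "alice"); // alice counted 2
const bob = increment({}, "bob"); // bob counted 1

console.log(value(merge(alice, bob))); // 3
console.log(value(merge(bob, alice))); // 3
```

When explaining this to a teammate, a useful sentence is: "Merging is commutative and idempotent, so every replica converges no matter what order the changes arrive in."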
Shared document / shared type: a Yjs data structure (such as a shared text, array, or map) that multiple clients can read and mutate concurrently. Each client works on its own local replica, and changes are merged automatically once they reach the other clients. “We never overwrite each other’s copy of the text. Each client edits its local replica of the shared document, and Yjs merges the changes as they sync.”
Awareness / presence: the ephemeral state shared between collaborators, such as cursor positions or who is currently viewing a document. It is not persisted and is kept separate from the document content. “Cursor colors and user names don’t need to be part of the document history. That’s awareness state, which is transient and doesn’t get persisted like the actual content does.”
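The separation between presence and content can be sketched as follows. This is a toy model, not the Yjs awareness API: `Presence`, `PresenceRegistry`, `snapshot`, and the 30-second timeout are all assumptions chosen for illustration. It shows presence expiring on its own while only the document text is saved.

```typescript
// Hypothetical shapes for this sketch; not the Yjs awareness API.
interface Presence {
  name: string;
  cursor: number;
  lastSeen: number; // timestamp in milliseconds
}

class PresenceRegistry {
  private states: Record<string, Presence> = {};

  update(clientId: string, name: string, cursor: number, now: number): void {
    this.states[clientId] = { name, cursor, lastSeen: now };
  }

  // Drop collaborators who have not reported within the timeout.
  prune(now: number, timeoutMs: number): void {
    for (const clientId of Object.keys(this.states)) {
      if (now - this.states[clientId].lastSeen > timeoutMs) {
        delete this.states[clientId];
      }
    }
  }

  names(): string[] {
    return Object.keys(this.states).map((id) => this.states[id].name);
  }
}

// Only document content is saved; presence is never part of the snapshot.
function snapshot(documentText: string): string {
  return JSON.stringify({ text: documentText });
}

const presence = new PresenceRegistry();
presence.update("c1", "Ana", 4, 0);
presence.update("c2", "Ben", 9, 20000);
presence.prune(40000, 30000); // Ana was last seen 40 s ago and is dropped

console.log(presence.names()); // ["Ben"]
console.log(snapshot("Hello")); // {"text":"Hello"}
```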
Provider: the plugin that connects a Yjs document to a specific network transport (such as WebSocket or WebRTC) or storage backend (such as IndexedDB), decoupling the CRDT logic from how updates are transmitted or stored. “Switching from WebSocket sync to peer-to-peer sync doesn’t require touching our document logic at all. We just swap the provider.”
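The decoupling described here can be sketched with a small interface. Everything below is hypothetical: `SyncProvider`, `NoteDocument`, and `InMemoryProvider` are invented names, real Yjs providers have their own APIs, and the "document" is a plain list of lines rather than a CRDT. The sketch only shows that document logic depends on an interface, so the transport behind it can be swapped.

```typescript
// Hypothetical interface for this sketch; real Yjs providers differ.
interface SyncProvider {
  send(update: string): void;
  onReceive(handler: (update: string) => void): void;
}

// Document logic talks only to the interface, never to a concrete transport.
class NoteDocument {
  lines: string[] = [];
  private provider: SyncProvider;

  constructor(provider: SyncProvider) {
    this.provider = provider;
    provider.onReceive((line) => {
      this.lines.push(line);
    });
  }

  addLine(line: string): void {
    this.lines.push(line);
    this.provider.send(line);
  }
}

// One possible transport: an in-memory pair standing in for WebSocket or WebRTC.
class InMemoryProvider implements SyncProvider {
  peer?: InMemoryProvider;
  private handler: (update: string) => void = () => {};

  send(update: string): void {
    this.peer?.handler(update);
  }

  onReceive(handler: (update: string) => void): void {
    this.handler = handler;
  }
}

const a = new InMemoryProvider();
const b = new InMemoryProvider();
a.peer = b;
b.peer = a;

const docA = new NoteDocument(a);
const docB = new NoteDocument(b);
docA.addLine("hello");

console.log(docB.lines); // ["hello"]
```

Replacing `InMemoryProvider` with another class that implements `SyncProvider` would leave `NoteDocument` unchanged, which is the property the vocabulary entry describes.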
Update / delta encoding: a compact binary representation of one or more changes to a shared document, small enough to transmit efficiently so that the entire document state is not sent on every edit. “We’re not resending the whole document on every keystroke. Yjs sends just the update, a small delta encoding the specific change that happened.”
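The idea of sending only what the other side is missing can be sketched with a toy model. This is not the Yjs update format, which is binary and more involved; `Op`, `StateVector`, `stateVectorOf`, and `missingOps` are assumptions made for this example. Yjs has a related concept that it also calls a state vector, but the code below is only a simplified illustration of the principle.

```typescript
// Hypothetical toy model of incremental sync; not the Yjs wire format.
interface Op {
  clientId: string;
  clock: number; // per-client sequence number, starting at 0
  text: string;
}

// clientId -> how many ops from that client have already been seen
type StateVector = Record<string, number>;

function stateVectorOf(ops: Op[]): StateVector {
  const sv: StateVector = {};
  for (const op of ops) {
    sv[op.clientId] = Math.max(sv[op.clientId] ?? 0, op.clock + 1);
  }
  return sv;
}

// Return only the ops the remote side has not seen yet.
function missingOps(localOps: Op[], remote: StateVector): Op[] {
  return localOps.filter((op) => op.clock >= (remote[op.clientId] ?? 0));
}

const serverOps: Op[] = [
  { clientId: "alice", clock: 0, text: "H" },
  { clientId: "alice", clock: 1, text: "i" },
  { clientId: "bob", clock: 0, text: "!" },
];

// The client already has both of alice's ops but none of bob's.
const clientHas = stateVectorOf(serverOps.slice(0, 2));
const delta = missingOps(serverOps, clientHas);

console.log(delta.length); // 1
console.log(delta[0].text); // "!"
```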
Common Phrases
- “Does this data structure actually need CRDT merge semantics, or is simple last-write-wins good enough here?”
- “Is this state part of the shared document, or is it ephemeral awareness state that shouldn’t be persisted?”
- “Which provider are we using for sync here — WebSocket, WebRTC, or something else?”
- “Are we sending the full document on every change, or are we relying on delta-encoded updates?”
Example Sentences
Explaining the technology to a new engineer: “This isn’t operational transform like older collaborative editors used — Yjs uses CRDTs, which merge concurrent edits without needing a central server to decide ordering.”
Debugging a presence bug: “Cursor positions aren’t disappearing because of a document sync bug — check the awareness state handling, since that’s a completely separate channel from the document content.”
Discussing an architecture decision: “We can swap our sync transport from WebSocket to WebRTC later without touching the document logic at all — the provider is intentionally decoupled from the CRDT itself.”
Professional Tips
- Explain CRDTs by contrasting them with locking or operational transform. The contrast helps someone unfamiliar with the field see why merges happen without conflicts.
- Keep awareness state clearly separated from document content in both code and conversation — conflating the two leads to unnecessary persistence and sync overhead.
- Treat the provider as a swappable transport detail in architecture discussions — tying document logic to a specific provider creates unnecessary coupling.
- Reference update size and frequency when diagnosing sync performance issues — large or overly frequent updates are a common source of unexpected bandwidth usage.
Practice Exercise
- Explain to a teammate unfamiliar with CRDTs why concurrent edits merge automatically instead of needing a locking mechanism.
- Describe the difference between a shared document and awareness state, with an example of each.
- Write a sentence explaining why swapping a sync provider shouldn’t require changing document logic.