dbt Advanced Patterns: English for Analytics Engineers
Master English vocabulary for advanced dbt patterns — incremental models, snapshots, macros, packages, ref and source functions, and metrics.
Analytics engineers working with dbt spend much of their day discussing data transformations, scheduling strategies, and code reuse patterns. Whether you are reviewing a colleague’s pull request or explaining your incremental logic in a standup, you need precise English to communicate clearly. This guide covers the vocabulary and phrases that experienced dbt practitioners use every day.
Key Vocabulary
Incremental model — a dbt model that only processes new or changed rows rather than rebuilding the entire table on each run, reducing compute cost and run time. “We switched the orders model to incremental because a full refresh was taking forty minutes every night.”
Snapshot — a dbt object that tracks slowly changing dimension data by recording row history over time using a configurable strategy such as timestamp or check. “The customer snapshot lets us see what address was on file at the time of each transaction.”
Macro — a reusable block of Jinja-templated SQL logic defined in the macros/ directory and called inside models or other macros. “I wrote a macro to generate the date spine so every model can reference it consistently.”
ref function — the dbt built-in function used to reference another model by name, which builds the dependency graph and handles environment-specific schema resolution automatically. “Always use ref instead of hard-coding the schema name; otherwise, the dependency graph breaks.”
source function — the dbt built-in function used to reference a raw table declared as a source in a YAML file (conventionally named sources.yml), which enables source freshness checks and lineage tracking. “The pipeline writes to the landing zone, and we declare that table as a source so dbt can monitor its freshness.”
Exposure — a dbt object that documents downstream consumers of your models, such as a dashboard, ML model, or report, making data lineage visible beyond the warehouse. “We added an exposure for the finance dashboard so the team knows which models power it.”
Package — a reusable collection of dbt models, macros, and tests published to the dbt Hub or a Git repository and installed via packages.yml. “We installed dbt_utils from the Hub to get access to the generate_surrogate_key macro.”
Metric — a reusable, version-controlled business definition declared in dbt that standardises how a KPI is calculated across all downstream consumers. “Defining revenue as a metric means every dashboard queries the same logic rather than each analyst writing their own.”
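To make the incremental model entry concrete, here is a minimal sketch of what such a model might look like. The file name, the stg_orders model, and the column names are illustrative assumptions rather than part of any real project, and not every warehouse adapter supports the delete+insert strategy.

```sql
-- models/orders.sql (hypothetical file; all names are assumptions)
{{
    config(
        materialized='incremental',
        unique_key='order_id',
        incremental_strategy='delete+insert'
    )
}}

select
    order_id,
    customer_id,
    order_total,
    updated_at
from {{ ref('stg_orders') }}

{% if is_incremental() %}
  -- On incremental runs, pick up only rows changed since the last build.
  -- The "this" variable resolves to the table this model has already built.
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```

On the first run, or when someone passes the --full-refresh flag, is_incremental() returns false, the filter is skipped, and dbt builds the whole table. That is the behaviour people mean when they say “the model fell back to a full refresh.”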
Useful Phrases
“We need to decide on the incremental strategy — are we using unique_key with delete+insert, or appending only?”
“The snapshot strategy is set to timestamp, so dbt uses the updated_at column to detect changes.”
“I refactored that logic into a macro so the three models that needed it can all call it without duplicating code.”
“Can you add an exposure entry for the board-level KPI dashboard? The lineage graph is incomplete without it.”
“The Git package in packages.yml is pinned to a specific commit, so we won’t get unexpected breaking changes.”
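The phrase about the timestamp strategy is easier to use confidently once you have seen the configuration it describes. The sketch below uses the long-standing SQL block form of a snapshot; recent dbt versions also offer a YAML-based definition. The snapshot name, the snapshots schema, and the crm.customers source are hypothetical.

```sql
{% snapshot customers_snapshot %}

{{
    config(
        target_schema='snapshots',
        unique_key='customer_id',
        strategy='timestamp',
        updated_at='updated_at'
    )
}}

-- Hypothetical source; it must be declared in a sources YAML file.
select * from {{ source('crm', 'customers') }}

{% endsnapshot %}
```

With this configuration, dbt compares updated_at on each run to detect changed rows. By default it records each version's validity window in the dbt_valid_from and dbt_valid_to columns, which is what colleagues usually mean by “the snapshot history.”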
Common Mistakes
Confusing ref with source. Non-native speakers sometimes use these terms interchangeably, but they serve different purposes. Use source only for raw tables that arrive from outside dbt — upstream ingestion pipelines, data feeds, or operational databases. Use ref for every model that dbt itself builds. Mixing them up breaks freshness monitoring and the lineage graph.
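A short sketch may help fix the distinction. It shows two separate model files in one listing; the shop source, the raw_orders table, and both model names are assumptions made up for illustration, and the source would also need to be declared in a YAML file.

```sql
-- models/staging/stg_orders.sql (hypothetical)
-- raw_orders is loaded by an ingestion tool, so it is referenced with source().
select
    id as order_id,
    customer_id,
    amount_cents
from {{ source('shop', 'raw_orders') }}

-- models/marts/customer_orders.sql (hypothetical, a separate file)
-- stg_orders is built by dbt, so it is referenced with ref().
select
    customer_id,
    count(*) as order_count
from {{ ref('stg_orders') }}
group by customer_id
```

In a review you might say: “This staging model should select from the source, and everything downstream should ref the staging model.”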
Saying “run the snapshot” when you mean “execute a full refresh”. In dbt, running a snapshot and running a full refresh are distinct operations. A snapshot run (the dbt snapshot command) preserves history: it inserts a new row for each changed record and closes out the previous version. A full refresh (the --full-refresh flag on dbt run) drops and rebuilds an incremental model’s table from scratch. Colleagues may misunderstand your intent if you use these phrases loosely. Say “run the snapshot job” or “trigger a full refresh of the incremental model” to be precise.
Misusing “macro” to mean “model”. Some engineers new to dbt use “macro” to describe any reusable SQL object. In dbt, a model is a SELECT statement that dbt materialises in the warehouse, most often as a table or view, while a macro is a Jinja function that generates SQL and is not itself materialised. The distinction matters when discussing testing, documentation, and scheduling.
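The difference is easy to see side by side. In this sketch, which again combines two separate files in one listing, the macro only returns a SQL fragment, while the model is the SELECT statement that becomes a table or view. The macro name, the stg_payments model, and the columns are hypothetical.

```sql
-- macros/cents_to_dollars.sql (hypothetical macro)
{% macro cents_to_dollars(column_name, decimal_places=2) %}
    round({{ column_name }} / 100.0, {{ decimal_places }})
{% endmacro %}

-- models/payments.sql (hypothetical model, a separate file)
select
    payment_id,
    {{ cents_to_dollars('amount_cents') }} as amount_dollars
from {{ ref('stg_payments') }}
```

When dbt compiles the model, the macro call is replaced by an expression along the lines of round(amount_cents / 100.0, 2). So you would say “the model calls the macro,” never “the macro runs nightly.”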
Understanding these terms with precision will help you contribute confidently in code reviews, architecture discussions, and documentation sessions. Analytics engineering teams move fast, and using the right vocabulary signals that you understand not just the syntax but the reasoning behind each dbt design decision.