Master LangSmith's observability vocabulary: traces, the @traceable decorator, datasets, evaluators, and projects for systematic LLM evaluation.
1 / 5
During an incident review, the team can't reproduce a bad LLM output. A colleague says LangSmith traces would have captured it. What does a LangSmith trace contain?
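A LangSmith trace is a tree of runs, one per step of the application (for example a chain, a retriever call, or an LLM call). Each run records its inputs, outputs, latency, and any error, and LLM runs also record token usage, so you can see exactly what happened inside a multi-step chain.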
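To make the "tree of runs" idea concrete, here is a minimal local sketch. The `Run` dataclass, its field names, and the helper functions are hypothetical stand-ins written for illustration. They are not the LangSmith SDK's own classes, and the sample values are made up.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Run:
    """Local stand-in for one step in a trace (not the LangSmith SDK class)."""
    name: str
    run_type: str                      # e.g. "chain", "retriever", "llm"
    inputs: dict
    outputs: Optional[dict] = None
    latency_ms: float = 0.0
    total_tokens: int = 0
    error: Optional[str] = None
    children: List["Run"] = field(default_factory=list)


def walk(run: Run, depth: int = 0) -> Iterator[Tuple[int, Run]]:
    """Yield (depth, run) pairs in depth-first order."""
    yield depth, run
    for child in run.children:
        yield from walk(child, depth + 1)


def total_tokens(root: Run) -> int:
    """Sum token usage over every run in the tree."""
    return sum(r.total_tokens for _, r in walk(root))


def failed_runs(root: Run) -> List[str]:
    """Names of the runs that recorded an error."""
    return [r.name for _, r in walk(root) if r.error]


# Illustrative trace: one parent chain with three child steps.
trace = Run(
    name="rag_pipeline", run_type="chain",
    inputs={"question": "What is a trace?"}, latency_ms=950.0,
    children=[
        Run("retrieve_docs", "retriever", {"query": "trace"},
            outputs={"documents": ["doc-1"]}, latency_ms=120.0),
        Run("generate_answer", "llm", {"prompt": "..."},
            outputs={"text": "{not valid json"}, latency_ms=800.0,
            total_tokens=412),
        Run("parse_output", "parser", {"text": "{not valid json"},
            latency_ms=2.0, error="JSONDecodeError"),
    ],
)

for depth, run in walk(trace):
    status = f"ERROR {run.error}" if run.error else "ok"
    print("  " * depth + f"{run.name} [{run.run_type}] {run.latency_ms} ms {status}")
```

In the incident-review scenario, walking the tree this way is what lets you find the failing step and read the exact inputs it received.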
2 / 5
In a PR review, a teammate wraps a function with @traceable. At standup, a junior engineer asks what the decorator does. The correct answer is:
@traceable is a LangSmith decorator that wraps any Python function as a traced run, automatically capturing inputs, outputs, timing, and nesting it in the parent trace if one exists.
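A short sketch of the decorator in use, assuming the `langsmith` package exposes `traceable` at the top level. The two functions are hypothetical, and the `except ImportError` fallback is only there so the example runs without the SDK installed. Traces are sent only when tracing is enabled and an API key is configured.

```python
try:
    from langsmith import traceable
except ImportError:
    # Fallback for illustration only: a no-op decorator, so the sketch
    # still runs when the SDK is not installed.
    def traceable(func):
        return func


@traceable
def normalize_question(question: str) -> str:
    """Child step: collapse whitespace and ensure a single trailing '?'."""
    return " ".join(question.split()).rstrip("?") + "?"


@traceable
def answer_question(question: str) -> str:
    """Parent step: calling another traced function nests it as a child run."""
    cleaned = normalize_question(question)
    return f"(stub answer to: {cleaned})"


result = answer_question("  What does   @traceable do ")
print(result)
```

Because `normalize_question` is called from inside `answer_question`, its run would appear nested under the parent run in the resulting trace.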
3 / 5
Your team sets up LangSmith datasets to benchmark prompt changes. In a design review, you explain that a dataset in LangSmith is:
A LangSmith dataset is a versioned collection of examples, each holding inputs and, optionally, reference outputs. Running an LLM application over a dataset (via evaluate()) produces an experiment whose scored results can be compared across prompt or model versions.
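The sketch below shows what that workflow might look like. The example data, the `target` function, and the dataset and experiment names are hypothetical. The SDK calls inside `upload_and_evaluate` (`Client.create_dataset`, `Client.create_examples`, and `evaluate`) reflect the Python SDK as I understand it, and their exact signatures can differ between versions, so treat them as an assumption to check against your installed release. That function needs network access and an API key, so it is defined but not called here.

```python
# Hypothetical examples: each has inputs plus reference outputs.
EXAMPLES = [
    {"inputs": {"question": "2 + 2"}, "outputs": {"answer": "4"}},
    {"inputs": {"question": "3 * 3"}, "outputs": {"answer": "9"}},
]


def target(inputs: dict) -> dict:
    """Hypothetical application under test; a real one would call an LLM."""
    canned = {"2 + 2": "4", "3 * 3": "9"}
    return {"answer": canned.get(inputs["question"], "unknown")}


def upload_and_evaluate(dataset_name: str = "arithmetic-smoke-test"):
    """Create the dataset and run the target over it (requires an API key)."""
    # Imported lazily so the rest of the sketch runs without the SDK.
    from langsmith import Client
    from langsmith.evaluation import evaluate

    client = Client()
    dataset = client.create_dataset(dataset_name=dataset_name)
    client.create_examples(
        inputs=[e["inputs"] for e in EXAMPLES],
        outputs=[e["outputs"] for e in EXAMPLES],
        dataset_id=dataset.id,
    )
    # Each call to evaluate() produces one experiment over the dataset.
    return evaluate(target, data=dataset_name, experiment_prefix="baseline")


# Local dry run: check the target against the reference outputs.
for example in EXAMPLES:
    print(example["inputs"], "->", target(example["inputs"]))
```

Re-running the evaluation after a prompt change, with a different `experiment_prefix`, gives a second experiment over the same examples to compare against the baseline.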
4 / 5
A QA engineer adds an evaluator to the LangSmith evaluation run. A PR reviewer asks what evaluators do. The correct answer is:
LangSmith evaluators are functions (or LLM-as-judge prompts) that score outputs per example. Results are attached as feedback to runs, enabling quantitative comparison across experiments.
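As an illustration, a custom evaluator can be a plain function. The sketch below assumes the convention used by recent versions of the Python SDK, where an evaluator takes `outputs` and `reference_outputs` dictionaries and returns a `key` and a `score`. Older versions take `(run, example)` arguments instead. The `answer` field name and the sample values are hypothetical.

```python
def exact_match(outputs: dict, reference_outputs: dict) -> dict:
    """Score 1.0 when the predicted answer equals the reference, else 0.0.

    Comparison ignores case and surrounding whitespace.
    """
    predicted = str(outputs.get("answer", "")).strip().lower()
    expected = str(reference_outputs.get("answer", "")).strip().lower()
    return {"key": "exact_match", "score": 1.0 if predicted == expected else 0.0}


# Local check on made-up data, without calling LangSmith.
pairs = [
    ({"answer": " Paris"}, {"answer": "paris"}),
    ({"answer": "Lyon"}, {"answer": "Paris"}),
]
scores = [exact_match(out, ref)["score"] for out, ref in pairs]
mean_score = sum(scores) / len(scores)
print(scores, mean_score)
```

A function like this would be passed in the `evaluators` list of an evaluation run. The per-example scores it returns are what make two experiments numerically comparable.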
5 / 5
In a platform review, the team wants to isolate staging LLM traces from production. You explain that LangSmith projects serve this purpose because:
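LangSmith projects are namespaces for traces, and each experiment (evaluation run) is stored as a project of its own. Datasets live at the workspace level rather than inside a project. Using separate projects per environment keeps staging noise out of production dashboards and simplifies filtering.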
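One way to route traces per environment is to set the project name from configuration at startup. The sketch below assumes the SDK reads the `LANGSMITH_PROJECT` environment variable (older releases use `LANGCHAIN_PROJECT`). The app name, the environment names, and the naming scheme are hypothetical.

```python
import os

APP_NAME = "support-bot"  # hypothetical application name
ALLOWED_ENVIRONMENTS = {"dev", "staging", "production"}


def project_for(environment: str) -> str:
    """Build a per-environment project name, e.g. 'support-bot-staging'."""
    if environment not in ALLOWED_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    return f"{APP_NAME}-{environment}"


def configure_tracing_project(environment: str) -> str:
    """Point subsequent traces at the project for this environment.

    Call this at startup, before any traced code runs.
    """
    project = project_for(environment)
    os.environ["LANGSMITH_PROJECT"] = project
    return project


active_project = configure_tracing_project("staging")
print(active_project)
```

Rejecting unknown environment names keeps a typo from silently creating a stray project.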