⚖️ Privacy-Utility Trade-off Vocabulary
1 / 5
A data scientist asks: "How synthetic is enough?" when evaluating whether synthetic data can replace real data in a model training pipeline.
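What does this question fundamentally address?
"How synthetic is enough?" is the central question of the privacy-utility trade-off: how far synthetic data can move away from the real records to protect individuals while still being useful for training. It is usually assessed along three dimensions: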
| Dimension | Meaning |
|---|---|
| Fidelity | How closely the synthetic data mirrors the real data's distributions and correlations |
| Privacy | How difficult it is to re-identify or infer information about real individuals |
| Utility | Whether models trained on synthetic data perform as well as models trained on real data when both are evaluated on held-out real data |
Key vocabulary: privacy-utility trade-off, fidelity, downstream utility, re-identification risk, TSTR (Train on Synthetic, Test on Real).
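
To make the three dimensions concrete, here is a minimal sketch in Python (NumPy only) that measures each one on toy data. Everything in it is an illustrative assumption rather than a prescribed method: `toy_synthesizer` is a hypothetical stand-in for a real generator, the nearest-centroid classifier stands in for the downstream model, the difference in column means is a very crude fidelity check, and the distance to the closest real record is only a rough proxy for re-identification risk.

```python
import numpy as np

def make_real_data(rng, n_per_class=500):
    """Two Gaussian classes standing in for a real tabular dataset."""
    x0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(n_per_class, 2))
    x1 = rng.normal(loc=[3.0, 3.0], scale=1.0, size=(n_per_class, 2))
    X = np.vstack([x0, x1])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y

def toy_synthesizer(rng, X, y, n_per_class):
    """Hypothetical generator: sample from a Gaussian fitted to each class."""
    Xs, ys = [], []
    for label in np.unique(y):
        Xc = X[y == label]
        mean = Xc.mean(axis=0)
        cov = np.cov(Xc, rowvar=False)
        Xs.append(rng.multivariate_normal(mean, cov, size=n_per_class))
        ys.append(np.full(n_per_class, label))
    return np.vstack(Xs), np.concatenate(ys)

def fit_centroids(X, y):
    """Nearest-centroid 'model': one mean vector per class."""
    labels = np.unique(y)
    centroids = np.stack([X[y == label].mean(axis=0) for label in labels])
    return labels, centroids

def accuracy(model, X, y):
    """Fraction of rows whose nearest centroid has the correct label."""
    labels, centroids = model
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    pred = labels[dists.argmin(axis=1)]
    return float((pred == y).mean())

def distance_to_closest_record(X_syn, X_real):
    """For each synthetic row, Euclidean distance to its nearest real row."""
    dists = np.linalg.norm(X_syn[:, None, :] - X_real[None, :, :], axis=2)
    return dists.min(axis=1)

rng = np.random.default_rng(0)
X, y = make_real_data(rng)
idx = rng.permutation(len(X))
train, test = idx[:700], idx[700:]  # the test split is always REAL data

X_syn, y_syn = toy_synthesizer(rng, X[train], y[train], n_per_class=350)

# Utility: Train on Real vs. Train on Synthetic, both Tested on Real (TSTR).
trtr = accuracy(fit_centroids(X[train], y[train]), X[test], y[test])
tstr = accuracy(fit_centroids(X_syn, y_syn), X[test], y[test])

# Fidelity (crude): largest gap between synthetic and real column means.
fidelity_gap = float(np.abs(X_syn.mean(axis=0) - X[train].mean(axis=0)).max())

# Privacy (rough proxy): very small distances suggest near-copies of real rows.
dcr = distance_to_closest_record(X_syn, X[train])

print(f"TRTR accuracy: {trtr:.3f}")
print(f"TSTR accuracy: {tstr:.3f}")
print(f"Max column-mean gap: {fidelity_gap:.3f}")
print(f"Median distance to closest real record: {np.median(dcr):.3f}")
```

Read together, the numbers illustrate the trade-off: a TSTR score close to the TRTR score suggests good downstream utility, while a distance to the closest real record near zero would suggest the generator is memorizing real rows, which raises re-identification risk even if fidelity and utility look excellent.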