🏭 Synthetic Data Vocabulary
Six exercise sets. Master the vocabulary of synthetic data generation, data augmentation, differential privacy, synthetic data evaluation, privacy-utility trade-offs, and synthetic test data.
Synthetic Data Generation Vocabulary
Generative adversarial networks (GANs), variational autoencoders (VAEs), rule-based and simulation-based synthesis, and LLM-generated synthetic data: core vocabulary for synthetic data generation approaches.
Data Augmentation Vocabulary
Augmentation strategy, augmentation pipeline, label-preserving transformations, image/text/tabular augmentation, and SMOTE (Synthetic Minority Over-sampling Technique).
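To make the SMOTE term concrete, here is a minimal sketch of its core idea: a new minority-class point is created by interpolating between an existing minority point and one of its nearest minority neighbours. The function name `smote_like` and its parameters are illustrative assumptions, not the API of any library, and the sketch omits details that production implementations handle.

```python
import math
import random


def smote_like(minority, n_new, k, rng):
    """Create n_new synthetic points by SMOTE-style interpolation.

    `minority` is a list of equal-length numeric tuples (hypothetical input
    format chosen for this sketch).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority points to interpolate")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen base point
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


if __name__ == "__main__":
    rng = random.Random(0)
    minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    for point in smote_like(minority, n_new=3, k=2, rng=rng):
        print(point)
```

Because each new point lies on a segment between two real minority points, the transformation is label-preserving by construction: every synthetic point inherits the minority label.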
Differential Privacy Vocabulary
Differential privacy (DP), epsilon (privacy budget), delta, sensitivity, Laplace and Gaussian mechanisms, local vs. global (central) DP.
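The terms epsilon, sensitivity, and Laplace mechanism fit together as follows: a numeric query result is released with Laplace noise whose scale is sensitivity divided by epsilon. The sketch below illustrates this using only the standard library. The function name `laplace_mechanism` and the example values are assumptions made for illustration.

```python
import random


def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Return true_value plus Laplace noise with scale sensitivity / epsilon."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_value + noise


if __name__ == "__main__":
    rng = random.Random(42)
    true_count = 100  # hypothetical count query; sensitivity 1
    for eps in (0.1, 1.0, 10.0):
        print(eps, laplace_mechanism(true_count, 1.0, eps, rng))
```

A smaller epsilon means a larger noise scale, so the released value is more private and less accurate.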
Synthetic Data Evaluation Vocabulary
Fidelity, utility, and privacy metrics; statistical similarity (Kolmogorov-Smirnov test, Jensen-Shannon divergence); the Train on Synthetic, Test on Real (TSTR) evaluation framework.
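As an illustration of the two statistical similarity measures named above, the following sketch computes the two-sample KS statistic (the largest gap between two empirical CDFs) and the Jensen-Shannon divergence between two discrete distributions. It uses only the standard library. The function names are assumptions for this sketch, and library implementations may be preferable in practice.

```python
import math
from bisect import bisect_right


def ks_statistic(real, synthetic):
    """Two-sample KS statistic: max distance between the empirical CDFs."""
    a, b = sorted(real), sorted(synthetic)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def jensen_shannon_divergence(p, q):
    """JSD (base 2) between two discrete distributions over the same bins."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute 0 by convention.
        return sum(xi * math.log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


if __name__ == "__main__":
    real = [1.0, 2.0, 3.0, 4.0, 5.0]
    synthetic = [1.5, 2.5, 3.5, 4.5, 5.5]
    print(ks_statistic(real, synthetic))
    print(jensen_shannon_divergence([0.5, 0.5], [0.9, 0.1]))
```

Both measures are 0 for identical distributions. The KS statistic reaches 1 for samples that do not overlap at all, and the base-2 JSD reaches 1 for distributions with disjoint support.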
Utility-Privacy Trade-off Language
Privacy budget allocation, re-identification risk, privacy-utility curve, communicating trade-offs to stakeholders.
Synthetic Test Data Vocabulary
Test data management (TDM), synthetic vs. masked data, referential integrity, data masking, GDPR-compliant test environments.
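To show how data masking and referential integrity interact, the sketch below masks a key column with a keyed hash (HMAC-SHA256), so the same original ID always maps to the same token and foreign keys still join after masking. The table layout, column names, key, and helper names are all hypothetical, and the approach is one possible masking technique, not a complete test data management solution.

```python
import hashlib
import hmac


def pseudonymize(value, secret_key):
    """Map a string to a stable 16-hex-character token using HMAC-SHA256."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def mask_column(rows, column, secret_key):
    """Return copies of the rows with one column replaced by its token."""
    return [
        {**row, column: pseudonymize(row[column], secret_key)} for row in rows
    ]


if __name__ == "__main__":
    key = b"demo-key-not-for-production"  # hypothetical key for the example
    customers = [{"customer_id": "C001"}, {"customer_id": "C002"}]
    orders = [
        {"order_id": "O1", "customer_id": "C001"},
        {"order_id": "O2", "customer_id": "C002"},
        {"order_id": "O3", "customer_id": "C001"},
    ]
    masked_customers = mask_column(customers, "customer_id", key)
    masked_orders = mask_column(orders, "customer_id", key)
    known_ids = {c["customer_id"] for c in masked_customers}
    # Every masked order still points at a masked customer.
    print(all(o["customer_id"] in known_ids for o in masked_orders))
```

Using the same key for both tables is what keeps the join intact. Masking each table with a different key or with random tokens would break referential integrity.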