Practice privacy-preserving machine learning vocabulary: federated learning, differential privacy, secure multi-party computation, synthetic training data, and epsilon-DP guarantees.
1 / 5
'Federated learning avoids sharing raw data.' How does federated learning achieve this?
In federated learning, the model is sent to where the data lives (e.g., mobile devices, hospital servers) and trained locally. Only the model updates (gradients or weights) are sent back to a central aggregator — never the raw data. This allows collaborative model training across sensitive datasets that can't be centralised due to privacy regulations.
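The round trip described above can be sketched in a few lines. This is a minimal illustration of weight averaging, not a production protocol: the one-parameter model y = w * x, the toy client datasets, and the names `local_update` and `federated_round` are all assumptions made for the example.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held privately by one client


def local_update(w: float, data: Dataset, lr: float = 0.01, epochs: int = 5) -> float:
    """Fit y = w * x on one client's private data; return only the new weight."""
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global: float, clients: List[Dataset]) -> float:
    """One round: clients train locally, the server averages their weights."""
    total = sum(len(d) for d in clients)
    # The server sees (weight, sample count) per client, never the (x, y) pairs.
    updates = [(local_update(w_global, d), len(d)) for d in clients]
    return sum(w * n for w, n in updates) / total


# Hypothetical data: two clients whose records all follow y = 2x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0), (5.0, 10.0)],
]
w = 0.0
for _ in range(50):
    w = federated_round(w, clients)
print(round(w, 4))
```

Weighting each client by its sample count is one common choice; the raw records never leave the `clients` list entries that own them.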
2 / 5
'Differential privacy adds calibrated noise.' What is the purpose of this noise?
Differential privacy (DP) provides a mathematical privacy guarantee: the output distribution of a query or model is nearly the same whether or not any one individual's data is included. The noise masks each individual's contribution. It is random noise (typically Laplace or Gaussian) calibrated to the computation's sensitivity, that is, how much a single person's data can change the result. The epsilon parameter controls the privacy-utility trade-off: lower epsilon means stronger privacy but more noise.
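As a rough sketch of how the noise is calibrated, the example below applies the Laplace mechanism to a counting query. The function names and the toy `ages` data are assumptions for illustration. Python's `random` module is not cryptographically secure, and naive floating-point noise has known weaknesses, so treat this as a teaching aid rather than a deployable mechanism.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon: float, rng: random.Random) -> float:
    """Release a count with Laplace noise.

    Adding or removing one person changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is sensitivity / epsilon.
    """
    sensitivity = 1.0
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # seeded only to make the example repeatable
ages = [23, 35, 41, 29, 62, 57, 33, 48]  # hypothetical records
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print(noisy)
```

Halving epsilon doubles the noise scale, which is the privacy-utility trade-off in concrete terms.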
3 / 5
What is 'secure multi-party computation' (SMPC) in the context of privacy-preserving ML?
Secure multi-party computation (SMPC) allows multiple parties, each holding private data, to collaborate on a computation such as training a model or computing statistics. No party learns anything about the others' data beyond what the final result itself reveals. It achieves this with cryptographic protocols such as secret sharing and garbled circuits.
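Additive secret sharing, one of the building blocks mentioned above, can be sketched as a secure sum. The example assumes honest-but-curious parties and simulates all of them in one process; the modulus `PRIME`, the function names, and the salary figures are illustrative choices, not part of any specific protocol.

```python
import secrets

PRIME = 2**61 - 1  # modulus for share arithmetic; an illustrative choice


def share(secret: int, n_parties: int) -> list:
    """Split `secret` into n additive shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares


def secure_sum(private_inputs: list) -> int:
    n = len(private_inputs)
    # Row i holds the shares produced by party i; share j goes to party j.
    distributed = [share(x, n) for x in private_inputs]
    # Party j adds the shares it received and publishes only that partial sum.
    partial_sums = [sum(row[j] for row in distributed) % PRIME for j in range(n)]
    # The published partial sums reveal the total and nothing else.
    return sum(partial_sums) % PRIME


salaries = [52_000, 61_500, 48_250]  # hypothetical private inputs
total = secure_sum(salaries)
print(total)  # 161750
```

Any single share is a uniformly random number, so a party that sees fewer than all shares of an input learns nothing about that input.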
4 / 5
'The model is trained on synthetic data only.' What privacy benefit does this provide?
Training on synthetic data instead of real data can provide privacy benefits by ensuring the model never directly processes sensitive personal information. However, this only works if the synthetic data generator itself doesn't memorise or leak sensitive patterns from the real data — poorly generated synthetic data can still carry privacy risks through statistical disclosure.
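One simple way to spot the memorisation risk mentioned above is to count how many synthetic records are verbatim copies of real ones. The helper `exact_copy_rate` and the toy rows are hypothetical, and a rate of zero does not prove the data is private, since near-copies and statistical leakage are not detected by this check.

```python
def exact_copy_rate(real_rows, synthetic_rows) -> float:
    """Fraction of synthetic rows that are verbatim copies of a real row."""
    real = set(real_rows)
    copies = sum(1 for row in synthetic_rows if row in real)
    return copies / len(synthetic_rows)


# Hypothetical (age, sex, postcode) records.
real_rows = [(34, "F", "90210"), (51, "M", "10001"), (29, "F", "60614")]
synthetic_rows = [
    (34, "F", "90210"),  # identical to a real record
    (40, "M", "10001"),
    (30, "F", "60614"),
    (51, "M", "10001"),  # identical to a real record
]
rate = exact_copy_rate(real_rows, synthetic_rows)
print(rate)  # 0.5
```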
5 / 5
'We achieve epsilon-differential privacy.' What does the epsilon value represent?
Epsilon (ε) is the differential privacy budget: it bounds the worst-case privacy loss the mechanism can incur for any individual. A lower epsilon provides stronger privacy (the output distribution changes very little when one person's data is added or removed) but requires more noise, which reduces model utility. Values reported in practice typically range from about 0.1 (very strong privacy) to 10 or more (weaker privacy, more utility preserved).
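The guarantee behind ε can also be stated formally. The following is the standard definition of pure ε-differential privacy. The symbols are notation introduced here for illustration: M is the randomised mechanism, D and D' are any two datasets differing in one person's record, and S is any set of possible outputs.

```latex
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(D') \in S\,]
```

With ε = 0.1 the two probabilities can differ by a factor of at most e^0.1 ≈ 1.105, whereas with ε = 10 the factor is e^10 ≈ 22026, which is why large values give a much weaker guarantee.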