Five exercises: choose the best-structured answer to common ML Security Engineer interview questions. The questions cover adversarial attack types and defences, data poisoning detection, model stealing protection, ML supply chain security, and communicating ML risks to stakeholders.
Structure for ML Security Engineer interview answers
Classify the attack vector: training-time vs inference-time, white-box vs black-box
Name the defence mechanism: adversarial training, input preprocessing, differential privacy
Quantify the risk: attack success rate, impact on model accuracy, exfiltration cost
Translate to governance: model cards, audit trails, responsible disclosure for ML vulnerabilities
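The "quantify the risk" step can be made concrete with a small calculation. The sketch below assumes one common convention for attack success rate (the fraction of originally correct predictions that the attack flips); other definitions exist, and the function names and sample labels are hypothetical.

```python
from typing import List


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def attack_success_rate(y_true: List[int], clean_pred: List[int],
                        adv_pred: List[int]) -> float:
    """Share of originally correct predictions that become wrong under attack."""
    eligible = [(t, a) for t, c, a in zip(y_true, clean_pred, adv_pred) if c == t]
    if not eligible:
        return 0.0
    return sum(a != t for t, a in eligible) / len(eligible)


# Hypothetical labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0]
clean_pred = [1, 0, 1, 0, 0]   # model output on unmodified inputs
adv_pred = [0, 0, 0, 0, 1]     # model output on adversarially perturbed inputs

asr = attack_success_rate(y_true, clean_pred, adv_pred)
accuracy_drop = accuracy(y_true, clean_pred) - accuracy(y_true, adv_pred)
print(f"attack success rate: {asr:.2f}, accuracy drop: {accuracy_drop:.2f}")
```

Reporting both numbers helps: the success rate describes the attacker's effectiveness, while the accuracy drop describes the impact on the service.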
1 / 5
The interviewer asks: "What are the main categories of adversarial attacks on ML models, and how do you think about defending against them?" Which answer best covers the full threat model?
Option B correctly establishes the training-time vs inference-time and white-box vs black-box taxonomy, names specific attacks (FGSM and PGD for evasion, plus membership inference and model extraction), and maps each to concrete defences (adversarial training, differential privacy, randomised smoothing, rate limiting). Option A incorrectly equates adversarial ML with SQL injection; the two are fundamentally different threat models. Option C creates a false modality-based taxonomy and misses the training/inference distinction. Option D is factually wrong: adversarial attacks have been demonstrated in production contexts including autonomous vehicles, spam filters, and content moderation systems.
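To make the white-box, inference-time case concrete, here is a minimal sketch of FGSM against a toy logistic-regression model. The weights, input, and epsilon are made-up values chosen for illustration; a real attack would target a trained network through an autodiff framework rather than a hand-derived gradient.

```python
import math
from typing import List


def predict_proba(w: List[float], b: float, x: List[float]) -> float:
    """Probability of class 1 under logistic regression."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def fgsm(w: List[float], b: float, x: List[float], y: int, eps: float) -> List[float]:
    """One FGSM step: move each feature by eps in the direction that raises the loss.

    For binary cross-entropy with a logistic model, d(loss)/dx = (p - y) * w.
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model parameters and input (white-box: the attacker knows w and b).
w, b = [2.0, -1.0], 0.0
x, y = [0.3, 0.1], 1

x_adv = fgsm(w, b, x, y, eps=0.3)
print("clean p(class 1):", round(predict_proba(w, b, x), 3))
print("adversarial p(class 1):", round(predict_proba(w, b, x_adv), 3))
```

The perturbation is bounded by eps in every feature, yet it is enough to push this toy input across the decision boundary. Adversarial training would add such perturbed examples, with their correct labels, back into the training set.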
2 / 5
The interviewer asks: "How do you detect and prevent data poisoning in a machine learning pipeline?" Which answer demonstrates the deepest understanding?
Option B covers the full pipeline: provenance controls at ingestion, statistical anomaly detection (MMD, isolation forests), influence functions for sample-level auditing, backdoor detection tools (Neural Cleanse, STRIP), Byzantine-robust aggregation for federated learning, pipeline validation gates, and differential privacy to bound how much any single sample can influence the learned parameters. Option A conflates transit encryption with data integrity: HTTPS does not prevent a compromised data source from serving poisoned samples. Option C is impractical and does not prevent poisoning from internal data corruption or insider threats. Option D is wrong: a backdoored model can maintain high accuracy on clean data while misclassifying only inputs that contain the trigger.
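As an illustration of the statistical anomaly detection step, the sketch below computes a biased estimate of squared MMD with an RBF kernel between a trusted reference sample and an incoming batch. The synthetic Gaussian data, the bandwidth, and the 0.1 threshold are all assumptions made for the example. A batch-level test like this flags distribution shift; it would not necessarily catch a small number of carefully crafted poisoned samples.

```python
import math
import random
from typing import List

Vector = List[float]


def rbf_kernel(a: Vector, b: Vector, bandwidth: float = 1.0) -> float:
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs: List[Vector], ys: List[Vector], bandwidth: float) -> float:
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs: List[Vector], ys: List[Vector], bandwidth: float = 1.0) -> float:
    """Biased estimate of squared maximum mean discrepancy between two samples."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))


def gaussian_sample(rng: random.Random, mean: float, n: int, dim: int = 2) -> List[Vector]:
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
reference = gaussian_sample(rng, 0.0, 100)      # trusted, previously vetted data
clean_batch = gaussian_sample(rng, 0.0, 100)    # new batch from the same distribution
shifted_batch = gaussian_sample(rng, 2.0, 100)  # new batch with a shifted mean

THRESHOLD = 0.1  # hypothetical gate; in practice calibrate it, e.g. by permutation test
clean_score = mmd_squared(reference, clean_batch)
shifted_score = mmd_squared(reference, shifted_batch)
print("clean batch flagged:", clean_score > THRESHOLD)
print("shifted batch flagged:", shifted_score > THRESHOLD)
```

A pipeline validation gate could run a check of this kind on each ingested batch and hold flagged batches for sample-level auditing.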
3 / 5
The interviewer asks: "What is model stealing, and what technical controls do you put in place to protect against it?" Which answer best explains the attack and layered defences?
Option B correctly explains the attack mechanism (active-learning query strategies, Knockoff Nets), the information value of confidence scores, and layered defences: rate limiting, confidence truncation, output perturbation, watermarking (DAWN), and API anomaly detection. It also distinguishes model extraction from weight theft. Option A conflates extraction attacks with file theft; extraction does not require access to stored weights. Option C is wrong: NLP model extraction has been extensively demonstrated (e.g., extraction of BERT-based classifiers via text APIs). Option D is incorrect: retraining changes the deployed model, but a surrogate already trained on historical query responses keeps its utility, so retraining does not invalidate existing stolen copies.
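Two of the layered defences, rate limiting and confidence truncation, are simple enough to sketch. The class and function names below are hypothetical, and the limits are arbitrary; a production API would typically enforce quotas at the gateway and combine them with anomaly detection on query patterns.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_queries per client within a sliding time window."""

    def __init__(self, max_queries: int, window_seconds: float) -> None:
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> bool:
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_queries:
            return False
        hits.append(now)
        return True


def truncate_confidences(probs: Dict[str, float], decimals: int = 1) -> Dict[str, object]:
    """Return only the top label with a coarsened score, not the full probability vector."""
    label = max(probs, key=probs.get)
    return {"label": label, "confidence": round(probs[label], decimals)}


limiter = SlidingWindowLimiter(max_queries=2, window_seconds=60)
decisions = [limiter.allow("client-a", t) for t in (0, 1, 2, 61)]
print(decisions)
print(truncate_confidences({"cat": 0.8731, "dog": 0.1269}))
```

Truncation reduces the information each query leaks, while the limiter raises the cost of collecting enough query responses to train a surrogate. Neither stops a patient attacker on its own, which is why the defences are layered.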
4 / 5
The interviewer asks: "How do you apply software supply chain security principles to ML systems, and what does an ML SBOM look like?" Which answer best covers the ML-specific dimensions?
Option B correctly extends the SBOM concept to ML-specific artefacts (datasets, pretrained checkpoints, training configs), names relevant tooling (ml-metadata, MLflow, SPDX 3.0 AI profiles, ModelScan), explains the safetensors vs pickle security distinction (unpickling a checkpoint can execute arbitrary code, whereas safetensors stores only tensor data), and enumerates realistic ML supply chain attack vectors (poisoned pretrained models, malicious notebooks, dependency confusion). Option A reduces ML supply chain security to Python dependency scanning, missing the dataset and model artefact dimensions entirely. Option C is wrong: even models trained from scratch consume training data, pretrained embeddings, and libraries with their own supply chains. Option D conflates hardware/environment tracking with SBOM and misidentifies the security purpose.
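The pickle risk can be shown without ever loading an untrusted file. The sketch below statically inspects a pickle byte stream with the standard library's `pickletools.genops` and reports opcodes that import or call objects. It is a simplified illustration of the idea behind model scanners, not a description of how ModelScan is implemented; the opcode list and the `Payload` class are assumptions for the example.

```python
import pickle
import pickletools
from typing import Set

# Opcodes that import a global or invoke a callable during unpickling.
SUSPICIOUS_OPCODES = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ",
                      "NEWOBJ", "NEWOBJ_EX"}


def suspicious_opcodes(data: bytes) -> Set[str]:
    """Statically list risky opcodes in a pickle stream without unpickling it."""
    found = {opcode.name for opcode, _arg, _pos in pickletools.genops(data)}
    return found & SUSPICIOUS_OPCODES


class Payload:
    """Hypothetical object whose unpickling would call a function."""

    def __reduce__(self):
        # A harmless stand-in; a real attack would reference something like os.system.
        return (print, ("code ran during unpickling",))


plain_weights = pickle.dumps({"layer0": [0.1, 0.2, 0.3]})
tampered = pickle.dumps(Payload())  # serialising does not run the payload

print("plain weights:", suspicious_opcodes(plain_weights))
print("tampered file:", suspicious_opcodes(tampered))
```

Real framework checkpoints legitimately use some of these opcodes to rebuild tensors, so practical scanners compare the imported names against an allowlist rather than rejecting every match. Formats such as safetensors avoid the problem by not supporting executable content at all.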
5 / 5
The interviewer asks: "How do you communicate ML security risks to a CTO who has a strong software engineering background but limited ML background?" Which answer best demonstrates effective technical communication?
Option B uses concrete analogies from software security (input validation bypass, supply chain attack, API scraping), frames risks in regulatory and business terms (GDPR, EU AI Act, reputational cost), and proposes controls in familiar engineering process language (CI/CD red-teaming, threat modelling). This is how effective technical communication to senior leadership works. Option A provides a raw reference without translation — MITRE ATLAS is detailed and ML-specialist-oriented, not executive-facing. Option C is impractical and condescending; the goal is to communicate across knowledge gaps, not eliminate them with mandatory training. Option D is not a security recommendation — it misrepresents both ML security and rule-based systems.