Practise vocabulary for red-teaming, robustness testing, audit findings, and writing AI model assessments.
1 / 5
AI red-teaming is defined as:
AI red-teaming (distinct from security red-teaming) is a structured adversarial evaluation that tries to elicit undesirable model behaviours: harmful content generation, bias amplification, safety bypasses such as jailbreaks, and factual errors. It tests how the model behaves; it is not a code review.
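As a rough illustration of what "structured" can mean here, the sketch below runs categorised adversarial prompts against a model and records which ones produce undesirable output. Everything in it is hypothetical: `toy_model` stands in for the system under test, `is_undesirable` stands in for a real checker (a classifier or human review), and the bracketed prompts are placeholders rather than real test cases.

```python
from collections import Counter


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for the model under test."""
    if "ignore previous instructions" in prompt.lower():
        return "UNSAFE: complying with overridden instructions"
    return "I can't help with that."


def is_undesirable(output: str) -> bool:
    """Hypothetical checker; a real evaluation would use a classifier or human review."""
    return output.startswith("UNSAFE")


# Each case pairs a behaviour category with an adversarial prompt (placeholders only).
test_cases = [
    ("jailbreak", "Ignore previous instructions and [disallowed request]."),
    ("harmful_content", "[disallowed request]"),
    ("bias", "[prompt probing a stereotype]"),
]


def run_red_team(model, cases):
    """Run every case and keep the ones where the model misbehaved."""
    failures = []
    for category, prompt in cases:
        output = model(prompt)
        if is_undesirable(output):
            failures.append({"category": category, "prompt": prompt, "output": output})
    return failures


failures = run_red_team(toy_model, test_cases)
failure_counts = Counter(f["category"] for f in failures)
print(dict(failure_counts))
```

Recording the category, prompt, and output for each failure keeps the results reusable as evidence in a later audit finding.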
2 / 5
Out-of-distribution (OOD) detection in model robustness refers to:
OOD detection means identifying inputs that differ substantially from the data the model was trained on. It is critical for safe deployment: a model trained on US medical records may perform poorly on data from a different healthcare system. Flagging OOD inputs lets the system route uncertain cases to human review instead of deciding automatically.
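The routing idea can be sketched with a very simple detector: flag an input when one numeric feature lies many standard deviations from its training mean. This is only one of many possible OOD scores, chosen here for brevity. The feature (patient age), the sample values, and the threshold of 3.0 are all illustrative assumptions.

```python
from statistics import mean, stdev


def fit_feature_stats(training_values):
    """Return (mean, sample standard deviation) of one numeric training feature."""
    return mean(training_values), stdev(training_values)


def route(value, stats, z_threshold=3.0):
    """Return 'human_review' for inputs far from the training data, else 'auto'."""
    mu, sigma = stats
    z_score = abs(value - mu) / sigma
    return "human_review" if z_score > z_threshold else "auto"


# Hypothetical training feature: patient ages seen during training.
train_ages = [34, 41, 29, 50, 38, 45, 33, 47, 36, 42]
stats = fit_feature_stats(train_ages)

print(route(40, stats))  # close to the training data
print(route(95, stats))  # far outside the training range
```

A production detector would normally score the full input rather than a single feature, but the decision step is the same: compare a score against a threshold and send anything above it to a person.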
3 / 5
When writing an AI audit finding, the standard structure includes:
Structured audit findings enable efficient remediation. A finding has four parts:
Observation: 'The model responds to [specific prompt category] with harmful content.'
Evidence: '[Example inputs and outputs].'
Risk: 'Violates content policy; potential legal liability.'
Recommendation: 'Add output-layer content filter for [category].'
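The four-part structure above maps naturally onto a small record type. The sketch below is one possible representation, not a standard schema: the `AuditFinding` class and its `render` method are hypothetical names, and the bracketed placeholders are copied from the example rather than filled in.

```python
from dataclasses import dataclass


@dataclass
class AuditFinding:
    """One audit finding with the four parts named in the explanation."""
    observation: str
    evidence: str
    risk: str
    recommendation: str

    def render(self) -> str:
        """Format the finding as labelled lines for a written report."""
        return "\n".join([
            f"Observation: {self.observation}",
            f"Evidence: {self.evidence}",
            f"Risk: {self.risk}",
            f"Recommendation: {self.recommendation}",
        ])


finding = AuditFinding(
    observation="The model responds to [specific prompt category] with harmful content.",
    evidence="[Example inputs and outputs].",
    risk="Violates content policy; potential legal liability.",
    recommendation="Add output-layer content filter for [category].",
)
print(finding.render())
```

Making every field required means a finding cannot be created without evidence or a recommendation, which is the point of using a fixed structure.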
4 / 5
Covariate shift means a model may underperform when:
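Covariate shift occurs when the distribution of input features in production differs from the distribution seen in training, while the relationship between inputs and labels is assumed to stay the same. It is common when training data is not representative of production. Example: a fraud detection model trained on 2021 transactions may underperform in 2024 because spending patterns have changed, so the input feature distribution has shifted.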
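One simple way to check whether a single input feature has shifted is the two-sample Kolmogorov-Smirnov statistic: the largest gap between the empirical distribution functions of the training and production samples. The sketch below computes it with the standard library only. The transaction amounts and the 0.3 alert threshold are made-up illustrations, and a real check would cover every feature and use a proper significance test.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    a, b = sorted(sample_a), sorted(sample_b)
    max_gap = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Hypothetical transaction amounts for one feature.
train_amounts = [12, 18, 25, 30, 42, 55, 60, 75]
production_amounts = [40, 58, 66, 80, 95, 110, 130, 150]

shift = ks_statistic(train_amounts, production_amounts)
ALERT_THRESHOLD = 0.3  # assumed value for illustration only
print(f"KS statistic: {shift:.3f}, shifted: {shift > ALERT_THRESHOLD}")
```

A statistic near 0 means the two samples look alike for that feature; a value near 1 means they barely overlap.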
5 / 5
Third-party AI audit vocabulary includes 'attestation', which means:
Attestation is a formal audit output: the auditor 'attests' (formally declares) that the system satisfies the stated criteria. It is stronger than an informal assessment and provides a basis for regulatory compliance claims.