Practise vocabulary for annotation guidelines, inter-annotator agreement, label quality, and active learning.
1 / 6
Inter-annotator agreement (IAA) measures:
IAA quantifies annotation consistency. Low IAA (annotators disagree) suggests ambiguous guidelines or inherently subjective tasks, and should be investigated before training on the data.
2 / 6
A Cohen's kappa score of 0.82 indicates:
On the widely used Landis and Koch scale, a Cohen's kappa of 0.81–1.00 is interpreted as 'almost perfect' agreement; stricter scales describe a score of 0.82 as 'strong'. Unlike raw percentage agreement, kappa corrects for the agreement that would be expected by chance.
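The chance correction can be made concrete with a minimal sketch for two annotators who labelled the same items. The function name `cohens_kappa` and the example labels are illustrative assumptions, not part of the quiz.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both pick the same label
    # independently, estimated from each annotator's own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if p_e == 1.0:
        # Kappa is undefined when both annotators use a single identical
        # label throughout; returning 1.0 is a convention chosen here.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels: the annotators disagree on one item out of ten.
annotator_a = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "neg", "pos", "neg"]
annotator_b = ["pos", "pos", "neg", "neg", "neg", "neg", "neg", "neg", "pos", "neg"]
kappa = cohens_kappa(annotator_a, annotator_b)
print(f"kappa = {kappa:.2f}")
```

Here raw agreement is 90%, but because both annotators mostly choose "neg", a good share of that agreement is expected by chance, so kappa comes out lower than 0.90.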
3 / 6
Annotation guidelines serve to:
Good annotation guidelines include label definitions, positive and negative examples, edge cases with decisions, and examples of common errors, all aimed at making annotation consistent and reproducible.
4 / 6
Active learning in the context of data labelling means:
Active learning is an iterative loop: train on the labelled data, identify the unlabelled examples the model is most uncertain about, request human labels for those, and retrain. The aim is to maximise model improvement per unit of annotation cost.
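The loop can be sketched as follows, assuming least-confidence sampling as the uncertainty measure (one common choice among several). All names here (`least_confident`, `active_learning_loop`, and the `train`, `predict_proba`, and `ask_human` callables) are hypothetical placeholders for whatever model and labelling interface is actually in use.

```python
def least_confident(probabilities, k):
    """Return indices of the k examples the model is least sure about.

    probabilities: one list of per-class probabilities per example.
    """
    # Uncertainty = 1 - probability of the most likely class.
    scored = [(1.0 - max(p), i) for i, p in enumerate(probabilities)]
    # Most uncertain first; ties broken by original position.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [i for _, i in scored[:k]]


def active_learning_loop(labelled, pool, train, predict_proba, ask_human,
                         rounds, batch_size):
    """Grow `labelled` by querying a human on the most uncertain pool items.

    train(labelled) -> model, predict_proba(model, x) -> class probabilities,
    and ask_human(x) -> label are caller-supplied (hypothetical) callables.
    """
    for _ in range(rounds):
        if not pool:
            break
        model = train(labelled)
        probs = [predict_proba(model, x) for x in pool]
        picked = least_confident(probs, batch_size)
        # Pop from the highest index down so earlier indices stay valid.
        for i in sorted(picked, reverse=True):
            x = pool.pop(i)
            labelled.append((x, ask_human(x)))
    # Final retrain on everything labelled so far.
    return train(labelled)
```

Each round spends the labelling budget only on the examples where the current model is least confident, rather than on a random sample of the pool.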
5 / 6
A 'gold set' in annotation quality control refers to:
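A gold set is a collection of items with known, trusted labels that are mixed, unmarked, into annotators' regular work. Comparing annotators' responses on these hidden gold questions with the known answers reveals systematic errors or low-quality contributors without manually reviewing every annotation.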
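A minimal sketch of this kind of check is shown below. The data layout (tuples of annotator, item, and label), the function names, and the 0.8 threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def gold_accuracy(responses, gold):
    """Per-annotator accuracy on hidden gold items.

    responses: iterable of (annotator_id, item_id, label) tuples.
    gold: dict mapping gold item_id -> known correct label.
    """
    correct = defaultdict(int)
    seen = defaultdict(int)
    for annotator, item, label in responses:
        if item in gold:  # non-gold items cannot be scored this way
            seen[annotator] += 1
            correct[annotator] += int(label == gold[item])
    return {a: correct[a] / seen[a] for a in seen}


def flag_annotators(accuracy, threshold=0.8):
    """Annotators whose gold accuracy falls below an (arbitrary) threshold."""
    return sorted(a for a, acc in accuracy.items() if acc < threshold)


# Hypothetical data: two gold items hidden among regular tasks.
gold = {"g1": "spam", "g2": "ham"}
responses = [
    ("ann1", "g1", "spam"),
    ("ann1", "g2", "ham"),
    ("ann1", "t7", "spam"),  # regular task, ignored for scoring
    ("ann2", "g1", "ham"),
    ("ann2", "g2", "ham"),
]
accuracy = gold_accuracy(responses, gold)
flagged = flag_annotators(accuracy)
```

In practice the threshold and the number of gold items per annotator would need tuning, since accuracy estimated from only a handful of gold questions is noisy.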
6 / 6
Class imbalance in a labelled dataset means:
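Class imbalance (e.g. 95% negative, 5% positive) can cause models to favour the majority class and makes raw accuracy misleading: a model that always predicts negative would score 95% accuracy on such data. Mitigation strategies include oversampling the minority class, undersampling the majority class, or class-weighted loss functions.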
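As an illustration of class weighting, the sketch below computes inverse-frequency weights, the same heuristic scikit-learn documents for `class_weight="balanced"`. The function name and the example data are assumptions; how the weights are fed into a loss function depends on the training framework.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * count) for c, count in counts.items()}


# The 95% / 5% split from the text, on 100 hypothetical examples.
labels = ["neg"] * 95 + ["pos"] * 5
weights = balanced_class_weights(labels)
print(weights)
```

With these weights, each error on a rare positive example costs about 19 times as much as an error on a negative one, which offsets the 19:1 imbalance in the data.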