Machine learning papers and presentations are full of evaluation metrics whose pronunciation is unclear from the spelling. This exercise covers F1 score, BLEU, ROUGE, perplexity, and AUC-ROC — terms you need to say confidently at research meetings.
1 / 5
How is 'F1 score' pronounced?
The F1 score is the harmonic mean of precision and recall, pronounced /ɛf wʌn skɔːr/: the letter 'F', the numeral 'one', and the word 'score'. 'One' is /wʌn/, with the /ʌ/ vowel as in 'fun'. Each word keeps its normal stress, so no single syllable dominates the phrase. In context: 'Our classifier achieved an eff-one score of 0.92 on the test set.'
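For readers who want to see what the 'harmonic mean of precision and recall' looks like in practice, here is a minimal Python sketch. The function name f1_score and the choice to return 0.0 when both inputs are zero are illustrative assumptions, not part of the exercise.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall.

    Returning 0.0 when both inputs are zero is an assumed convention
    that avoids dividing by zero.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values chosen only for illustration.
balanced = f1_score(0.5, 0.5)   # equal inputs give that same value back
lopsided = f1_score(1.0, 0.5)   # the harmonic mean is pulled toward the lower input
```

Because the harmonic mean is pulled toward the smaller of the two inputs, a model cannot reach a high F1 score on strong precision or strong recall alone.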
2 / 5
How is 'BLEU' (MT metric) pronounced?
BLEU (Bilingual Evaluation Understudy) is pronounced /bluː/, exactly like the colour 'blue'. The acronym is spelled like the French word for 'blue' (bleu), which makes the pronunciation easy to remember. Spelling it out as 'bee-el-ee-you' is very unusual. In context: 'The translation model scored 38 bloo on the WMT benchmark.'
3 / 5
How is 'ROUGE' (NLP metric) pronounced?
ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is pronounced /ruːʒ/, like the cosmetic term 'rouge', which comes from the French word for 'red'. The /ʒ/ is the same sound as the 's' in 'measure' or 'treasure'. Some speakers say /ruːdʒ/, ending in /dʒ/ as in 'judge', but the /ʒ/ form is the standard in NLP research circles. In context: 'The summarisation model was evaluated using roozh-L against reference texts.'
4 / 5
How is 'perplexity' pronounced?
Perplexity (a measure of how well a language model predicts a sample) is a standard English word: /pəˈplɛksɪti/. Stress falls on the second syllable 'PLEK'. The unstressed first syllable uses the schwa /pə/. The '-ity' suffix is /ɪti/ in careful speech. In context: 'A lower per-PLEK-si-tee indicates the model fits the test corpus better.'
5 / 5
How is 'AUC-ROC' pronounced?
AUC-ROC (the area under the receiver operating characteristic curve) is typically spoken as /eɪ juː siː rɒk/: 'AUC' is spelled out as 'ay-you-see' and 'ROC' is said as a word, 'rok', like 'rock'. Some speakers spell out both parts as initials, 'ay-you-see ar-oh-see', but 'rok' is very common. In context: 'The classifier has an ay-you-see-rok of 0.97, indicating excellent discrimination.'