AdvancedVocabulary#data-science-ml#backend#developer-tools

Catastrophic Forgetting Vocabulary

Learn the vocabulary of a model losing prior general knowledge while fine-tuning on a new, narrow task.

0 / 5 completed

1 / 5

At standup, a dev mentions fine-tuning a language model on a new, narrow dataset, only to find it has abruptly lost much of the broad general knowledge it had before, because the new training overwrote the earlier learned weights. What is this phenomenon called?

2 / 5

During a design review, the team uses a low learning rate and a replay buffer of earlier training examples while fine-tuning a model on a new task, specifically because mixing in a sample of earlier data helps preserve the model's prior general knowledge instead of letting new training overwrite it entirely. Which capability does this provide?

3 / 5

In a code review, a dev notices a fine-tuning pipeline trains a general-purpose model on a narrow new dataset with a high learning rate and no replay of any earlier training examples, instead of mixing in a sample of earlier data to anchor the model's prior knowledge. What does this represent?

4 / 5

An incident report shows a fine-tuned model lost most of its general-purpose capability after training on a narrow customer-support dataset, because the fine-tuning ran with a high learning rate and no replay of any earlier training examples. What practice would prevent this?

5 / 5

During a PR review, a teammate asks why the team fine-tunes with a replay buffer and a low learning rate instead of simply fine-tuning aggressively on the new dataset alone to maximize new-task accuracy as fast as possible. What is the reasoning?