🤖 Data Scientist & ML Engineer

14-Day English Crash Course for Data Scientists & ML Engineers
Intensive Sprint

A focused two-week programme covering the 14 highest-priority vocabulary and communication areas for data scientists and ML engineers working in English-speaking teams. It runs from core ML terminology and statistics vocabulary through data pipeline language, experiment design, and model evaluation to interview preparation, with each day's session taking 20–30 minutes.

Intensive 14 days · 42 exercises covered · 20–30 min/day

14-day overview

Week 1: ML Core, Statistics & Data Pipelines

1. Core ML Vocabulary: Model, Feature, Training
2. Neural Networks & Deep Learning Vocabulary
3. Statistics & Mathematics Vocabulary in English
4. Probability & Distribution Language
5. Data Pipeline Vocabulary
6. ETL, Data Quality & Schema Language
7. Experiment Design Vocabulary (A/B testing)

Week 2: Experiments, Model Evaluation, Communication & Career

8. Hypothesis Testing Language
9. Model Evaluation Vocabulary
10. Bias, Variance & Overfitting Language
11. Stakeholder Communication for Data Teams
12. Presenting Data Findings in English
13. ML/Data Science Interview English
14. Salary Negotiation & Offer Phrases

Key phrases to learn this fortnight

overfitting
"The model is overfitting — it performs well on training data but poorly on the holdout set."
feature engineering
"Most of the performance gain came from feature engineering, not from changing the model architecture."
statistically significant
"The lift is statistically significant at p < 0.05, with a 95% confidence interval of 2–8%."
ground truth
"We need to improve our ground truth labels — the current annotation quality is limiting model performance."
data leakage
"There's data leakage in this pipeline — the feature uses future information that won't be available at inference time."
precision / recall trade-off
"We can improve precision by raising the threshold, but it will hurt recall — depends on which matters more for this use case."
baseline
"Let's start with a simple baseline before investing in a more complex model."
cold start problem
"We have a cold start problem for new users — no interaction history to personalise recommendations."
model drift
"We're seeing model drift — accuracy has degraded over the past three months as user behaviour has changed."
ablation study
"The ablation study shows that removing the positional encoding drops accuracy by 4 points."
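
To make one of these phrases concrete, the precision / recall trade-off can be sketched in a few lines of Python. This is only an illustration: the labels, the scores, and the helper name precision_recall are made up for the example and are not part of the course material.

```python
def precision_recall(labels, scores, threshold):
    """Return (precision, recall) when scores >= threshold count as positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and not p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: 1 = positive class, 0 = negative class.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

for threshold in (0.5, 0.75):
    p, r = precision_recall(labels, scores, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

On this toy data, raising the threshold from 0.5 to 0.75 lifts precision from 0.75 to 1.00 and lowers recall from 0.75 to 0.50, which is the situation the example sentence describes.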

Frequently asked questions

Who is this 14-day data science English crash course for?

This crash course is for data scientists and ML engineers who need focused, fast improvement in technical English — before a new role, a technical interview, or when presenting findings to English-speaking stakeholders. It covers the 14 highest-priority vocabulary and communication areas for data science and ML engineering work.

What level of English do I need to start?

The course is designed for intermediate (B1–B2) English learners who can already hold basic conversations in English. It builds professional and technical English rather than teaching general English from scratch. If you are unsure of your level, try Day 1: if the vocabulary feels completely unfamiliar, build general English skills first.

How long does each day take?

Each day is designed for 20–30 minutes: roughly 10 minutes on vocabulary and 15 minutes on the exercise. The intensive format keeps sessions focused — every day is tied directly to vocabulary and scenarios you encounter in data science and ML work.

What vocabulary does this crash course cover?

The course covers core ML vocabulary (model, feature, training, inference), neural networks and deep learning terms, statistics and mathematics vocabulary in English, probability and distribution language, data pipeline terminology, ETL and data quality language, experiment design vocabulary, hypothesis testing language, model evaluation terms, bias/variance/overfitting language, stakeholder communication phrases, language for presenting data findings, and interview and salary negotiation language.

Is statistics vocabulary covered in English specifically?

Yes. Days 3 and 4 focus on statistics, mathematics, and probability vocabulary in English, including distribution, variance, standard deviation, hypothesis, p-value, and confidence interval, along with the natural phrases used when discussing statistical results with colleagues and stakeholders who may not have a statistics background.

Does the course cover experiment design and A/B testing vocabulary?

Yes. Days 7 and 8 cover experiment design and hypothesis testing vocabulary — the language used when designing A/B tests, setting up treatment and control groups, discussing statistical significance, and presenting experiment results to product and business stakeholders.

Is stakeholder communication included?

Yes. Days 11 and 12 focus on stakeholder communication and presenting data findings in English — the phrases used when explaining model performance to non-technical audiences, communicating data limitations, and presenting charts and dashboards in English during business reviews and sprint demos.

How is this different from the 30-day data science path?

The 14-day crash course covers the 14 highest-priority areas in a condensed format. The 30-day path goes deeper — adding MLOps vocabulary, feature engineering language, model deployment terminology, data governance vocabulary, advanced presentation skills, and a full week of career preparation including technical writing and leadership communication.

Is there interview preparation in this course?

Yes. Day 13 focuses on speaking in ML and data science technical interviews, and Day 14 on salary negotiation and offer phrases. Together they cover model explanation language, case study discussion vocabulary, and the phrases used when discussing experience and compensation at data science job interviews. See /exercises/speaking/technical-interview-speaking/.

What should I do after completing this 14-day crash course?

After the crash course, move to the 30-day data science path for deeper coverage of MLOps, feature engineering, data governance, and advanced communication. You can also browse the full exercise library at /exercises/ or explore the DevOps 14-day path if your role involves model deployment infrastructure.

Ready to start?

Begin with Day 1 and spend 20 minutes today.
