Recommendation Systems
Collaborative filtering, content-based filtering, matrix factorisation, cold start, evaluation metrics (NDCG, MAP), candidate generation, and ranking pipeline vocabulary.
- Collaborative Filtering /kəˈlæbərətɪv ˈfɪltərɪŋ/
A recommendation approach that predicts a user’s preferences from the preferences of similar users (user-user CF) or from similarities between items the user has engaged with (item-item CF). Requires only interaction data, not content features.
"Our ‘users like you also bought’ feature uses user-based collaborative filtering: find the 50 most similar users (by cosine similarity of purchase vectors), aggregate their purchases weighted by similarity, then filter out items the target user has already bought. It works well for popular items but struggles with new users and new items (the cold start problem)."
- Matrix Factorisation /ˈmeɪtrɪks ˌfæktəraɪˈzeɪʃən/
A technique for latent factor models: approximate the user-item interaction matrix (ratings, clicks) as the product of two low-rank matrices, one holding user embeddings and one holding item embeddings. The dot product of a user’s embedding and an item’s embedding predicts the rating or interaction strength. Alternating least squares (ALS) and stochastic gradient descent (SGD) are common training algorithms.
"We train a matrix factorisation model on implicit feedback (clicks, dwell time). Each user and item gets a 64-dimensional embedding. To recommend for a user, compute the dot product of their embedding with every item’s embedding, rank, filter already-seen items, return top 10. The embeddings capture latent preferences: ‘users who like distributed systems books also like SRE content.’"
- Cold Start Problem /kəʊld stɑːt ˈprɒbləm/
The difficulty of making recommendations for new users (no interaction history) or new items (no interaction data). Without history, collaborative filtering cannot find similar users or items, so it requires fallback strategies: popularity-based, content-based, or exploration.
"New users see a ‘quick taste quiz’ at signup to bootstrap their profile, which addresses the new-user cold start. New articles use a content-based fallback: until an article accumulates 100 interactions, it is recommended to users who have read similar articles (measured by text-embedding similarity). After 100 interactions, the CF model takes over."
- NDCG (Normalised Discounted Cumulative Gain) /ɛn diː siː dʒiː/
A ranking quality metric. DCG rewards placing relevant items high in the list: each item’s relevance is divided by a logarithmic function of its rank, typically log2(rank + 1), and the results are summed. NDCG normalises DCG by the DCG of the ideal ordering (IDCG). Scores range from 0 to 1; higher is better. Captures both relevance and position.
"We measure recommendation quality with NDCG@10: only the top 10 recommendations are scored. A relevant item at position 1 contributes more than the same item at position 5. Our baseline NDCG@10 was 0.43; after the new ranking model it’s 0.51, a relative improvement of about 19%. NDCG is our primary offline evaluation metric before A/B testing."
- Candidate Generation + Ranking Pipeline /kænˈdɪdət ˌdʒɛnəˈreɪʃən ræŋkɪŋ ˈpaɪplɪn/
A two-stage recommendation architecture. Stage 1 (candidate generation): fast retrieval of hundreds of candidates from millions of items using cheap methods such as approximate nearest neighbour (ANN) search, collaborative filtering, or content similarity. Stage 2 (ranking): score the candidates with a more complex model that considers user context, item features, and business rules.
"Our recommendation pipeline: candidate generation (ANN retrieval with the user embedding, plus popularity-based candidates) produces 500 candidates in 10ms. The ranking model (gradient-boosted trees on 50 features) scores all 500 candidates in 50ms. Post-processing applies business rules (no out-of-stock items, diversity constraints). Total: under 100ms."
- Filter Bubble / Serendipity /ˈfɪltər ˈbʊbəl/
Filter bubble: the tendency of recommendation systems to show users increasingly narrow content aligned with their existing preferences, reducing exposure to diverse viewpoints. Serendipity: a measure of how surprising yet still relevant and enjoyable recommendations are. Deliberately injecting diverse items to raise serendipity is a common way to combat filter bubbles.
"Our listening analysis showed users in a filter bubble: they were recommended only one genre after 3 months. We added a serendipity injection: 10% of recommendations must be from genres the user hasn’t seen in the last 30 days. Short-term click-through dropped 2%, but 30-day retention improved 8% — users were less likely to feel the platform had nothing new."