Master Elasticsearch's modern search vocabulary — from dense vectors and kNN to ELSER sparse retrieval and hybrid search.
1 / 5
A PR adds a dense_vector field to an Elasticsearch mapping. In a review, a colleague asks what this field type stores. The correct answer is:
dense_vector stores fixed-dimension float arrays (embeddings from a model like text-embedding-ada-002). Combined with index: true and a similarity metric, it enables kNN search.
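A minimal sketch of such a mapping, written as a Python dict that could serve as the body of a create-index request. The field name `embedding`, the helper `dense_vector_mapping`, and the choice of `cosine` are assumptions for illustration; 1536 is the output size of text-embedding-ada-002.

```python
def dense_vector_mapping(field: str = "embedding",
                         dims: int = 1536,
                         similarity: str = "cosine") -> dict:
    """Build an index mapping body with one indexed dense_vector field.

    Hypothetical helper: only the resulting dict shape follows the
    Elasticsearch mapping format; the function itself is illustrative.
    """
    allowed = {"cosine", "dot_product", "l2_norm", "max_inner_product"}
    if similarity not in allowed:
        raise ValueError(f"unsupported similarity: {similarity!r}")
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "dense_vector",
                    "dims": dims,            # fixed dimension of every vector
                    "index": True,           # build the structure used for kNN search
                    "similarity": similarity,
                }
            }
        }
    }


mapping = dense_vector_mapping()
```

Every document indexed into this field would then need a float array of exactly `dims` values.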
2 / 5
During a design review, the team debates kNN search vs hybrid search. The case for hybrid search is:
Hybrid search blends kNN vector similarity with BM25 (or other lexical) scoring, improving recall for queries where both semantic meaning and exact keywords matter.
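One common way to blend the two result lists is reciprocal rank fusion (RRF), which combines ranks rather than raw scores. The card above does not say which blending method is used, so treat RRF here as an assumption; the document IDs and the `rrf_fuse` helper are hypothetical, and this pure-Python sketch only mirrors the idea rather than Elasticsearch's own implementation.

```python
def rrf_fuse(rankings: list[list[str]],
             rank_constant: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs with reciprocal rank fusion.

    Each document earns 1 / (rank_constant + rank) from every list it
    appears in; the contributions are summed and sorted descending.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    # Sort by fused score (highest first), breaking ties by ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


bm25_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical lexical ranking
knn_hits = ["doc_c", "doc_a", "doc_d"]    # hypothetical vector ranking
fused = rrf_fuse([bm25_hits, knn_hits])
```

Documents that rank well in both lists (`doc_a`, `doc_c`) rise above those found by only one retriever, which is the recall benefit the hybrid approach is after.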
3 / 5
A teammate enables ELSER (Elastic Learned Sparse EncodeR) in the pipeline. In a standup, you're asked to explain what ELSER produces. The correct answer is:
ELSER produces sparse token-weight vectors. It expands both documents and queries with weighted, semantically related terms, enabling semantic retrieval on top of Elasticsearch's existing inverted-index infrastructure.
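To make "sparse token-weight vectors" concrete, here is a toy scoring sketch. The tokens and weights are invented for illustration and are not real ELSER output; `sparse_score` is a hypothetical helper that scores a document as the dot product over the tokens it shares with the query.

```python
def sparse_score(query_weights: dict[str, float],
                 doc_weights: dict[str, float]) -> float:
    """Dot product of two sparse token-weight vectors (shared tokens only)."""
    return sum(weight * doc_weights[token]
               for token, weight in query_weights.items()
               if token in doc_weights)


# Invented expansion of the query text "laptop": related terms get weights too.
query = {"laptop": 1.8, "notebook": 1.1, "computer": 0.9}

# Invented document expansions.
doc_1 = {"notebook": 1.4, "computer": 1.0, "battery": 0.6}
doc_2 = {"recipe": 1.7, "notebook": 0.2}

score_1 = sparse_score(query, doc_1)
score_2 = sparse_score(query, doc_2)
```

Note that `doc_1` never contains the literal token "laptop" yet still scores well through the expanded terms, which is how a sparse model can match on meaning while staying compatible with an inverted index.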
4 / 5
In a PR review, a mapping includes "type": "semantic_text". A senior engineer asks what this field type does automatically. The answer is:
semantic_text (introduced in ES 8.15) is a convenience type that automatically chunks the text and generates embeddings via an inference endpoint at index time, storing them alongside the field for semantic or hybrid search.
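A minimal sketch of what such a mapping and a matching query could look like, expressed as Python dicts. The field name `content`, the endpoint ID `my-elser-endpoint`, and both helper functions are hypothetical; the inference endpoint is assumed to have been created beforehand.

```python
def semantic_text_mapping(field: str = "content",
                          inference_id: str = "my-elser-endpoint") -> dict:
    """Mapping body with one semantic_text field.

    `inference_id` names an inference endpoint (hypothetical here) that
    Elasticsearch calls at index time to produce the embeddings.
    """
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "semantic_text",
                    "inference_id": inference_id,
                }
            }
        }
    }


def semantic_query(field: str, text: str) -> dict:
    """Search body using the `semantic` query against a semantic_text field."""
    return {"query": {"semantic": {"field": field, "query": text}}}


mapping = semantic_text_mapping()
search_body = semantic_query("content", "how do I reset my password")
```

The point of the convenience type is visible in what is absent: there is no ingest pipeline and no explicit vector field in the mapping.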
5 / 5
Your search latency spikes during peak traffic. A colleague proposes reducing the kNN search num_candidates parameter. In a code review, you explain the trade-off:
num_candidates controls how many candidates each shard considers during HNSW traversal. Reducing it lowers latency but also lowers recall, because fewer candidates are examined before the top k are selected: a classic speed-vs-quality trade-off.
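A small sketch of where the parameter sits in a kNN search body, with a guard for the bounds as I understand them (`num_candidates` no smaller than `k` and no larger than 10,000; worth confirming against the documentation for your version). The helper `knn_search_body`, the field name `embedding`, and the three-value query vector are hypothetical.

```python
def knn_search_body(field: str,
                    query_vector: list[float],
                    k: int = 10,
                    num_candidates: int = 100) -> dict:
    """Build a top-level kNN search body.

    num_candidates is applied per shard: each shard gathers that many
    approximate neighbours, then the best k overall are returned.
    """
    # Assumed limits: k <= num_candidates <= 10_000.
    if not (1 <= k <= num_candidates <= 10_000):
        raise ValueError("require 1 <= k <= num_candidates <= 10000")
    return {
        "knn": {
            "field": field,
            "query_vector": query_vector,
            "k": k,
            "num_candidates": num_candidates,
        }
    }


vector = [0.12, -0.03, 0.88]  # placeholder; a real vector has the mapped dims
accurate = knn_search_body("embedding", vector, k=10, num_candidates=500)
fast = knn_search_body("embedding", vector, k=10, num_candidates=50)
```

Both bodies return ten hits; the second does less work per shard and is therefore more likely to miss some true nearest neighbours. Measuring recall on a sample of real queries before and after the change is a reasonable way to decide how far the value can drop.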