English for Elasticsearch Developers
Learn the English vocabulary for Elasticsearch: indices, shards, and relevance scoring, explained for discussing search infrastructure clearly.
Search bugs are often really relevance bugs — “the search doesn’t work” usually means the right documents came back in the wrong order, not that nothing came back at all — and Elasticsearch’s vocabulary around indices, shards, and scoring lets you say precisely what’s wrong.
Key Vocabulary
Index — the Elasticsearch equivalent of a database table, a named collection of documents with a shared mapping, that queries and aggregations target directly. “We’re reindexing the products index tonight to pick up the new mapping, which means search will briefly hit the old index until the swap completes.”
Shard — a horizontal partition of an index’s data, distributed across nodes to enable scaling beyond a single machine’s capacity and to parallelize query execution. “The index has five primary shards, so a single search query actually fans out to five separate shard-level searches before the coordinating node merges the results.”
Mapping — the schema that defines each field’s data type and how it’s indexed (text, keyword, date, etc.), determining what kinds of queries and aggregations are possible on that field.
“The status field was mapped as text instead of keyword, so exact-match filtering wasn’t working the way we expected — it was being analyzed into tokens instead of matched literally.”
Relevance score — the numeric value (_score) Elasticsearch assigns to each matching document, representing how well it matches the query, which determines default result ordering.
“Both documents technically matched the query, but the relevance score ranked the one with the term in the title much higher than the one with it buried in a long description field.”
Analyzer — the pipeline that transforms text into searchable tokens during indexing (lowercasing, stemming, splitting on whitespace), directly shaping what a search term will and won’t match. “The default analyzer was stemming ‘running’ down to ‘run’, which is why a search for ‘running shoes’ was also matching documents that only contained the word ‘run’.”
Common Phrases
- “Is this a relevance issue, or are the documents genuinely missing from the index?”
- “How many shards is this index split into, and is that still the right number at our current data size?”
- “Is that field mapped as keyword or text — do we need exact match or full-text search?”
- “Which analyzer is running on this field at index time?”
- “Are we reindexing, or can this mapping change happen in place?”
Example Sentences
Diagnosing a search-ranking complaint: “Users are saying search feels ‘wrong,’ but the documents they expect are actually there — it’s a relevance scoring issue. The title field isn’t boosted, so exact title matches aren’t ranking above documents that just happen to mention the term once.”
Explaining a mapping bug:
“Filtering by exact product code was failing intermittently because the field was mapped as text and getting tokenized by the analyzer — switching it to keyword fixed the exact-match filtering.”
Describing a scaling decision: “We bumped the index from three shards to eight ahead of the traffic spike, since each shard search runs in parallel and the old shard count was becoming a bottleneck at our current document volume.”
Professional Tips
- Separate relevance problems from missing data problems explicitly when triaging a search complaint — “results are wrong” almost always means one or the other, and they have completely different fixes.
- State whether a field’s mapping is
keywordortextwhen debugging unexpected filter or search behavior — this single distinction explains a large share of “exact match isn’t working” bugs. - Reference the analyzer by name when text matching behaves unexpectedly — stemming and tokenization decisions happen there, invisibly, at index time.
- Mention shard count when discussing scaling or performance — too few shards under-parallelizes large indices, and too many adds coordination overhead on small ones.
Practice Exercise
- Write a sentence distinguishing a relevance issue from a missing-data issue.
- Explain the difference between a
keywordandtextmapping. - Describe what an analyzer does to text during indexing.