English for Milvus Vector Database

Learn the English vocabulary for Milvus, the open-source vector database: collections, indexes, ANN search, and consistency levels.

Milvus conversations mix database vocabulary (collections, partitions, consistency) with search-specific terms (index type, recall, nprobe), and mixing them up — calling an index type a “search mode” or a partition a “shard” — makes it harder for teammates to follow a performance discussion precisely.

Key Vocabulary

Collection — the top-level container in Milvus that holds vectors and their associated scalar fields, roughly analogous to a table in a relational database. “We keep product embeddings and support-ticket embeddings in separate collections since they’re queried independently and have different schemas.”

Index type (HNSW, IVF_FLAT, DiskANN) — the algorithm used to organize vectors for approximate nearest-neighbor search, chosen based on the trade-off between query latency, recall, and memory footprint. “We switched from IVF_FLAT to HNSW because query latency at our scale mattered more than the extra memory it uses.”

Recall — the fraction of true nearest neighbors that an approximate search actually returns, used to measure how much accuracy is being traded for speed. “Recall dropped to 85% after we lowered ef for faster queries — we need to find the right point on that latency-recall curve.”

Partition — a logical subdivision within a collection that lets you scope a search to a subset of the data, improving both organization and query performance. “We partition by tenant ID so a search for one customer’s data never has to scan vectors belonging to another.”

Consistency level — a setting that controls how up-to-date a query’s view of recently written data must be, ranging from strong consistency to eventual, bounded, or session-level guarantees. “We’re using bounded consistency for the search endpoint — a few seconds of staleness is fine, and it’s much faster than requiring strong consistency on every query.”

Common Phrases

  • “Is this collection indexed with HNSW, or are we still doing a brute-force flat scan?”
  • “What recall are we seeing at the current ef setting — is it worth trading some latency to get it higher?”
  • “Should this data live in its own partition, or is a metadata filter on the existing collection enough?”
  • “Are we running under strong consistency here, or can this query tolerate some staleness?”
  • “Is the index built and loaded, or are we still inserting into an unindexed collection?”

Example Sentences

Reviewing a search latency issue: “The p99 latency spike traces back to an unindexed partition — the new data was inserted but the index build hadn’t finished before we started querying it.”

Explaining an indexing decision: “We chose IVF_FLAT over HNSW for this collection because build time mattered more than query speed, and the dataset is small enough that the difference is negligible.”

Describing a consistency trade-off in a design review: “Session consistency is enough here — a user always sees their own writes immediately, and other users’ slight staleness doesn’t affect correctness for this feature.”

Professional Tips

  • Name the specific index type when discussing performance — “search is slow” is far less actionable than “we’re on IVF_FLAT with a high nprobe, which trades speed for recall.”
  • Distinguish recall from raw accuracy explicitly — a vector search “returning wrong results” is often actually returning correct approximate neighbors, just not the exact top-k.
  • Use partition precisely, not interchangeably with “collection” or “shard” — conflating them in a design doc causes real confusion about how data is actually organized.
  • State the consistency level a feature requires up front — deciding this late, after the query pattern is built, often forces a rewrite.

Practice Exercise

  1. Explain the trade-off between HNSW and IVF_FLAT in one sentence.
  2. Describe what recall measures and why 100% recall usually isn’t the goal.
  3. Write a sentence explaining when you’d use a partition versus a separate collection.