OpenAI Assistants API English: Threads, Runs, and Vector Stores

Learn the English vocabulary of the OpenAI Assistants API — threads, runs, vector stores, tool calls, and streaming responses explained for IT professionals.

Introduction

The OpenAI Assistants API introduces a stateful layer on top of the core Chat Completions API, bringing its own set of objects and concepts that engineers discuss in design reviews, pull requests, and documentation. If your team is building AI-powered features using Assistants, you will encounter terms like threads, runs, vector stores, and tool calls daily. Understanding how these terms are used in professional English helps you follow technical discussions and write clear code comments.

Assistants, Threads, and Messages

The Assistants API is built around three core objects:

An Assistant is a configured AI entity with instructions, a model, and tools. Engineers say “we create an assistant with a system prompt and the file search tool enabled.” Assistants are persistent — you create them once and reuse them across many conversations.

A Thread is a conversation container. Unlike Chat Completions, where you manage the message history yourself, a Thread stores messages on OpenAI’s servers. Engineers describe this as “the thread maintains conversation history automatically.” Common phrases include: “We create a new thread per user session” and “We append a message to the thread before running the assistant.”

A Message in the Assistants API is an object added to a thread. You will hear “add a user message to the thread” (rather than “send a message”) because the action is about adding to a persisted object, not sending over a socket.
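The three objects can be seen together in a short sketch. It assumes `client` is an `openai.OpenAI()` instance from the official Python SDK; the function name `start_conversation`, the assistant name, and the default model are illustrative choices, not something the API prescribes.

```python
def start_conversation(client, instructions, user_text, model="gpt-4o"):
    """Create an assistant, open a thread, and add the first user message.

    `client` is assumed to be an openai.OpenAI() instance; the model name
    and the assistant name below are placeholders.
    """
    # Assistants are persistent: a real service would create one once and
    # store its ID instead of creating a new one per conversation.
    assistant = client.beta.assistants.create(
        name="Docs helper",  # hypothetical name
        instructions=instructions,
        model=model,
        tools=[{"type": "file_search"}],
    )
    # "We create a new thread per user session."
    thread = client.beta.threads.create()
    # "Add a user message to the thread" (not "send a message").
    client.beta.threads.messages.create(
        thread_id=thread.id,
        role="user",
        content=user_text,
    )
    return assistant.id, thread.id
```

In a code review you might describe this as: "The helper creates the assistant, opens a thread, and appends the first user message; it does not start a run."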

Runs and Run Steps

A Run is the execution of an assistant on a thread — it is what produces a response. This is a key concept and the vocabulary is specific:

  • “Create a run to generate a response” — trigger the assistant to process the thread
  • “The run is in a queued state” — the run has been created but not started yet
  • “The run is in_progress” — the assistant is actively generating
  • “The run requires action” — the assistant has called a tool and is waiting for the result
  • “The run completed” — the assistant finished generating and a new message is in the thread
  • “The run expired” — it took too long and was cancelled automatically

Run steps are the individual actions within a run. “We inspect the run steps to see which tools were called” is a common debugging phrase. Run steps can be of type message_creation (the assistant wrote a response) or tool_calls (the assistant called a function or searched files).
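A small sketch of "inspecting the run steps" might look like the following. It assumes `steps` is an iterable of run step objects such as those returned by `client.beta.threads.runs.steps.list(thread_id=..., run_id=...)`; the helper name `summarize_run_steps` and the label format are hypothetical.

```python
def summarize_run_steps(steps):
    """Return one short label per run step, for example for a debug log."""
    labels = []
    for step in steps:
        if step.type == "tool_calls":
            # A single step can contain several tool calls.
            kinds = [call.type for call in step.step_details.tool_calls]
            labels.append("tool_calls: " + ", ".join(kinds))
        elif step.type == "message_creation":
            labels.append("message_creation")
        else:
            # Defensive branch for step types this sketch does not know.
            labels.append("unknown: " + str(step.type))
    return labels
```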

The phrase “polling the run” describes repeatedly checking whether a run has finished. Many engineers note: “The new streaming API eliminates the need to poll — we receive events as the run progresses.”
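For teams that still poll, the loop can be sketched as below. It assumes `client` is an `openai.OpenAI()` instance; `wait_for_run`, `poll_interval`, and `max_polls` are names chosen for this example, and the one-second default is an arbitrary assumption rather than a recommended value.

```python
import time

# Statuses from which a run never moves on.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired", "incomplete"}


def wait_for_run(client, thread_id, run_id, poll_interval=1.0, max_polls=60):
    """Poll a run until it is terminal or requires action, then return it."""
    for _ in range(max_polls):
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
        if run.status in TERMINAL_STATUSES or run.status == "requires_action":
            return run
        # Still queued or in_progress: wait before checking again.
        time.sleep(poll_interval)
    raise TimeoutError(f"run {run_id} did not settle after {max_polls} polls")
```

The function also returns on requires_action, because at that point the run is paused and further polling will not change its status until tool outputs are submitted.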

Vector Stores and File Search

One of the most powerful Assistant features is file search, which uses vector stores to search uploaded documents semantically. The vocabulary:

  • vector store — a collection of files that have been chunked, embedded, and indexed for semantic search
  • “Attach the vector store to the assistant” — link a vector store so the assistant can search it
  • “The file is being processed” — chunking and embedding are happening asynchronously
  • “We chunk the document into smaller segments” — split a large file for embedding
  • “Retrieve relevant chunks” — find the parts of the document most relevant to the query
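To make the word "chunk" concrete, here is a toy sketch of splitting text into overlapping segments. The hosted vector store performs its own chunking when a file is processed, so this is not how the service works internally; counting words instead of tokens, the function name `chunk_words`, and the default sizes are all simplifying assumptions.

```python
def chunk_words(text, chunk_size=200, overlap=50):
    """Split text into overlapping chunks of at most chunk_size words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks
```

The overlap means neighbouring chunks share some words, so a sentence that straddles a boundary still appears whole in at least one chunk.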

Engineers discuss vector stores in architecture reviews: “We maintain a separate vector store per client organisation so their documents stay isolated.” The word “ingest” is also common: “We ingest the PDF into the vector store when the user uploads it.”

Tool Calls and Function Calling

When an assistant asks to invoke a function you have defined, the request is called a tool call. The assistant does not run the function itself; your code does. The workflow:

  • The run enters requires_action status
  • You read the tool call arguments from the run
  • You execute the function in your own code
  • You “submit tool outputs” back to the run to continue

Engineers say: “We handle the tool call by executing the function locally and submitting the output.” In pull request comments: “Make sure to submit tool outputs before the run expires, otherwise you will need to create a new run.”
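The workflow above can be sketched as follows. It assumes `client` is an `openai.OpenAI()` instance and `run` is a run object whose status is requires_action; `handle_required_action` and the `functions` dictionary (function name mapped to a local Python callable) are hypothetical names for this example.

```python
import json


def handle_required_action(client, thread_id, run, functions):
    """Execute the requested functions locally and submit their outputs."""
    tool_outputs = []
    for call in run.required_action.submit_tool_outputs.tool_calls:
        # The arguments arrive as a JSON-encoded string.
        args = json.loads(call.function.arguments)
        result = functions[call.function.name](**args)
        # Each output is matched to its tool call by ID and sent as a string.
        tool_outputs.append({"tool_call_id": call.id, "output": json.dumps(result)})
    # Submitting the tool outputs resumes the paused run.
    return client.beta.threads.runs.submit_tool_outputs(
        thread_id=thread_id,
        run_id=run.id,
        tool_outputs=tool_outputs,
    )
```

A caller might pass `functions={"get_weather": get_weather}`, where `get_weather` is whatever local function matches the tool definition on the assistant. The sketch does no error handling; a production version would need to decide what to submit when a function raises or the name is unknown.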

Key Vocabulary

Term                   Definition
Assistant              A persistent AI entity with instructions, model, and tools configured
Thread                 A persistent conversation container that stores messages server-side
Run                    An execution of an assistant on a thread to produce a response
Run step               An individual action within a run, such as a tool call or message creation
requires_action        Run status indicating the assistant is waiting for tool call results
vector store           An indexed collection of file chunks for semantic search
tool call              When an assistant invokes a function defined in the API configuration
submit tool outputs    Providing function call results back to a paused run
polling                Repeatedly checking run status until it reaches a terminal state
ingest                 Upload and process a file into a vector store

Practice Tips

  1. Read the OpenAI Assistants API reference in English. Pay attention to the state machine for runs: the statuses queued, in_progress, requires_action, completed, failed, and expired describe a flow you need to handle in code, and understanding their English meanings helps you write better error handling.

  2. Write state machine comments in English. When handling run statuses, add comments: “If the run requires action, extract the tool call arguments and execute the function locally before submitting outputs.”

  3. Practise the phrase ‘submit tool outputs.’ This is specific to the Assistants API and non-obvious. Using the exact term in your code comments and documentation (“we submit tool outputs to resume the run”) shows familiarity with the API.

  4. Discuss trade-offs between Assistants and Chat Completions in English. A common design question: “Do we use Assistants for persistent conversation history, or manage history ourselves with Chat Completions?” Practise articulating the trade-off: cost, latency, control, and complexity.
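Tips 1 and 2 can be practised together by writing a small status handler with English comments. The sketch below is only an illustration: `next_action` and the action labels it returns are invented for this example and are not part of the API.

```python
def next_action(status):
    """Map a run status to what the calling code should do next."""
    if status in ("queued", "in_progress"):
        # The run has not finished yet: wait and check again.
        return "wait"
    if status == "requires_action":
        # If the run requires action, extract the tool call arguments and
        # execute the function locally before submitting outputs.
        return "submit_tool_outputs"
    if status == "completed":
        # The assistant's reply is now a new message in the thread.
        return "read_messages"
    # failed, expired, or anything unexpected: report it and decide
    # whether to create a new run.
    return "handle_error"
```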

Conclusion

The OpenAI Assistants API has a precise object model — Assistant, Thread, Run, Run Step, Vector Store — and understanding the English vocabulary around these objects is essential for working productively with the API. Whether you are writing documentation, reviewing a colleague’s code, or explaining the architecture to a stakeholder, using these terms correctly builds credibility and prevents misunderstandings.