Cloudflare Workers AI brings machine learning inference directly to the edge. With a simple binding and the run() method, you can generate text, create embeddings, and classify images without managing GPU infrastructure. Test your command of the key vocabulary.
1 / 5
How do you access the Workers AI binding in a Cloudflare Worker?
The Workers AI binding is exposed on the env object as env.AI. You configure it in wrangler.toml with an [ai] section that sets binding = "AI", and Cloudflare injects the binding at runtime so your Worker can call models without managing API keys or making HTTP requests by hand.
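As a rough illustration of how the configured name shows up on env, the sketch below repeats the wrangler.toml fragment in a comment and reads the binding from a typed env object. The AiBinding interface is a minimal structural stand-in written for this example (a real project would more likely use the types shipped for Workers), and requireAi is a hypothetical helper, not part of any Cloudflare API.

```typescript
// wrangler.toml fragment described above:
//   [ai]
//   binding = "AI"

// Minimal structural stand-in for the binding (assumption for this sketch).
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

// The name chosen in wrangler.toml becomes the property name on env.
// AI is optional here only so the guard below has something to check.
interface Env {
  AI?: AiBinding;
}

// Hypothetical helper: return the binding, or fail with a readable message
// if the name in wrangler.toml and the name used in code have drifted apart.
function requireAi(env: Env): AiBinding {
  if (!env.AI) {
    throw new Error('Workers AI binding "AI" is missing; check the [ai] section in wrangler.toml');
  }
  return env.AI;
}
```

Inside a Worker's fetch handler, this would be called as requireAi(env) before any model call.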
2 / 5
What method is called on the AI binding to run an inference in Cloudflare Workers AI?
env.AI.run(modelName, inputs) is the primary method for running inference. For example, await env.AI.run('@cf/meta/llama-3-8b-instruct', { messages: [...] }) runs the specified model with the provided inputs and returns a Promise that resolves to the model's response.
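A minimal sketch of such a call might look like the following. The askModel function name, the system prompt, and the AiBinding interface are illustrative assumptions; the sketch also assumes that text generation models resolve to an object with a response string, and it throws if that shape is not what comes back.

```typescript
// Minimal structural stand-in for the binding (assumption for this sketch).
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

interface Env {
  AI: AiBinding;
}

type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Hypothetical helper: send one question to the model and return its text.
async function askModel(env: Env, question: string): Promise<string> {
  const messages: ChatMessage[] = [
    { role: "system", content: "You are a concise assistant." },
    { role: "user", content: question },
  ];

  const result = await env.AI.run("@cf/meta/llama-3-8b-instruct", { messages });

  // Assumed result shape for text generation: { response: string }.
  const text = (result as { response?: unknown }).response;
  if (typeof text !== "string") {
    throw new Error("Unexpected response shape from the model");
  }
  return text;
}
```

In a Worker, the fetch handler would typically await askModel(env, "...") and wrap the returned string in a Response.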
3 / 5
Which Workers AI model category would you use to generate vector embeddings for semantic search?
Text embedding models like @cf/baai/bge-small-en-v1.5 convert text into dense vector representations. These embeddings are used with Cloudflare Vectorize (or another vector store) to enable semantic search, RAG (retrieval-augmented generation), and similarity ranking without a separate embedding service.
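To make the idea concrete, here is a small sketch that embeds a batch of strings and compares two vectors by cosine similarity. It assumes the embedding model accepts { text: string[] } and resolves to an object whose data field holds one vector per input; embedTexts, cosineSimilarity, and the interfaces are names introduced for this example only.

```typescript
// Minimal structural stand-in for the binding (assumption for this sketch).
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

interface Env {
  AI: AiBinding;
}

// Assumed result shape for text embedding models: one vector per input string.
interface EmbeddingResult {
  shape: number[];
  data: number[][];
}

// Hypothetical helper: embed several strings in one call.
async function embedTexts(env: Env, texts: string[]): Promise<number[][]> {
  const result = (await env.AI.run("@cf/baai/bge-small-en-v1.5", {
    text: texts,
  })) as EmbeddingResult;
  return result.data;
}

// Cosine similarity between two vectors of equal length.
// Values near 1 indicate the texts point in a similar semantic direction.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

A vector store performs this kind of comparison at scale; computing it by hand is mainly useful for small sets or for sanity checks.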
4 / 5
What does streaming a Workers AI response enable, and how is it requested?
Streaming is requested by passing stream: true in the inputs to run(). The binding then returns a ReadableStream of server-sent events, so you can forward tokens to the client as they are generated instead of waiting for the full completion. This reduces the time the client waits for the first token, which matters most for long responses.
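One way to forward the stream is sketched below: the ReadableStream returned by run() is passed straight through as the body of a Response with a text/event-stream content type. The streamAnswer name and the AiBinding interface are assumptions made for this example, and the cast to ReadableStream reflects the behavior described above rather than a checked type.

```typescript
// Minimal structural stand-in for the binding (assumption for this sketch).
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

interface Env {
  AI: AiBinding;
}

// Hypothetical helper: request a streamed completion and hand the stream
// to the client without buffering it in the Worker.
async function streamAnswer(env: Env, question: string): Promise<Response> {
  const stream = (await env.AI.run("@cf/meta/llama-3-8b-instruct", {
    messages: [{ role: "user", content: question }],
    stream: true, // ask for server-sent events instead of a single object
  })) as ReadableStream<Uint8Array>;

  return new Response(stream, {
    headers: { "content-type": "text/event-stream" },
  });
}
```

On the client side, the body can be read with an EventSource-style parser or by reading the response stream chunk by chunk.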
5 / 5
Which Cloudflare product pairs with Workers AI embeddings to build a full vector search pipeline?
Cloudflare Vectorize is the vector database product that stores and queries embeddings generated by Workers AI. A typical RAG pipeline generates embeddings with env.AI.run('@cf/baai/bge-small-en-v1.5', ...) and then upserts or queries them via env.VECTORIZE.upsert() / env.VECTORIZE.query().
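The pipeline described here could be sketched roughly as follows. The VECTORIZE binding name comes from the text above; the interfaces are minimal structural stand-ins, and embed, indexDocuments, and search are hypothetical helper names. The sketch assumes the embedding call resolves to { data: number[][] } and that query() resolves to an object with a matches array.

```typescript
// Minimal structural stand-ins for the two bindings (assumptions).
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

interface VectorRecord {
  id: string;
  values: number[];
  metadata?: Record<string, string>;
}

interface VectorMatch {
  id: string;
  score: number;
}

interface VectorIndex {
  upsert(vectors: VectorRecord[]): Promise<unknown>;
  query(vector: number[], options?: { topK?: number }): Promise<{ matches: VectorMatch[] }>;
}

interface Env {
  AI: AiBinding;
  VECTORIZE: VectorIndex;
}

// Embed a batch of strings; assumed result shape is { data: number[][] }.
async function embed(env: Env, texts: string[]): Promise<number[][]> {
  const result = (await env.AI.run("@cf/baai/bge-small-en-v1.5", { text: texts })) as {
    data: number[][];
  };
  return result.data;
}

// Indexing step: embed each document and upsert it under its id.
async function indexDocuments(env: Env, docs: { id: string; text: string }[]): Promise<void> {
  const vectors = await embed(env, docs.map((d) => d.text));
  const records = docs.map((d, i) => ({
    id: d.id,
    values: vectors[i],
    metadata: { text: d.text },
  }));
  await env.VECTORIZE.upsert(records);
}

// Query step: embed the question and ask the index for the nearest vectors.
async function search(env: Env, question: string, topK = 3): Promise<VectorMatch[]> {
  const [queryVector] = await embed(env, [question]);
  const { matches } = await env.VECTORIZE.query(queryVector, { topK });
  return matches;
}
```

In a RAG setup, the text behind the returned matches would then be placed into the prompt of a text generation call.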