The Gemini API provides access to Google's most capable multimodal models. Master generateContent, streaming, multimodal inputs (text/image/video), system instructions, function calling, and Google Search grounding for production AI application development.
1 / 5
A developer calls model.generateContent([{text: "Describe this image"}, {inlineData: {mimeType: "image/jpeg", data: base64Image}}]). What capability does this demonstrate?
This demonstrates multimodal input in the Gemini API. A single request can combine parts of different types: text, inlineData (base64-encoded images, audio, or video), and fileData (references to files uploaded through the Files API). Gemini models reason jointly across all modalities in a single inference pass.
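As a minimal sketch of the part shapes described above, the following TypeScript builds the array from the question without calling the API. The type names and the buildImagePrompt helper are hypothetical and introduced only for illustration; the field names (text, inlineData, fileData, mimeType, data, fileUri) are assumed to match the public API.

```typescript
// Part shapes, using the field names from the explanation above.
// These type aliases are illustrative, not imported from an SDK.
type TextPart = { text: string };
type InlineDataPart = { inlineData: { mimeType: string; data: string } };
type FileDataPart = { fileData: { mimeType: string; fileUri: string } };
type Part = TextPart | InlineDataPart | FileDataPart;

// Hypothetical helper: pairs a text prompt with one base64-encoded image.
function buildImagePrompt(
  prompt: string,
  base64Image: string,
  mimeType: string = "image/jpeg",
): Part[] {
  return [{ text: prompt }, { inlineData: { mimeType, data: base64Image } }];
}

// "aGVsbG8=" is placeholder base64 data, not a real image.
const parts: Part[] = buildImagePrompt("Describe this image", "aGVsbG8=");
```

With an SDK model object in scope, this array would be passed as model.generateContent(parts), as in the question.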
2 / 5
A Gemini API request includes systemInstruction: {parts: [{text: "You are a SQL expert. Always explain your reasoning."}]}. What does systemInstruction configure?
systemInstruction in the Gemini API sets a system-level prompt that defines the model's role, persona, and behavioral guidelines. Unlike user turns, the system instruction is supplied separately from the conversation contents and applies consistently throughout the conversation, similar to the system parameter in other LLM APIs.
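To make the separation between system instruction and user turns concrete, here is a small sketch that assembles a request body. The Content and GenerateContentRequest types and the withSystemInstruction helper are hypothetical; only the systemInstruction, contents, role, and parts field names are taken from the API as described above.

```typescript
// Illustrative request shapes; not imported from an SDK.
type Content = { role?: "user" | "model"; parts: { text: string }[] };
type GenerateContentRequest = {
  systemInstruction?: Content;
  contents: Content[];
};

// Hypothetical helper: the instruction lives outside `contents`,
// so it is not repeated or mixed into the user turns.
function withSystemInstruction(
  instruction: string,
  userTurns: string[],
): GenerateContentRequest {
  return {
    systemInstruction: { parts: [{ text: instruction }] },
    contents: userTurns.map((text) => ({
      role: "user" as const,
      parts: [{ text }],
    })),
  };
}

const sqlRequest = withSystemInstruction(
  "You are a SQL expert. Always explain your reasoning.",
  ["Why is my JOIN returning duplicate rows?"],
);
```

The user question here is made up for the example; the instruction text is the one from the quiz question.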
3 / 5
A Gemini API request uses streamGenerateContent(). What does this method return?
streamGenerateContent() returns a stream of partial GenerateContentResponse objects, which JavaScript clients typically consume as an async iterable. Each chunk carries the newly generated text in candidates[0].content.parts[0].text. Clients iterate with for await (const chunk of stream) and display the text progressively, which lowers perceived latency.
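The iteration pattern can be sketched without a network call by substituting a fake stream. Everything named here (StreamChunk, chunkText, collectStream, fakeStream) is hypothetical; the only assumption about the real API is the candidates[0].content.parts[].text path mentioned above.

```typescript
// Minimal chunk shape; real responses carry more fields.
type StreamChunk = {
  candidates?: { content?: { parts?: { text?: string }[] } }[];
};

// Pulls the text out of one chunk, tolerating empty or missing fields.
function chunkText(chunk: StreamChunk): string {
  const parts = chunk.candidates?.[0]?.content?.parts ?? [];
  return parts.map((p) => p.text ?? "").join("");
}

// Consumes any async iterable of chunks, reporting text as it arrives
// and returning the concatenated result.
async function collectStream(
  stream: AsyncIterable<StreamChunk>,
  onText: (text: string) => void,
): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    const text = chunkText(chunk);
    if (text) {
      onText(text);
      full += text;
    }
  }
  return full;
}

// Stand-in for a real streaming response, used only for this example.
async function* fakeStream(): AsyncGenerator<StreamChunk> {
  yield { candidates: [{ content: { parts: [{ text: "Hello" }] } }] };
  yield { candidates: [{ content: { parts: [{ text: ", world" }] } }] };
  yield { candidates: [] }; // a chunk with no text is skipped
}
```

In a UI, the onText callback would append each piece to the display instead of waiting for the full response.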
4 / 5
A Gemini API call uses tools: [{functionDeclarations: [{name: "search_web", description: "...", parameters: {...}}]}]. When Gemini decides to call this function, what does the API response contain?
When Gemini decides to use a declared function, the response contains a functionCall part specifying the function name and its arguments as a JSON object. The developer executes the actual function, then sends a follow-up request with a functionResponse part containing the result. Gemini then generates a final answer grounded in the real function output.
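The round trip described above can be sketched as a local dispatch step: find the functionCall part, run a matching handler, and wrap the result in a functionResponse part. The types, the answerFunctionCall helper, and the stubbed search_web handler are all hypothetical; the part field names (functionCall, functionResponse, name, args, response) are assumed to match the API.

```typescript
// Illustrative shapes for the parts involved in function calling.
type FunctionCall = { name: string; args: Record<string, unknown> };
type ResponsePart = { text?: string; functionCall?: FunctionCall };
type FunctionResponsePart = {
  functionResponse: { name: string; response: Record<string, unknown> };
};
type Handler = (args: Record<string, unknown>) => Record<string, unknown>;

// Returns the part to send back in the follow-up request,
// or null when the model answered with plain text instead.
function answerFunctionCall(
  parts: ResponsePart[],
  handlers: Record<string, Handler>,
): FunctionResponsePart | null {
  const call = parts.find((p) => p.functionCall !== undefined)?.functionCall;
  if (!call) return null;
  const handler = handlers[call.name];
  if (!handler) throw new Error(`No handler registered for ${call.name}`);
  return {
    functionResponse: { name: call.name, response: handler(call.args) },
  };
}

// Stub handler: a real implementation would perform the search.
const demoHandlers: Record<string, Handler> = {
  search_web: (args) => ({ query: args.query, results: [] }),
};

const demoReply = answerFunctionCall(
  [{ functionCall: { name: "search_web", args: { query: "gemini api" } } }],
  demoHandlers,
);
```

The model never runs search_web itself; it only names the function and arguments, and the application decides whether and how to execute it.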
5 / 5
A Gemini API request includes tools: [{googleSearch: {}}] (or the older google_search_retrieval tool). What does this grounding tool enable?
Grounding with Google Search connects Gemini to real-time web search during inference. When enabled, the model can retrieve current information from the web and cite sources in its response. The API response includes groundingMetadata listing the search queries that were issued and the sources that were cited, so answers can be checked against up-to-date facts.
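As a rough sketch of how a client might read groundingMetadata, the helper below collects the queries and cited sources from a sample object. The summarizeGrounding function and the sample values are hypothetical, and the field names webSearchQueries and groundingChunks[].web.{uri,title} are assumptions based on the public response format; they should be checked against the current API reference.

```typescript
// Assumed subset of the groundingMetadata shape.
type GroundingMetadata = {
  webSearchQueries?: string[];
  groundingChunks?: { web?: { uri?: string; title?: string } }[];
};

type Source = { uri: string; title: string };

// Hypothetical helper: flattens the metadata into queries and sources,
// skipping chunks that have no web URI.
function summarizeGrounding(meta: GroundingMetadata | undefined): {
  queries: string[];
  sources: Source[];
} {
  const queries = meta?.webSearchQueries ?? [];
  const sources = (meta?.groundingChunks ?? [])
    .map((chunk) => chunk.web)
    .filter(
      (web): web is { uri: string; title?: string } =>
        typeof web?.uri === "string",
    )
    .map((web) => ({ uri: web.uri, title: web.title ?? web.uri }));
  return { queries, sources };
}

// Made-up sample data for illustration only.
const groundingSummary = summarizeGrounding({
  webSearchQueries: ["latest gemini model"],
  groundingChunks: [
    { web: { uri: "https://example.com/a", title: "Example A" } },
    { web: { uri: "https://example.com/b" } },
    {},
  ],
});
```

A response without grounding simply yields empty lists, so the same code path works whether or not the tool was used.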