Gemini's multimodal API accepts images, audio, video, and documents alongside text. These exercises cover Part construction, File API upload and state management, video temporal understanding, part ordering, and inline_data vs. file_data patterns.
1 / 5
A developer sends an image to Gemini using the Python SDK. Which object wraps the raw image bytes for inclusion in a contents list?
In the Google GenAI Python SDK, types.Part.from_bytes(data=image_bytes, mime_type='image/jpeg') wraps binary media for multimodal requests. The older google-generativeai package expressed the same thing as glm.Part(inline_data=glm.Blob(...)). A Part can also reference a file by URI via types.Part.from_uri(), for example a file stored in Google Cloud Storage.
2 / 5
When using Gemini's File API to upload a video, what must a developer check before sending the file URI in a generation request?
After uploading via the File API, videos undergo server-side processing. The file stays in the PROCESSING state until it becomes ACTIVE. Sending a generation request with a file URI that is still PROCESSING returns an error, so developers must poll client.files.get(name=...) until the state is ACTIVE, and stop if it becomes FAILED.
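The polling step can be sketched as a small helper. This is a minimal illustration, not the SDK's own utility: wait_until_active, its get_file callable, and the interval and timeout defaults are hypothetical names and values chosen for the example. It only assumes that the returned object has a state attribute whose name (or string value) is PROCESSING, ACTIVE, or FAILED.

```python
import time
from typing import Any, Callable


def wait_until_active(
    get_file: Callable[[], Any],
    poll_interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call get_file() repeatedly until the file's state is ACTIVE.

    get_file is a hypothetical zero-argument callable supplied by the caller,
    e.g. lambda: client.files.get(name=uploaded.name).
    """
    deadline = time.monotonic() + timeout
    while True:
        file = get_file()
        # Enum-like states expose .name; plain strings fall back to str().
        state = getattr(file.state, "name", str(file.state))
        if state == "ACTIVE":
            return file
        if state == "FAILED":
            raise RuntimeError("File processing failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"File still in state {state} after {timeout}s")
        sleep(poll_interval)
```

With a client and an uploaded file in hand, a call might look like wait_until_active(lambda: client.files.get(name=uploaded.name)). Passing the fetch as a callable keeps the helper independent of any particular SDK version.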
3 / 5
Which Gemini model feature allows asking questions about a specific timestamp in a video, such as 'What is happening at 1:30?'
Gemini's native video understanding processes the entire video, including its frames and audio track, with temporal awareness. You can ask natural-language questions about specific timestamps, and Gemini understands the temporal context without you needing to extract frames manually.
4 / 5
A developer uses model.generate_content([prompt, image1, image2]). How does Gemini handle the ordering of these parts?
Gemini processes parts in the exact order they appear in the list. This matters for tasks like 'compare these two images' where the prompt's positional references ('the first image', 'the second image') must align with the actual order of image parts in the request.
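Because order is preserved, one way to make positional references unambiguous is to interleave a short text label before each image. The helper below is a hedged sketch of that idea: labeled_contents is a hypothetical name, the images can be whatever Part-like objects your SDK version accepts, and whether labeling actually improves results for a given task is something to verify empirically.

```python
from typing import Any, List


def labeled_contents(question: str, images: List[Any]) -> List[Any]:
    """Return a contents list with an 'Image N:' label before each image.

    The question goes last so it can refer to the labels above it.
    """
    contents: List[Any] = []
    for index, image in enumerate(images, start=1):
        contents.append(f"Image {index}:")
        contents.append(image)
    contents.append(question)
    return contents
```

The returned list can be passed as the contents of a generation request, and the prompt can then refer to "Image 1" and "Image 2" by label rather than relying on the reader of the code to count parts.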
5 / 5
What does the inline_data field in a Gemini API Part object contain?
inline_data contains a Blob with two fields: mime_type (e.g., 'image/png') and data (the media bytes, base64-encoded in the JSON request body; the Python SDK accepts raw bytes and handles the encoding). It is used for small media files embedded directly in the API request, as opposed to file_data, which references files uploaded via the File API by URI.
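To make the difference concrete, here is a rough sketch of the two Part shapes as plain dictionaries, using only the standard library. The function names inline_part and file_part are hypothetical, and the snake_case keys follow the naming used in this quiz; the official REST reference should be checked for the exact field casing your client expects.

```python
import base64
import json


def inline_part(data: bytes, mime_type: str) -> dict:
    """Embed the media bytes directly in the request as base64 text."""
    encoded = base64.b64encode(data).decode("ascii")
    return {"inline_data": {"mime_type": mime_type, "data": encoded}}


def file_part(file_uri: str, mime_type: str) -> dict:
    """Reference a previously uploaded file by URI instead of embedding it."""
    return {"file_data": {"mime_type": mime_type, "file_uri": file_uri}}


# The 8-byte PNG signature stands in for real image bytes.
png_signature = b"\x89PNG\r\n\x1a\n"
part = inline_part(png_signature, "image/png")
print(json.dumps(part))
```

The inline form grows the request body with the size of the media, which is why it suits small files, while the file_data form keeps the request small regardless of file size.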