Automate Product Listings with Gemini + Vision Agents

Google for Developers · Beginner · 🧠 Large Language Models · 2d ago
*Build a real-time voice agent with Gemini 3.1 Flash Live and Stream's Vision Agents SDK.* Stefan Blos, Senior Developer Advocate at Stream, walks through what's possible with early access to the Gemini 3.1 Flash Live model: object detection, AI image polish with Nano Banana, web search, and a guided multi-step workflow, all driven by a single voice conversation.

*What's covered:* setting up the Vision Agents SDK with the Gemini plugin, defining tools for image generation and product search, building a video processor to analyze live frames, orchestrating multi-step agent workflows with inst…