AI on Android: Ask me Anything — Florina Muntenescu & Oli Gaymond, Google DeepMind

AI Engineer · Intermediate ·🧠 Large Language Models ·1h ago
Gemini Nano weighs three to four gigabytes on device. Shipping a copy with every app is not realistic, which is why AICore installs it in the system once and every app shares it. Foreground apps get top priority; background batch jobs queue and run overnight while the device is charging. The developer never manages any of that.

The tradeoff is reach. The GenAI ML Kit APIs require flagship devices from the last couple of years, whereas classic ML Kit for vision and OCR runs on a billion-plus devices without issue. Hybrid inference, launched a few weeks before this talk, falls back from Nano to Gemini Flash in the cloud when the on-device model is not available. An embedding API is coming soon for RAG-style solutions. For anything beyond that, LiteRT is the other path.

Speaker info:
- https://x.com/FMuntenescu
- https://www.linkedin.com/in/florina-muntenescu-314b8921
- https://github.com/florina-muntenescu
- https://linkedin.com/in/ogaymond
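
The hybrid inference fallback described above can be illustrated with a small, language-agnostic sketch. It is written in Python for brevity and does not use the actual Android or ML Kit API. Every name in it (`hybrid_generate`, `ModelUnavailable`, `HybridResult`, and the stand-in backends) is hypothetical, and the routing rule is an assumption based only on the description: try the on-device model first, and use the cloud model when the on-device one is not available.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class ModelUnavailable(Exception):
    """Hypothetical error: the on-device model cannot serve the request."""


@dataclass
class HybridResult:
    text: str
    source: str  # "on_device" or "cloud"


def hybrid_generate(
    prompt: str,
    on_device: Optional[Callable[[str], str]],
    cloud: Callable[[str], str],
) -> HybridResult:
    """Prefer the on-device backend; fall back to the cloud backend."""
    if on_device is not None:
        try:
            return HybridResult(on_device(prompt), "on_device")
        except ModelUnavailable:
            # Assumed causes: unsupported device, model not yet downloaded.
            pass
    return HybridResult(cloud(prompt), "cloud")


# Stand-in backends for illustration only; real backends would call a model.
def fake_nano(prompt: str) -> str:
    return f"[nano] {prompt}"


def unavailable_nano(prompt: str) -> str:
    raise ModelUnavailable("on-device model not available")


def fake_flash(prompt: str) -> str:
    return f"[flash] {prompt}"
```

Calling `hybrid_generate("hi", unavailable_nano, fake_flash)` takes the cloud path, while passing `fake_nano` keeps the request on device. Returning the `source` alongside the text lets the caller see which path served the request.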