AI on Android: Ask me Anything — Florina Muntenescu & Oli Gaymond, Google DeepMind

Uploaded: 2026-05-22T14:00:06Z
Channel: AI Engineer

Gemini Nano on device weighs three to four gigabytes. Shipping that per app is not realistic, which is why AICore puts it in the system once and every app shares it. Foreground apps get top priority. Background batch jobs queue and run overnight while the device is charging. The developer never manages any of that. The tradeoff is reach. The ML Kit GenAI APIs require flagship devices from the last couple of years. Classic ML Kit for vision and OCR runs on a billion-plus devices without issue. Hybrid inference, launched a few weeks before this talk, falls back from Nano to Gemini Flash in the cloud when the on-device model is not available. An embedding API is coming soon for RAG-style solutions. For anything beyond that, LiteRT is the other path.

Speaker info:
- https://x.com/FMuntenescu
- https://www.linkedin.com/in/florina-muntenescu-314b8921
- https://github.com/florina-muntenescu
- https://linkedin.com/in/ogaymond
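
The hybrid inference fallback described in the summary can be illustrated with a small, language-neutral sketch. This is not the Android or ML Kit API: the names `generate_hybrid`, `ModelUnavailable`, `make_on_device_stub`, and `cloud_stub` are hypothetical stand-ins, and the stubs only imitate a device that does or does not have the on-device model. The point is just the control flow: try the local model first, and route to the cloud model only when the local one is unavailable.

```python
from dataclasses import dataclass
from typing import Callable


class ModelUnavailable(Exception):
    """Hypothetical error: the on-device model cannot serve this request."""


@dataclass
class HybridResult:
    text: str
    source: str  # "on_device" or "cloud"


def generate_hybrid(
    prompt: str,
    on_device: Callable[[str], str],
    cloud: Callable[[str], str],
) -> HybridResult:
    """Prefer the on-device backend; fall back to the cloud backend if it is unavailable."""
    try:
        return HybridResult(text=on_device(prompt), source="on_device")
    except ModelUnavailable:
        # Assumption: unavailability is the only condition that triggers fallback.
        return HybridResult(text=cloud(prompt), source="cloud")


def make_on_device_stub(supported: bool) -> Callable[[str], str]:
    """Build a stand-in for a local model; unsupported devices always raise."""
    def run(prompt: str) -> str:
        if not supported:
            raise ModelUnavailable("no on-device model on this device")
        return "local:" + prompt
    return run


def cloud_stub(prompt: str) -> str:
    """Stand-in for a cloud-hosted model call."""
    return "cloud:" + prompt
```

In this sketch the caller sees one function regardless of device class, and the `source` field records which backend answered. A real integration would also have to decide how to handle other failure modes, such as missing network access, which this sketch does not cover.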
