TLMs: Tiny LLMs and Agents on Edge Devices with LiteRT-LM — Cormac Brick, Google
Tiny LLMs are making on-device agents much more practical. In this workshop, Cormac Brick walks through how LiteRT-LM brings language models to edge devices, with a focus on Gemma, agent skills, and the real engineering tradeoffs behind running LLM workflows on phones and other constrained hardware. The session covers performance across edge devices, on-device function calling, fine-tuning and deployment, platform support across Android and iOS, and the memory, safety, and UX constraints that shape edge-native AI systems. If you're building local agents or want a practical look at where edge LLMs are headed, this is a useful hands-on overview.
Speaker info:
- https://www.linkedin.com/in/cbrick/
Timestamps:
(0:00:00) Intro: AI on the Edge, Small Language Models, and Gemma
(0:04:51) Enabling App Development: MediaPipe, LiteRT, and System Services
(0:09:09) Small Language Models: Performance, Reach, and Fine-tuning
(0:11:30) Gemma 4: Sizes (E2B and E4B) and AI Core Roadmap
(0:16:10) Gemma on Edge Runtime: Performance Benchmarks
(0:18:34) Agent Skills: Google AI Edge Gallery, Mood Tracker, and Wikipedia Lookup
(0:23:38) Skill Architecture: Efficiency, Progressive Disclosure, and Tool Loading
(0:27:34) Reliability: Constrained Decoding and Tool Usage
(0:29:18) Community and Custom Skills
(0:31:30) Skill Development Deep Dive: Orchestrator and Registry
(0:33:30) Rapid Skill Prototyping: Using Gemini CLI and ADB
(0:38:35) Open Source: AI Edge Gallery and Community Engagement
(0:41:00) Deploying Tiny Models (sub-1B parameters) In-App
(0:47:44) Third-Party Models: FastVLM and Hardware Acceleration
(0:50:17) Model Examples: FunctionGemma, Mobile Actions, and EmbeddingGemma
(0:55:41) AI Edge Eloquent: Transcription and Text Polishing
(0:59:07) Modularity Playbook: ASR and Text Polishing Engines
(1:01:23) Synthetic Data Workflows for Tiny Models
(1:06:36) Web Support and Fine-tuning Documentation
(1:08:20) Summary and Key Takeaways
(1:12:49) Q&A: Multi-skill Execution, Context