Lightning Talk: Running ExecuTorch Applications With Silicon Accelera... George Gekov & Aki Makkonen

Uploaded: 2026-04-20T20:22:20Z
Channel: PyTorch

PyTorch · Intermediate · 🧬 Deep Learning · 1mo ago

Skills: LLM Engineering 80% · AI Workflow Automation 60%

Lightning Talk: Running ExecuTorch Applications With Silicon Acceleration, in Ultra-low Power - George Gekov, Arm; Aki Makkonen, Alif Semiconductor

Efficient deployment of ML models on low-power embedded systems has been a significant challenge for years. At the same time, these embedded SoCs are all around us, from everyday appliances to the latest smart glasses. ExecuTorch is a PyTorch-native framework for deploying neural networks on resource-constrained systems.

In this session, we show how to build an end-to-end speech recognition application using PyTorch and ExecuTorch: from training a Transformer-based neural network in PyTorch, through quantization, all the way to deployment on a low-power embedded device. We will introduce the key ExecuTorch APIs for quantization and explain how models are transformed and lowered into a form that can run efficiently on a device.

The application runs on the Alif Ensemble E8 SoC, the first implementation of the leading Arm® Ethos-U85 NPU, which brings native support for Transformer models to the ultra-low power domain. Join the experts from Arm and Alif Semiconductor to see how we are bridging the gap between PyTorch and embedded deployment, and how you can bring PyTorch models to silicon-accelerated, ultra-low-power systems.
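For readers unfamiliar with the quantization step mentioned in the description, the underlying arithmetic can be sketched in a few lines of plain Python. This is an illustrative sketch only: it is not the ExecuTorch API and not the speakers' code. The function names (`choose_qparams`, `quantize`, `dequantize`) and the sample weights are hypothetical, and it assumes simple per-tensor affine int8 quantization, which may differ from the scheme used in the talk.

```python
# Illustrative only: per-tensor affine int8 quantization in plain Python.
# NOT the ExecuTorch API; all names below are hypothetical.

def choose_qparams(values, qmin=-128, qmax=127):
    """Pick a scale and zero point mapping the value range onto [qmin, qmax]."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:
        return 1.0, 0
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to integers, clamping to the int8 range."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(q, scale, zero_point):
    """Map integers back to approximate float values."""
    return [(x - zero_point) * scale for x in q]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]  # made-up sample data
scale, zp = choose_qparams(weights)
q = quantize(weights, scale, zp)
restored = dequantize(q, scale, zp)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight is stored as one signed byte plus a shared scale and zero point, and the round-trip error stays within one quantization step. Real toolchains add calibration, per-channel parameters, and operator-level handling on top of this basic mapping.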
