Brevitas Quantization Library - Pablo Monteagudo Lago, AMD

Name: Brevitas Quantization Library - Pablo Monteagudo Lago, AMD
Uploaded: 2026-04-20T20:22:20Z
Channel: PyTorch

Brevitas is an open-source PyTorch library from AMD designed to support research on state-of-the-art quantization methods, including Qronos (ICLR 2026) and MixQuant (arXiv). Built for flexibility and composability, it offers modular components for exploring reduced-precision data paths and accuracy-preserving techniques.

As generative models scale, post-training quantization (PTQ) has become the preferred strategy for maintaining quality without retraining, yet PTQ methods are often applied in isolation because the tooling is fragmented. Brevitas provides a unified environment for modern PTQ algorithms, including Qronos, SpinQuant and AutoRound, so that practitioners can combine complementary techniques.

Brevitas builds on recent PyTorch features such as Dynamo, which it uses to trace compute graphs and selectively modify them, for example by inserting rotation ops to mitigate outliers. It integrates with frameworks like transformers and supports export flows including vLLM and GGUF, easing the transition from experimentation to deployment.

This talk shows how to use Brevitas for an end-to-end quantization flow and how its flexibility enables new research directions.
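To make the idea of rotation ops that mitigate outliers concrete, here is a minimal toy sketch in plain Python. It does not use Brevitas or its API. The helper names (hadamard, fake_quantize, and so on), the 4-element input vector, and the choice of a normalized Hadamard matrix with symmetric 4-bit per-tensor quantization are all assumptions made for illustration. The sketch only shows why an orthogonal rotation can help: it spreads a single large value across all channels, which shrinks the dynamic range the quantizer has to cover.

```python
import math

def hadamard(n):
    """Normalized n x n Hadamard matrix (n must be a power of two); it is orthogonal."""
    rows = [[1.0]]
    while len(rows) < n:
        rows = [r + r for r in rows] + [r + [-v for v in r] for r in rows]
    return [[v / math.sqrt(n) for v in r] for r in rows]

def matvec(x, m):
    """Row vector times matrix: x @ m."""
    return [sum(x[i] * m[i][j] for i in range(len(x))) for j in range(len(m[0]))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def fake_quantize(x, bits=4):
    """Symmetric per-tensor quantization followed by dequantization."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in x) / qmax
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in x]

def mse(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

# Hypothetical activation vector with one outlier channel.
x = [0.5, -0.5, 0.5, 8.0]
R = hadamard(4)

# Quantize directly: the outlier sets the scale and the small values collapse.
direct = fake_quantize(x)

# Rotate, quantize, then rotate back with the transpose (the inverse of an orthogonal matrix).
x_rot = matvec(x, R)
restored = matvec(fake_quantize(x_rot), transpose(R))

print("max |x| before/after rotation:", max(map(abs, x)), max(map(abs, x_rot)))
print("MSE direct: ", mse(x, direct))
print("MSE rotated:", mse(x, restored))
```

In a real model the rotation and its inverse would typically be folded into adjacent layers or inserted into the traced graph rather than applied to a single vector, so the network's output stays mathematically unchanged before quantization.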
