Zero-Shot Quantization via Weight-Space Arithmetic
The paper shows that robustness to post-training quantization (PTQ) can be transferred between models through weight-space arithmetic, improving the receiver's robustness by as much as 60% without receiver-side quantization-aware training (QAT).
- Extract the quantization vector from a donor task using weight-space arithmetic
- Apply the quantization vector to a receiver model to improve robustness to PTQ-induced noise
- Evaluate the performance of the patched receiver model on a target task
- Fine-tune the receiver model if necessary to further improve performance
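The first two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation. The summary does not give the exact formula, so the sketch assumes the quantization vector is the per-parameter difference between a quantization-robust donor checkpoint and an ordinary donor checkpoint for the same task, and that patching means adding a scaled copy of that difference to the receiver. The function names, the `scale` parameter, and the toy weights are all hypothetical.

```python
# Hypothetical sketch: checkpoints are plain {parameter name: list of floats} mappings.
from typing import Dict, List

Weights = Dict[str, List[float]]


def extract_quantization_vector(donor_robust: Weights, donor_base: Weights) -> Weights:
    """Assumed definition: robust donor weights minus ordinary donor weights."""
    return {
        name: [r - b for r, b in zip(donor_robust[name], donor_base[name])]
        for name in donor_base
    }


def apply_quantization_vector(receiver: Weights, qvec: Weights, scale: float = 1.0) -> Weights:
    """Patch the receiver by adding the scaled vector; the receiver is not trained."""
    return {
        name: [w + scale * d for w, d in zip(receiver[name], qvec[name])]
        for name in receiver
    }


# Toy checkpoints with a single parameter tensor (illustrative values only).
donor_base = {"layer.weight": [0.50, -1.25, 2.00]}
donor_robust = {"layer.weight": [0.75, -1.00, 2.00]}
receiver = {"layer.weight": [1.00, 1.00, 1.00]}

qvec = extract_quantization_vector(donor_robust, donor_base)
patched = apply_quantization_vector(receiver, qvec, scale=0.5)
print(patched)
```

The sketch assumes donor and receiver share the same architecture and parameter names, since the arithmetic is element-wise.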
AI engineers and researchers can use this method to make a model more robust to quantization-induced noise without receiver training data and without running quantization-aware training on the receiver. This is particularly useful in deployment scenarios where data is limited or expensive to obtain.
💡 The quantization vector is a transferable direction in weight space that can be used to improve robustness to post-training quantization without requiring receiver-side quantization-aware training
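For intuition about the noise that PTQ introduces, the sketch below rounds a weight tensor onto a low-bit grid and measures how far the result drifts from the original. It assumes symmetric per-tensor round-to-nearest quantization, which is a common textbook scheme and not necessarily the quantizer used in the paper. The helper names and sample weights are hypothetical.

```python
from typing import List


def fake_quantize(weights: List[float], num_bits: int = 8) -> List[float]:
    """Symmetric per-tensor round-to-nearest quantization followed by dequantization."""
    qmax = 2 ** (num_bits - 1) - 1              # largest integer level, e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)                    # nothing to quantize
    step = max_abs / qmax                       # width of one quantization bin
    return [max(-qmax, min(qmax, round(w / step))) * step for w in weights]


def mean_squared_error(a: List[float], b: List[float]) -> float:
    """Average squared difference between two equally sized weight lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


weights = [0.9, -0.31, 0.02, 0.47]              # illustrative values only
for bits in (8, 4):
    noise = mean_squared_error(weights, fake_quantize(weights, bits))
    print(f"{bits}-bit quantization noise (MSE): {noise:.6f}")
```

Fewer bits mean a coarser grid and larger noise. A robustness evaluation would compare a task metric before and after this kind of rounding, for both the original and the patched receiver.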
Key Takeaways
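Robustness to PTQ behaves as a transferable direction in weight space: a quantization vector extracted from a donor task can patch a receiver model and improve its robustness to PTQ-induced noise by as much as 60%, with no receiver-side QAT and no receiver training data.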
Full Article
Abstract:
arXiv:2604.03420v1. We show that robustness to post-training quantization (PTQ) is a transferable direction in weight space. We call this direction the quantization vector: extracted from a donor task by simple weight-space arithmetic, it can be used to patch a receiver model and improve robustness to PTQ-induced noise by as much as 60%, without receiver-side quantization-aware training (QAT). Because the method requires no receiver training data, it provides a zero
DeepCamp AI