ScalePredictor: Instance-aware Scale Learning for Accurate Quantization of Vision Transformers
arXiv cs.AI
arXiv:2606.21947v1 (announce type: cross)
Abstract: Vision Transformers have achieved remarkable success in many fields, yet their deployment on edge devices remains challenging due to their substantial computational demands. Post-Training Quantization (PTQ) offers an attractive solution by compressing models using a small calibration set with minimal training overhead. However, most existing PTQ works adopt a static quantization paradigm that is uniformly applied to all instances. Given the subst