HiFloat4 Format for Language Model Pre-training on Ascend NPUs

📰 arXiv cs.AI

arXiv:2604.08826v1 (cross-listed)

Abstract: Large foundation models have become central to modern machine learning, with performance scaling predictably with model size and data. However, training and deploying such models incur substantial computational and memory costs, motivating the development of low-precision training techniques. Recent work has demonstrated that 4-bit floating-point (FP4) formats, such as MXFP4 and NVFP4, can be successfully applied to linear GEMM operations in large…
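The snippet cuts off before describing HiFloat4 itself, but as context for the FP4 formats it does name, here is a minimal NumPy sketch of MXFP4-style block quantization as defined by the OCP Microscaling spec: 32-element blocks, one shared power-of-two scale per block, and E2M1 elements (sign bit, 2 exponent bits, 1 mantissa bit). Function names here are illustrative, not from the paper.

```python
import numpy as np

# The representable magnitudes of an FP4 E2M1 element.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_mxfp4(x, block_size=32):
    """Quantize a 1-D array to MXFP4-style blocks: each block shares one
    power-of-two scale, and each element snaps to the E2M1 grid."""
    x = x.reshape(-1, block_size)
    amax = np.abs(x).max(axis=1, keepdims=True)
    # Per the MX spec, the shared scale exponent is floor(log2(amax)) minus
    # the element emax (2 for E2M1, since the largest magnitude is 1.5 * 2^2).
    scale = 2.0 ** (np.floor(np.log2(np.maximum(amax, 2.0 ** -126))) - 2)
    scaled = x / scale
    # Nearest-point rounding onto the FP4 grid; outliers above 6 after
    # scaling are implicitly clipped to the largest representable value.
    idx = np.abs(np.abs(scaled)[..., None] - FP4_GRID).argmin(axis=-1)
    q = np.sign(scaled) * FP4_GRID[idx]
    return q, scale

def dequantize(q, scale):
    return q * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(128).astype(np.float32)
q, s = quantize_mxfp4(w)
print("max abs error:", np.abs(dequantize(q, s).reshape(-1) - w).max())
```

NVFP4 follows the same blockwise pattern but with 16-element blocks and an FP8 (E4M3) shared scale rather than a pure power of two; the paper's HiFloat4 presumably makes its own trade-offs here, which the truncated abstract does not specify.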

Published 13 Apr 2026