ScaleAcross Explorer: Exploring Communication Optimization for Scale-Across AI Model Training

📰 ArXiv cs.AI

arXiv:2605.24326v1 Announce Type: cross Abstract: The rapid scaling of large language model training requires distributing GPU resources across multiple data center buildings and regions. We refer to such paradigm as "scale-across" training. As infrastructure expands, the system design space becomes increasingly intricate, encompassing new model architectures, hardware heterogeneity, and evolving communication patterns. Drawing from Meta's production experience, we highlight the complexities of

Published 26 May 2026
Read full paper → ← Back to Reads