A general tensor-structured compression scheme for efficient large language models
arXiv cs.AI
arXiv:2605.25344v1 (announce type: cross)
Abstract: Large language models (LLMs) are dominated by dense linear transformations, whose storage, memory and computational overheads hinder efficient adaptation and deployment while masking the functional impacts of structural simplification. Here we present Tensor Mixture (MixT), a general tensor-structured compression scheme that replaces targeted dense linear layers with natively executable mixtures of tensor operators. Operating directly on generic
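To make the idea of swapping a dense linear layer for a mixture of tensor operators concrete, here is a minimal NumPy sketch. It is not the paper's MixT construction, which the truncated abstract does not specify. It assumes, purely for illustration, that each operator is a Kronecker product of two small factors; the names `kron_mixture_apply` and `dense_equivalent` and all shapes are hypothetical.

```python
import numpy as np


def kron_mixture_apply(x, factors):
    """Compute y = sum_k (A_k ⊗ B_k) x without building the dense matrix.

    ASSUMPTION: Kronecker factors stand in for generic "tensor operators";
    the source abstract does not say which operators MixT actually uses.
    """
    n1 = factors[0][0].shape[1]  # input columns of A_k
    n2 = factors[0][1].shape[1]  # input columns of B_k
    X = x.reshape(n1, n2)        # row-major reshape of the input vector
    # Row-major identity: (A ⊗ B) vec(X) = vec(A X B^T)
    Y = sum(A @ X @ B.T for A, B in factors)
    return Y.reshape(-1)


def dense_equivalent(factors):
    """Materialize the dense matrix the mixture represents (for checking only)."""
    return sum(np.kron(A, B) for A, B in factors)


rng = np.random.default_rng(0)
m1, m2, n1, n2, K = 4, 8, 4, 8, 3  # hypothetical sizes: a 32x32 layer, 3 operators
factors = [
    (rng.standard_normal((m1, n1)), rng.standard_normal((m2, n2)))
    for _ in range(K)
]
x = rng.standard_normal(n1 * n2)

y_fast = kron_mixture_apply(x, factors)
y_dense = dense_equivalent(factors) @ x

dense_params = (m1 * m2) * (n1 * n2)    # parameters in the dense layer
mix_params = K * (m1 * n1 + m2 * n2)    # parameters in the mixture
print(np.allclose(y_fast, y_dense), dense_params, mix_params)
```

Under these assumptions the structured form stores far fewer parameters than the dense layer it stands in for, and the forward pass runs on the small factors directly rather than on a reconstructed dense matrix.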