SimDiff: Depth Pruning via Similarity and Difference

📰 ArXiv cs.AI

arXiv:2604.19520v1

Abstract: Depth pruning improves the deployment efficiency of large language models (LLMs) by identifying and removing redundant layers. A widely accepted standard for this identification process is to measure the similarity between layers using cosine distance. However, we find that methods relying solely on this one-dimensional heuristic can exhibit unpredictable performance and even catastrophic collapse across different architectures. To address this iss
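To make the baseline heuristic mentioned in the abstract concrete, here is a minimal sketch of cosine-similarity layer scoring. It is not the SimDiff method, which the truncated abstract does not describe. It assumes a hypothetical input `hidden_states`, where `hidden_states[i]` holds the per-token vectors entering layer `i` and the last entry is the final output; the function names are illustrative. A layer whose output stays nearly parallel to its input is treated as a candidate for removal.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def layer_redundancy_scores(hidden_states):
    """Mean input/output cosine similarity for each layer.

    hidden_states (assumed layout): list of length num_layers + 1, where each
    entry is a list of token vectors. Entry i is the input to layer i and
    entry i + 1 is its output.
    """
    scores = []
    for layer_in, layer_out in zip(hidden_states, hidden_states[1:]):
        sims = [cosine_similarity(x, y) for x, y in zip(layer_in, layer_out)]
        scores.append(sum(sims) / len(sims))
    return scores


def most_redundant_layers(hidden_states, k):
    """Indices of the k layers whose outputs are most similar to their inputs."""
    scores = layer_redundancy_scores(hidden_states)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Toy example: layer 0 only rescales its input, layer 1 rotates it by 90 degrees.
toy_states = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 3.0]],
    [[0.0, 2.0], [3.0, 0.0]],
]
toy_scores = layer_redundancy_scores(toy_states)
toy_pruned = most_redundant_layers(toy_states, 1)
```

In the toy data, the rescaling layer scores as fully redundant because cosine similarity ignores magnitude. This shows how a single similarity score can hide what a layer actually does, which is the kind of limitation the abstract attributes to relying solely on this heuristic.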

Published 22 Apr 2026