EdgeCIM: A Hardware-Software Co-Design for CIM-Based Acceleration of Small Language Models
Source: arXiv cs.AI
arXiv:2604.11512v1 Announce Type: cross
Abstract: The growing demand for deploying Small Language Models (SLMs) on edge devices, including laptops, smartphones, and embedded platforms, has exposed fundamental inefficiencies in existing accelerators. While GPUs handle prefill workloads efficiently, the autoregressive decoding phase is dominated by GEMV operations that are inherently memory-bound, resulting in poor utilization and prohibitive energy costs at the edge. In this work, we present Edge
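
The abstract's point that decode-phase GEMV is memory-bound, while prefill is not, can be illustrated with a rough arithmetic-intensity estimate (FLOPs per byte moved). The sketch below is not taken from the paper: the 4096 x 4096 layer size, the 2-byte (FP16) element width, the 512-token prefill length, and the simplification that every operand is moved exactly once are all illustrative assumptions.

```python
def arithmetic_intensity(d_out: int, d_in: int, tokens: int,
                         bytes_per_elem: int = 2) -> float:
    """Estimate FLOPs per byte for applying a (d_out x d_in) weight
    matrix to `tokens` input vectors.

    Assumes (hypothetically) that weights, inputs, and outputs are each
    moved once between memory and the compute units, with no caching.
    """
    # One multiply and one add per weight per token.
    flops = 2 * d_out * d_in * tokens
    # Weights + input activations + output activations.
    bytes_moved = bytes_per_elem * (
        d_out * d_in + d_in * tokens + d_out * tokens
    )
    return flops / bytes_moved


# Decode: one new token per step, so the matrix multiply is a GEMV.
decode = arithmetic_intensity(4096, 4096, tokens=1)

# Prefill: many prompt tokens processed together, so it is a GEMM.
prefill = arithmetic_intensity(4096, 4096, tokens=512)

print(f"decode  (GEMV): {decode:.2f} FLOPs/byte")
print(f"prefill (GEMM): {prefill:.2f} FLOPs/byte")
```

Under these assumptions the single-token case performs roughly one FLOP per byte moved, while the 512-token case performs several hundred, because each weight is reused across all tokens. That reuse gap is why decoding tends to be limited by memory bandwidth rather than by compute.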