Soft-TransFormers for Continual Learning

📰 ArXiv cs.AI

arXiv:2411.16073v3 · Announce Type: replace-cross

Abstract: Inspired by the *Well-initialized Lottery Ticket Hypothesis (WLTH)*, we introduce Soft-TransFormers (Soft-TF), a parameter-efficient framework for continual learning that leverages soft, real-valued subnetworks over a frozen pre-trained Transformer. Instead of relying on manually designed prompts or adapters, Soft-TF learns task-specific multiplicative masks applied to the key, query, value, and output projections in self-attention.
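
To make the masking mechanism concrete, here is a minimal PyTorch sketch (not the authors' code) of the idea the abstract describes: a frozen pre-trained linear projection, such as a query, key, value, or output projection, wrapped with one learnable real-valued mask per task. The class name `SoftMaskedLinear` and the `num_tasks`/`task_id` arguments are illustrative assumptions, not names from the paper.

```python
import torch
import torch.nn as nn

class SoftMaskedLinear(nn.Module):
    """Frozen pre-trained linear layer with a learnable, real-valued
    multiplicative mask per task (hypothetical sketch of Soft-TF)."""

    def __init__(self, pretrained: nn.Linear, num_tasks: int):
        super().__init__()
        # Frozen backbone weights: the pre-trained Transformer is not updated.
        self.weight = nn.Parameter(pretrained.weight.detach().clone(),
                                   requires_grad=False)
        self.bias = (nn.Parameter(pretrained.bias.detach().clone(),
                                  requires_grad=False)
                     if pretrained.bias is not None else None)
        # One soft (real-valued) mask per task, initialized to 1 so each
        # task starts from the unmodified pre-trained weights.
        self.masks = nn.ParameterList(
            [nn.Parameter(torch.ones_like(self.weight))
             for _ in range(num_tasks)]
        )

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        # Soft subnetwork: element-wise product of the frozen weights
        # with the task-specific mask selects a real-valued subnetwork.
        masked_weight = self.weight * self.masks[task_id]
        return nn.functional.linear(x, masked_weight, self.bias)
```

Under this reading, only the masks receive gradients, so continual learning amounts to training one small mask per task over a shared frozen Transformer, which is what makes the approach parameter-efficient.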

Published 29 Apr 2026