Causal2Vec: Improving Decoder-only LLMs as Embedding Models through a Contextual Token
arXiv cs.AI
arXiv:2507.23386v3
Announce Type: replace-cross
Abstract: Decoder-only large language models (LLMs) have been increasingly adopted to build embedding models for diverse tasks. To overcome the inherent limitations of causal attention in representation learning, many existing methods modify the attention mechanism to be bidirectional, potentially undermining LLMs' ability to extract semantic information acquired during pre-training. Meanwhile, leading unidirectional approaches often rely on extra