Predict and Reconstruct: Joint Objectives for Self-Supervised Language Representation Learning

📰 ArXiv cs.AI

arXiv:2606.05173v1 (announce type: cross)

Abstract: Masked language modelling (MLM) has been the dominant pre-training objective for text encoders since BERT, yet it encourages representations that are strongly anchored to surface-form token identity rather than deeper semantic structure. Inspired by the success of Joint Embedding Predictive Architectures (JEPA) (LeCun, 2022) in vision and audio, we propose a hybrid pre-training objective that combines a JEPA-style latent-space prediction loss wit

Published 5 Jun 2026
