Improved Baselines with Representation Autoencoders

📰 ArXiv cs.AI

arXiv:2605.18324v1 Announce Type: cross Abstract: Representation Autoencoders (RAE) replace the traditional VAE with pretrained vision encoders. In this paper, we systematically investigate several design choices and find three insights that simplify and improve RAE. First, we study a generalized formulation where the representation is defined as the sum of the last k encoder layers rather than solely the final layer. This simple change greatly improves reconstruction without encoder finetuning or spec
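The "sum of the last k encoder layers" formulation can be illustrated with a minimal sketch. This is not the paper's code: the function name `sum_last_k_layers`, the toy random "layers", and the choice of a plain unweighted sum with no normalization are all assumptions made for illustration. The abstract does not say whether the layers are weighted or normalized before summing.

```python
import numpy as np


def sum_last_k_layers(hidden_states, k):
    """Sum the outputs of the last k encoder layers (hypothetical helper).

    hidden_states: sequence of arrays, one per encoder layer,
                   each of shape (num_tokens, dim).
    k:             number of trailing layers to sum; k=1 recovers
                   the usual final-layer representation.
    """
    if not 1 <= k <= len(hidden_states):
        raise ValueError("k must be between 1 and the number of layers")
    # Stack the last k layers along a new leading axis, then sum it away.
    return np.sum(np.stack(hidden_states[-k:], axis=0), axis=0)


# Toy stand-in for per-layer outputs of a 4-layer encoder (assumed shapes).
rng = np.random.default_rng(0)
layers = [rng.normal(size=(16, 8)) for _ in range(4)]

z_final = sum_last_k_layers(layers, k=1)  # final layer only
z_sum = sum_last_k_layers(layers, k=3)    # sum of the last three layers
```

With `k=1` the helper returns the final layer unchanged, so the standard setup is a special case of the generalized one. The output shape is the same for any valid `k`.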

Published 19 May 2026
