Weierstrass Positional Encoding for Vision Transformers

Source: arXiv cs.AI

arXiv:2605.23719v1 (announce type: cross)

Abstract: Vision Transformers have achieved remarkable success in computer vision, but their common use of learnable one-dimensional positional encodings weakens the inherent two-dimensional spatial structure of images after patch flattening. Existing positional encodings often lack geometric constraints and do not preserve a monotonic relationship between Euclidean spatial distances and sequential index distances, limiting ViTs' ability to exploit spatial

Published 25 May 2026
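
The non-monotonic relationship between Euclidean distance and sequential index distance that the abstract describes can be illustrated with a small sketch. This is not the paper's method; it only shows the problem for a plain row-major patch flattening. The 14x14 grid (a 224x224 image with 16x16 patches) and the helper names `patch_coords` and `euclidean` are assumptions chosen for illustration.

```python
import math

GRID_W = 14  # assumed grid width: 224x224 image, 16x16 patches


def patch_coords(index, grid_w):
    """Map a row-major flattened patch index back to (row, col)."""
    return divmod(index, grid_w)


def euclidean(i, j, grid_w):
    """Euclidean distance between two patches on the 2D grid."""
    r1, c1 = patch_coords(i, grid_w)
    r2, c2 = patch_coords(j, grid_w)
    return math.hypot(r1 - r2, c1 - c2)


# Pair 1: vertical neighbours, patches (0, 0) and (1, 0).
a, b = 0, GRID_W
index_gap_vertical = abs(a - b)             # far apart in the sequence
spatial_vertical = euclidean(a, b, GRID_W)  # adjacent on the grid

# Pair 2: end of row 0 and start of row 1, patches (0, 13) and (1, 0).
c, d = GRID_W - 1, GRID_W
index_gap_wrap = abs(c - d)                 # adjacent in the sequence
spatial_wrap = euclidean(c, d, GRID_W)      # far apart on the grid

print(f"vertical neighbours: index gap {index_gap_vertical}, "
      f"spatial distance {spatial_vertical:.2f}")
print(f"row-wrap pair:       index gap {index_gap_wrap}, "
      f"spatial distance {spatial_wrap:.2f}")
```

The pair with the smaller index gap has the larger spatial distance, so a positional encoding that depends only on the flattened index cannot order patch pairs by how close they are in the image.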