The Topology of Multimodal Fusion: Why Current Architectures Fail at Creative Cognition
📰 ArXiv cs.AI
arXiv:2604.04465v1 Announce Type: new
Abstract: This paper identifies a structural limitation in current multimodal AI architectures that is topological rather than parametric. Contrastive alignment (CLIP), cross-attention fusion (GPT-4V/Gemini), and diffusion-based generation share a common geometric prior -- modal separability -- which we term contact topology. The argument rests on three pillars with philosophy as the generative center. The philosophical pillar reinterprets Wittgenstein's say