Sat2Sound: A Unified Framework for Zero-Shot Soundscape Mapping
📰 ArXiv cs.AI
arXiv:2505.13777v2 | Announce Type: replace-cross

Abstract: We present Sat2Sound, a unified multimodal framework for geospatial soundscape understanding, designed to predict and map the distribution of sounds across the Earth's surface. Existing methods for this task rely on paired satellite images and geotagged audio samples, which often fail to capture the full diversity of sound at a location. Sat2Sound overcomes this limitation by augmenting datasets with semantically rich, vision-language mod[…]