AI News — Latest Developments & Breakthroughs

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

FB-CLIP: Fine-Grained Zero-Shot Anomaly Detection with Foreground-Background Disentanglement

arXiv:2603.19608v1 Announce Type: cross Abstract: Fine-grained anomaly detection is crucial in industrial and medical applications, but labeled anomalies are of

ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 1w ago

LoD-Loc v3: Generalized Aerial Localization in Dense Cities using Instance Silhouette Alignment

arXiv:2603.19609v1 Announce Type: cross Abstract: We present LoD-Loc v3, a novel method for generalized aerial visual localization in dense urban environments.

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

CAF-Score: Calibrating CLAP with LALMs for Reference-free Audio Captioning Evaluation

arXiv:2603.19615v1 Announce Type: cross Abstract: While Large Audio-Language Models (LALMs) have advanced audio captioning, robust evaluation remains difficult.

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

DeepStock: Reinforcement Learning with Policy Regularizations for Inventory Management

arXiv:2603.19621v1 Announce Type: cross Abstract: Deep Reinforcement Learning (DRL) provides a general-purpose methodology for training inventory policies that

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1w ago

Dual Prompt-Driven Feature Encoding for Nighttime UAV Tracking

arXiv:2603.19628v1 Announce Type: cross Abstract: Robust feature encoding constitutes the foundation of UAV tracking by enabling the nuanced perception of targe

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

MetaCues: Enabling Critical Engagement with Generative AI for Information Seeking and Sensemaking

arXiv:2603.19634v1 Announce Type: cross Abstract: Generative AI (GenAI) search tools are increasingly used for information seeking, yet their design tends to en

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

OmniDiT: Extending Diffusion Transformer to Omni-VTON Framework

arXiv:2603.19643v1 Announce Type: cross Abstract: Despite the rapid advancement of Virtual Try-On (VTON) and Try-Off (VTOFF) technologies, existing VTON methods

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

PolicySim: An LLM-Based Agent Social Simulation Sandbox for Proactive Policy Optimization

arXiv:2603.19649v1 Announce Type: cross Abstract: Social platforms serve as central hubs for information exchange, where user behaviors and platform interventio

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

The Residual Stream Is All You Need: On the Redundancy of the KV Cache in Transformer Inference

arXiv:2603.19664v1 Announce Type: cross Abstract: The key-value (KV) cache is widely treated as essential state in transformer inference, and a large body of wo

ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 1w ago

Toward High-Fidelity Visual Reconstruction: From EEG-Based Conditioned Generation to Joint-Modal Guided Rebuilding

arXiv:2603.19667v1 Announce Type: cross Abstract: Human visual reconstruction aims to reconstruct fine-grained visual stimuli based on subject-provided descript

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

ATHENA: Adaptive Test-Time Steering for Improving Count Fidelity in Diffusion Models

arXiv:2603.19676v1 Announce Type: cross Abstract: Text-to-image diffusion models achieve high visual fidelity but surprisingly exhibit systematic failures in nu

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

GoAgent: Group-of-Agents Communication Topology Generation for LLM-based Multi-Agent Systems

arXiv:2603.19677v1 Announce Type: cross Abstract: Large language model (LLM)-based multi-agent systems (MAS) have demonstrated exceptional capabilities in solvi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

AIGQ: An End-to-End Hybrid Generative Architecture for E-commerce Query Recommendation

arXiv:2603.19710v1 Announce Type: cross Abstract: Pre-search query recommendation, widely known as HintQ on Taobao's homepage, plays a vital role in intent capt

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1w ago

FedRG: Unleashing the Representation Geometry for Federated Learning with Noisy Clients

arXiv:2603.19722v1 Announce Type: cross Abstract: Federated learning (FL) suffers from performance degradation due to the inevitable presence of noisy annotatio

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1w ago

MOSS-TTSD: Text to Spoken Dialogue Generation

arXiv:2603.19739v1 Announce Type: cross Abstract: Spoken dialogue generation is crucial for applications like podcasts, dynamic commentary, and entertainment co

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1w ago

Uncertainty-aware Prototype Learning with Variational Inference for Few-shot Point Cloud Segmentation

arXiv:2603.19757v1 Announce Type: cross Abstract: Few-shot 3D semantic segmentation aims to generate accurate semantic masks for query point clouds with only a

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1w ago

Learning Hierarchical Orthogonal Prototypes for Generalized Few-Shot 3D Point Cloud Segmentation

arXiv:2603.19788v1 Announce Type: cross Abstract: Generalized few-shot 3D point cloud segmentation aims to adapt to novel classes from only a few annotations wh

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1w ago

Offshore oil and gas platform dynamics in the North Sea, Gulf of Mexico, and Persian Gulf: Exploiting the Sentinel-1 archive

arXiv:2603.19801v1 Announce Type: cross Abstract: The increasing use of marine spaces by offshore infrastructure, including oil and gas platforms, underscores t

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Enhancing Alignment for Unified Multimodal Models via Semantically-Grounded Supervision

arXiv:2603.19807v1 Announce Type: cross Abstract: Unified Multimodal Models (UMMs) have emerged as a promising paradigm that integrates multimodal understanding

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1w ago

FrameNet Semantic Role Classification by Analogy

arXiv:2603.19825v1 Announce Type: cross Abstract: In this paper, we adopt a relational view of analogies applied to Semantic Role Classification in FrameNet. We

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1w ago

Gesture2Speech: How Far Can Hand Movements Shape Expressive Speech?

arXiv:2603.19831v1 Announce Type: cross Abstract: Human communication seamlessly integrates speech and bodily motion, where hand gestures naturally complement v

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Semantic Delta: An Interpretable Signal Differentiating Human and LLMs Dialogue

arXiv:2603.19849v1 Announce Type: cross Abstract: Do LLMs talk like us? This question intrigues a multitude of scholars and it is relevant in many fields, from e

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Failure Modes for Deep Learning-Based Online Mapping: How to Measure and Address Them

arXiv:2603.19852v1 Announce Type: cross Abstract: Deep learning-based online mapping has emerged as a cornerstone of autonomous driving, yet these models freque

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

What If Consensus Lies? Selective-Complementary Reinforcement Learning at Test Time

arXiv:2603.19880v1 Announce Type: cross Abstract: Test-Time Reinforcement Learning (TTRL) enables Large Language Models (LLMs) to enhance reasoning capabilities
