📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 7,014 articles · Updated every 3 hours

Putting the Value Back in RL: Better Test-Time Scaling by Unifying LLM Reasoners With Verifiers

arXiv:2505.04842v2 Announce Type: replace-cross Abstract: Prevalent reinforcement learning (RL) methods for fine-tuning LLM reasoners, such as GRPO or Leave-one

ArXiv cs.AI 📄 Paper 1w ago

Auto-regressive transformation for image alignment

arXiv:2505.04864v2 Announce Type: replace-cross Abstract: Existing methods for image alignment struggle in cases involving feature-sparse regions, extreme scale

ArXiv cs.AI 📄 Paper 1w ago

Variational Visual Question Answering for Uncertainty-Aware Selective Prediction

arXiv:2505.09591v3 Announce Type: replace-cross Abstract: Despite remarkable progress in recent years, Vision Language Models (VLMs) remain prone to overconfide

ArXiv cs.AI 📄 Paper 1w ago

TokUR: Token-Level Uncertainty Estimation for Large Language Model Reasoning

arXiv:2505.11737v4 Announce Type: replace-cross Abstract: While Large Language Models (LLMs) have demonstrated impressive capabilities, their output quality rem

ArXiv cs.AI 📄 Paper 1w ago

Sat2Sound: A Unified Framework for Zero-Shot Soundscape Mapping

arXiv:2505.13777v2 Announce Type: replace-cross Abstract: We present Sat2Sound, a unified multimodal framework for geospatial soundscape understanding, designed

ArXiv cs.AI 📄 Paper 1w ago

SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence

arXiv:2505.17012v3 Announce Type: replace-cross Abstract: Existing evaluations of multimodal large language models (MLLMs) on spatial intelligence are typically

ArXiv cs.AI 📄 Paper 1w ago

GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning

arXiv:2505.17022v2 Announce Type: replace-cross Abstract: Visual generation models have made remarkable progress in creating realistic images from text prompts,

ArXiv cs.AI 📄 Paper 1w ago

Tuning Language Models for Robust Prediction of Diverse User Behaviors

arXiv:2505.17682v2 Announce Type: replace-cross Abstract: Predicting user behavior is essential for intelligent assistant services, yet deep learning models oft

ArXiv cs.AI 📄 Paper 1w ago

Learning World Models for Interactive Video Generation

arXiv:2505.21996v3 Announce Type: replace-cross Abstract: Foundational world models must be both interactive and preserve spatiotemporal coherence for effective

ArXiv cs.AI 📄 Paper 1w ago

Towards Reasonable Concept Bottleneck Models

arXiv:2506.05014v2 Announce Type: replace-cross Abstract: We propose a novel, flexible, and efficient framework for designing Concept Bottleneck Models (CBMs) t

ArXiv cs.AI 📄 Paper 1w ago

Progressive Multimodal Interaction Network for Reliable Quantification of Fish Feeding Intensity in Aquaculture

arXiv:2506.14170v3 Announce Type: replace-cross Abstract: Accurate quantification of fish feeding intensity is crucial for precision feeding in aquaculture, as

ArXiv cs.AI 📄 Paper 1w ago

LLM-based Realistic Safety-Critical Driving Video Generation

arXiv:2507.01264v2 Announce Type: replace-cross Abstract: Designing diverse and safety-critical driving scenarios is essential for evaluating autonomous driving

ArXiv cs.AI 📄 Paper 1w ago

Absorption and Inertness in Coarse-Grained Arithmetic: A Heuristic Application to the St. Petersburg Paradox

arXiv:2507.12475v2 Announce Type: replace-cross Abstract: The St. Petersburg paradox presents a longstanding challenge in decision theory: its classical expecte

ArXiv cs.AI 📄 Paper 1w ago

Large Language Model as An Operator: An Experience-Driven Solution for Distribution Network Voltage Control

arXiv:2507.14800v2 Announce Type: replace-cross Abstract: With the advanced reasoning, contextual understanding, and information synthesis capabilities of large

ArXiv cs.AI 📄 Paper 1w ago

Data Mixing Agent: Learning to Re-weight Domains for Continual Pre-training

arXiv:2507.15640v2 Announce Type: replace-cross Abstract: Continual pre-training on small-scale task-specific data is an effective method for improving large la

ArXiv cs.AI 📄 Paper 1w ago

PRIX: Learning to Plan from Raw Pixels for End-to-End Autonomous Driving

arXiv:2507.17596v3 Announce Type: replace-cross Abstract: While end-to-end autonomous driving models show promising results, their practical deployment is often

ArXiv cs.AI 📄 Paper 1w ago

Modular Delta Merging with Orthogonal Constraints: A Scalable Framework for Continual and Reversible Model Composition

arXiv:2507.20997v4 Announce Type: replace-cross Abstract: In real-world machine learning deployments, models must be continually updated, composed, and when req

ArXiv cs.AI 📄 Paper 1w ago

Teaching the Teacher: The Role of Teacher-Student Smoothness Alignment in Genetic Programming-based Symbolic Distillation

arXiv:2507.22767v3 Announce Type: replace-cross Abstract: Obtaining human-readable symbolic formulas via genetic programming-based symbolic distillation of a de

ArXiv cs.AI 📄 Paper 1w ago

Reliable Evaluation Protocol for Low-Precision Retrieval

arXiv:2508.03306v4 Announce Type: replace-cross Abstract: Lowering the numerical precision of model parameters and computations is widely adopted to improve the

ArXiv cs.AI 📄 Paper 1w ago

AdvDINO: Domain-Adversarial Self-Supervised Representation Learning for Spatial Proteomics

arXiv:2508.04955v2 Announce Type: replace-cross Abstract: Self-supervised learning (SSL) has emerged as a powerful approach for learning visual representations

ArXiv cs.AI 📄 Paper 1w ago

Echoes of Automation: The Increasing Use of LLMs in Newsmaking

arXiv:2508.06445v3 Announce Type: replace-cross Abstract: The rapid rise of Generative AI (GenAI), particularly LLMs, poses concerns for journalistic integrity

ArXiv cs.AI 📄 Paper 1w ago

COXNet: Cross-Layer Fusion with Adaptive Alignment and Scale Integration for RGBT Tiny Object Detection

arXiv:2508.09533v2 Announce Type: replace-cross Abstract: Detecting tiny objects in multimodal Red-Green-Blue-Thermal (RGBT) imagery is a critical challenge in

ArXiv cs.AI 📄 Paper 1w ago

Proximal Supervised Fine-Tuning

arXiv:2508.17784v2 Announce Type: replace-cross Abstract: Supervised fine-tuning (SFT) of foundation models often leads to poor generalization, where prior capa

ArXiv cs.AI 📄 Paper 1w ago

Lifetime-Aware Design for Item-Level Intelligence at the Extreme Edge

arXiv:2509.08193v2 Announce Type: replace-cross Abstract: We present FlexiFlow, a lifetime-aware design framework for item-level intelligence (ILI) where comput