📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 5,060 articles · Updated every 3 hours

IAD-Unify: A Region-Grounded Unified Model for Industrial Anomaly Segmentation, Understanding, and Generation

arXiv:2604.12440v1 Announce Type: cross Abstract: Real-world industrial inspection requires not only localizing defects, but also explaining them in natural lan

ArXiv cs.AI 📄 Paper 2d ago

X-VC: Zero-shot Streaming Voice Conversion in Codec Space

arXiv:2604.12456v1 Announce Type: cross Abstract: Zero-shot voice conversion (VC) aims to convert a source utterance into the voice of an unseen target speaker

ArXiv cs.AI 📄 Paper 2d ago

Euler-inspired Decoupling Neural Operator for Efficient Pansharpening

arXiv:2604.12463v1 Announce Type: cross Abstract: Pansharpening aims to synthesize high-resolution multispectral (HR-MS) images by fusing the spatial textures o

ArXiv cs.AI 📄 Paper 2d ago

From Kinematics to Dynamics: Learning to Refine Hybrid Plans for Physically Feasible Execution

arXiv:2604.12474v1 Announce Type: cross Abstract: In many robotic tasks, agents must traverse a sequence of spatial regions to complete a mission. Such problems

ArXiv cs.AI 📄 Paper 2d ago

Mining Large Language Models for Low-Resource Language Data: Comparing Elicitation Strategies for Hausa and Fongbe

arXiv:2604.12477v1 Announce Type: cross Abstract: Large language models (LLMs) are trained on data contributed by low-resource language communities, yet the lin

ArXiv cs.AI 📄 Paper 2d ago

Audio Source Separation in Reverberant Environments using $\beta$-divergence based Nonnegative Factorization

arXiv:2604.12480v1 Announce Type: cross Abstract: In Gaussian model-based multichannel audio source separation, the likelihood of observed mixtures of source si

ArXiv cs.AI 📄 Paper 2d ago

Social Learning Strategies for Evolved Virtual Soft Robots

arXiv:2604.12482v1 Announce Type: cross Abstract: Optimizing the body and brain of a robot is a coupled challenge: the morphology determines what control strate

ArXiv cs.AI 📄 Paper 2d ago

Elastic Net Regularization and Gabor Dictionary for Classification of Heart Sound Signals using Deep Learning

arXiv:2604.12483v1 Announce Type: cross Abstract: In this article, we propose the optimization of the resolution of time-frequency atoms and the regularization

ArXiv cs.AI 📄 Paper 2d ago

KG-Reasoner: A Reinforced Model for End-to-End Multi-Hop Knowledge Graph Reasoning

arXiv:2604.12487v1 Announce Type: cross Abstract: Large Language Models (LLMs) exhibit strong abilities in natural language understanding and generation, yet th

ArXiv cs.AI 📄 Paper 2d ago

Deepfakes at Face Value: Image and Authority

arXiv:2604.12490v1 Announce Type: cross Abstract: Deepfakes are synthetic media that superimpose or generate someone's likeness on to pre-existing sound, images

ArXiv cs.AI 📄 Paper 2d ago

Latent Planning Emerges with Scale

arXiv:2604.12493v1 Announce Type: cross Abstract: LLMs can perform seemingly planning-intensive tasks, like writing coherent stories or functioning code, withou

ArXiv cs.AI 📄 Paper 2d ago

Lit2Vec: A Reproducible Workflow for Building a Legally Screened Chemistry Corpus from S2ORC for Downstream Retrieval and Text Mining

arXiv:2604.12498v1 Announce Type: cross Abstract: We present Lit2Vec, a reproducible workflow for constructing and validating a chemistry corpus from the Semant

ArXiv cs.AI 📄 Paper 2d ago

SEATrack: Simple, Efficient, and Adaptive Multimodal Tracker

arXiv:2604.12502v1 Announce Type: cross Abstract: Parameter-efficient fine-tuning (PEFT) in multimodal tracking reveals a concerning trend where recent performa

ArXiv cs.AI 📄 Paper 2d ago

Topology-Aware Reasoning over Incomplete Knowledge Graph with Graph-Based Soft Prompting

arXiv:2604.12503v1 Announce Type: cross Abstract: Large Language Models (LLMs) have shown remarkable capabilities across various tasks but remain prone to hallu

ArXiv cs.AI 📄 Paper 2d ago

NTIRE 2026 The 3rd Restore Any Image Model (RAIM) Challenge: Professional Image Quality Assessment (Track 1)

arXiv:2604.12512v1 Announce Type: cross Abstract: In this paper, we present an overview of the NTIRE 2026 challenge on the 3rd Restore Any Image Model in the Wi

ArXiv cs.AI 📄 Paper 2d ago

Orthogonal Subspace Projection for Continual Machine Unlearning via SVD-Based LoRA

arXiv:2604.12526v1 Announce Type: cross Abstract: Continual machine unlearning aims to remove the influence of data that should no longer be retained, while pre

ArXiv cs.AI 📄 Paper 2d ago

MODIX: A Training-Free Multimodal Information-Driven Positional Index Scaling for Vision-Language Models

arXiv:2604.12537v1 Announce Type: cross Abstract: Vision-Language Models (VLMs) have achieved remarkable progress in multimodal understanding, yet their positio

ArXiv cs.AI 📄 Paper 2d ago

When Does Data Augmentation Help? Evaluating LLM and Back-Translation Methods for Hausa and Fongbe NLP

arXiv:2604.12540v1 Announce Type: cross Abstract: Data scarcity limits NLP development for low-resource African languages. We evaluate two data augmentation met

ArXiv cs.AI 📄 Paper 2d ago

KumoRFM-2: Scaling Foundation Models for Relational Learning

arXiv:2604.12596v1 Announce Type: cross Abstract: We introduce KumoRFM-2, the next iteration of a pre-trained foundation model for relational data. KumoRFM-2 su

ArXiv cs.AI 📄 Paper 2d ago

LLM-Guided Prompt Evolution for Password Guessing

arXiv:2604.12601v1 Announce Type: cross Abstract: Passwords still remain a dominant authentication method, yet their security is routinely subverted by predicta

ArXiv cs.AI 📄 Paper 2d ago

SOAR: Self-Correction for Optimal Alignment and Refinement in Diffusion Models

arXiv:2604.12617v1 Announce Type: cross Abstract: The post-training pipeline for diffusion models currently has two stages: supervised fine-tuning (SFT) on cura

ArXiv cs.AI 📄 Paper 2d ago

Efficient Semantic Image Communication for Traffic Monitoring at the Edge

arXiv:2604.12622v1 Announce Type: cross Abstract: Many visual monitoring systems operate under strict communication constraints, where transmitting full-resolut

ArXiv cs.AI 📄 Paper 2d ago

Neural Dynamic GI: Random-Access Neural Compression for Temporal Lightmaps in Dynamic Lighting Environments

arXiv:2604.12625v1 Announce Type: cross Abstract: High-quality global illumination (GI) in real-time rendering is commonly achieved using precomputed lighting t

ArXiv cs.AI 📄 Paper 2d ago

Calibration-Aware Policy Optimization for Reasoning LLMs

arXiv:2604.12632v1 Announce Type: cross Abstract: Group Relative Policy Optimization (GRPO) enhances LLM reasoning but often induces overconfidence, where incor