📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 5,060 articles · Updated every 3 hours

ArXiv cs.AI 📄 Paper 5d ago
Constraint-Aware Corrective Memory for Language-Based Drug Discovery Agents
arXiv:2604.09308v1 Announce Type: new Abstract: Large language models are making autonomous drug discovery agents increasingly feasible, but reliable success in
ArXiv cs.AI 📄 Paper 5d ago
Mind the Gap Between Spatial Reasoning and Acting! Step-by-Step Evaluation of Agents With Spatial-Gym
arXiv:2604.09338v1 Announce Type: new Abstract: Spatial reasoning is central to navigation and robotics, yet measuring model capabilities on these tasks remains
ArXiv cs.AI 📄 Paper 5d ago
HiL-Bench (Human-in-Loop Benchmark): Do Agents Know When to Ask for Help?
arXiv:2604.09408v1 Announce Type: new Abstract: Frontier coding agents solve complex tasks when given complete context but collapse when specifications are inco
ArXiv cs.AI 📄 Paper 5d ago
Do We Really Need to Approach the Entire Pareto Front in Many-Objective Bayesian Optimisation?
arXiv:2604.09417v1 Announce Type: new Abstract: Many-objective optimisation, a subset of multi-objective optimisation, involves optimisation problems with more
ArXiv cs.AI 📄 Paper 5d ago
E3-TIR: Enhanced Experience Exploitation for Tool-Integrated Reasoning
arXiv:2604.09455v1 Announce Type: new Abstract: While Large Language Models (LLMs) have demonstrated significant potential in Tool-Integrated Reasoning (TIR), e
ArXiv cs.AI 📄 Paper 5d ago
Process Reward Agents for Steering Knowledge-Intensive Reasoning
arXiv:2604.09482v1 Announce Type: new Abstract: Reasoning in knowledge-intensive domains remains challenging as intermediate steps are often not locally verifia
ArXiv cs.AI 📄 Paper 5d ago
Strategic Algorithmic Monoculture: Experimental Evidence from Coordination Games
arXiv:2604.09502v1 Announce Type: new Abstract: AI agents increasingly operate in multi-agent environments where outcomes depend on coordination. We distinguish
ArXiv cs.AI 📄 Paper 5d ago
On Divergence Measures for Training GFlowNets
arXiv:2410.09355v2 Announce Type: cross Abstract: Generative Flow Networks (GFlowNets) are amortized inference models designed to sample from unnormalized distr
ArXiv cs.AI 📄 Paper 5d ago
Towards Real-world Human Behavior Simulation: Benchmarking Large Language Models on Long-horizon, Cross-scenario, Heterogeneous Behavior Traces
arXiv:2604.08362v1 Announce Type: cross Abstract: The emergence of Large Language Models (LLMs) has illuminated the potential for a general-purpose user simulat
ArXiv cs.AI 📄 Paper 5d ago
VerifAI: A Verifiable Open-Source Search Engine for Biomedical Question Answering
arXiv:2604.08549v1 Announce Type: cross Abstract: We introduce VerifAI, an open-source expert system for biomedical question answering that integrates retrieval
ArXiv cs.AI 📄 Paper 5d ago
Unbiased Rectification for Sequential Recommender Systems Under Fake Orders
arXiv:2604.08550v1 Announce Type: cross Abstract: Fake orders pose increasing threats to sequential recommender systems by misleading recommendation results thr
ArXiv cs.AI 📄 Paper 5d ago
Automated Standardization of Legacy Biomedical Metadata Using an Ontology-Constrained LLM Agent
arXiv:2604.08552v1 Announce Type: cross Abstract: Scientific metadata are often incomplete and noncompliant with community standards, limiting dataset findabili
ArXiv cs.AI 📄 Paper 5d ago
GNN-as-Judge: Unleashing the Power of LLMs for Graph Learning with GNN Feedback
arXiv:2604.08553v1 Announce Type: cross Abstract: Large Language Models (LLMs) have shown strong performance on text-attributed graphs (TAGs) due to their super
ArXiv cs.AI 📄 Paper 5d ago
Drift and selection in LLM text ecosystems
arXiv:2604.08554v1 Announce Type: cross Abstract: The public text record -- the material from which both people and AI systems now learn -- is increasingly shap
ArXiv cs.AI 📄 Paper 5d ago
EMA Is Not All You Need: Mapping the Boundary Between Structure and Content in Recurrent Context
arXiv:2604.08556v1 Announce Type: cross Abstract: What exactly do efficient sequence models gain over simple temporal averaging? We use exponential moving avera
ArXiv cs.AI 📄 Paper 5d ago
Re-Mask and Redirect: Exploiting Denoising Irreversibility in Diffusion Language Models
arXiv:2604.08557v1 Announce Type: cross Abstract: Diffusion-based language models (dLLMs) generate text by iteratively denoising masked token sequences. We show
ArXiv cs.AI 📄 Paper 5d ago
WAND: Windowed Attention and Knowledge Distillation for Efficient Autoregressive Text-to-Speech Models
arXiv:2604.08558v1 Announce Type: cross Abstract: Recent decoder-only autoregressive text-to-speech (AR-TTS) models produce high-fidelity speech, but their memo
ArXiv cs.AI 📄 Paper 5d ago
Medical Reasoning with Large Language Models: A Survey and MR-Bench
arXiv:2604.08559v1 Announce Type: cross Abstract: Large language models (LLMs) have achieved strong performance on medical exam-style tasks, motivating growing
ArXiv cs.AI 📄 Paper 5d ago
Uncertainty Estimation for the Open-Set Text Classification systems
arXiv:2604.08560v1 Announce Type: cross Abstract: Accurate uncertainty estimation is essential for building robust and trustworthy recognition systems. In this
ArXiv cs.AI 📄 Paper 5d ago
Neural networks for Text-to-Speech evaluation
arXiv:2604.08562v1 Announce Type: cross Abstract: Ensuring that Text-to-Speech (TTS) systems deliver human-perceived quality at scale is a central challenge for