📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 5,060 articles · Updated every 3 hours

BankerToolBench: Evaluating AI Agents in End-to-End Investment Banking Workflows

arXiv:2604.11304v1 Announce Type: new Abstract: Existing AI benchmarks lack the fidelity to assess economically meaningful progress on professional workflows. T

ArXiv cs.AI 📄 Paper 3d ago

PaperScope: A Multi-Modal Multi-Document Benchmark for Agentic Deep Research Across Massive Scientific Papers

arXiv:2604.11307v1 Announce Type: new Abstract: Leveraging Multi-modal Large Language Models (MLLMs) to accelerate frontier scientific research is promising, ye

ArXiv cs.AI 📄 Paper 3d ago

Select Smarter, Not More: Prompt-Aware Evaluation Scheduling with Submodular Guarantees

arXiv:2604.11328v1 Announce Type: new Abstract: Automatic prompt optimization (APO) hinges on the quality of its evaluation signal, yet scoring every prompt can

ArXiv cs.AI 📄 Paper 3d ago

Dynamic Summary Generation for Interpretable Multimodal Depression Detection

arXiv:2604.11334v1 Announce Type: new Abstract: Depression remains widely underdiagnosed and undertreated because stigma and subjective symptom ratings hinder r

ArXiv cs.AI 📄 Paper 3d ago

CoRe-ECG: Advancing Self-Supervised Representation Learning for 12-Lead ECG via Contrastive and Reconstructive Synergy

arXiv:2604.11359v1 Announce Type: new Abstract: Accurate interpretation of electrocardiogram (ECG) remains challenging due to the scarcity of labeled data and t

ArXiv cs.AI 📄 Paper 3d ago

The Missing Knowledge Layer in Cognitive Architectures for AI Agents

arXiv:2604.11364v1 Announce Type: new Abstract: The two most influential cognitive architecture frameworks for AI agents, CoALA [21] and JEPA [12], both lack an

ArXiv cs.AI 📄 Paper 3d ago

Learning from Contrasts: Synthesizing Reasoning Paths from Diverse Search Trajectories

arXiv:2604.11365v1 Announce Type: new Abstract: Monte Carlo Tree Search (MCTS) has been widely used for automated reasoning data exploration, but current superv

ArXiv cs.AI 📄 Paper 3d ago

From Agent Loops to Structured Graphs: A Scheduler-Theoretic Framework for LLM Agent Execution

arXiv:2604.11378v1 Announce Type: new Abstract: The dominant paradigm for building LLM based agents is the Agent Loop, an iterative cycle where a single languag

ArXiv cs.AI 📄 Paper 3d ago

Beyond RAG for Cyber Threat Intelligence: A Systematic Evaluation of Graph-Based and Agentic Retrieval

arXiv:2604.11419v1 Announce Type: new Abstract: Cyber threat intelligence (CTI) analysts must answer complex questions over large collections of narrative secur

ArXiv cs.AI 📄 Paper 3d ago

Escaping the Context Bottleneck: Active Context Curation for LLM Agents via Reinforcement Learning

arXiv:2604.11462v1 Announce Type: new Abstract: Large Language Models (LLMs) struggle with long-horizon tasks due to the "context bottleneck" and the "lost-in-t

ArXiv cs.AI 📄 Paper 3d ago

Three Roles, One Model: Role Orchestration at Inference Time to Close the Performance Gap Between Small and Large Agents

arXiv:2604.11465v1 Announce Type: new Abstract: Large language model (LLM) agents show promise on realistic tool-use tasks, but deploying capable agents on mode

ArXiv cs.AI 📄 Paper 3d ago

From Attribution to Action: A Human-Centered Application of Activation Steering

arXiv:2604.11467v1 Announce Type: new Abstract: Explainable AI (XAI) methods reveal which features influence model predictions, yet provide limited means for pr

ArXiv cs.AI 📄 Paper 3d ago

OOM-RL: Out-of-Money Reinforcement Learning Market-Driven Alignment for LLM-Based Multi-Agent Systems

arXiv:2604.11477v1 Announce Type: new Abstract: The alignment of Multi-Agent Systems (MAS) for autonomous software engineering is constrained by evaluator epist

ArXiv cs.AI 📄 Paper 3d ago

On the Complexity of the Discussion-based Semantics in Abstraction Argumentation

arXiv:2604.11480v1 Announce Type: new Abstract: We show that deciding whether an argument a is stronger than an argument b with respect to the discussion-based

ArXiv cs.AI 📄 Paper 3d ago

Anthropogenic Regional Adaptation in Multimodal Vision-Language Model

arXiv:2604.11490v1 Announce Type: new Abstract: While the field of vision-language (VL) has achieved remarkable success in integrating visual and textual inform

ArXiv cs.AI 📄 Paper 3d ago

Lectures on AI for Mathematics

arXiv:2604.11504v1 Announce Type: new Abstract: This book provides a comprehensive and accessible introduction to the emerging field of AI for mathematics. It c

ArXiv cs.AI 📄 Paper 3d ago

PAC-BENCH: Evaluating Multi-Agent Collaboration under Privacy Constraints

arXiv:2604.11523v1 Announce Type: new Abstract: We are entering an era in which individuals and organizations increasingly deploy dedicated AI agents that inter

ArXiv cs.AI 📄 Paper 3d ago

Limited Perfect Monotonical Surrogates constructed using low-cost recursive linkage discovery with guaranteed output

arXiv:2604.11524v1 Announce Type: new Abstract: Surrogates provide a cheap solution evaluation and offer significant leverage for optimizing computationally exp

ArXiv cs.AI 📄 Paper 3d ago

Problem Reductions at Scale: Agentic Integration of Computationally Hard Problems

arXiv:2604.11535v1 Announce Type: new Abstract: Solving an NP-hard optimization problem often requires reformulating it for a specific solver -- quantum hardwar

ArXiv cs.AI 📄 Paper 3d ago

A collaborative agent with two lightweight synergistic models for autonomous crystal materials research

arXiv:2604.11540v1 Announce Type: new Abstract: Current large language models require hundreds of billions of parameters yet struggle with domain-specific reaso

ArXiv cs.AI 📄 Paper 3d ago

SemaClaw: A Step Towards General-Purpose Personal AI Agents through Harness Engineering

arXiv:2604.11548v1 Announce Type: new Abstract: The rise of OpenClaw in early 2026 marks the moment when millions of users began deploying personal AI agents in

ArXiv cs.AI 📄 Paper 3d ago

UniToolCall: Unifying Tool-Use Representation, Data, and Evaluation for LLM Agents

arXiv:2604.11557v1 Announce Type: new Abstract: Tool-use capability is a fundamental component of LLM agents, enabling them to interact with external systems th

ArXiv cs.AI 📄 Paper 3d ago

Intersectional Sycophancy: How Perceived User Demographics Shape False Validation in Large Language Models

arXiv:2604.11609v1 Announce Type: new Abstract: Large language models exhibit sycophantic tendencies--validating incorrect user beliefs to appear agreeable. We

ArXiv cs.AI 📄 Paper 3d ago

Context Kubernetes: Declarative Orchestration of Enterprise Knowledge for Agentic AI Systems

arXiv:2604.11623v1 Announce Type: new Abstract: We introduce Context Kubernetes, an architecture for orchestrating enterprise knowledge in agentic AI systems, w