📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 4,742 articles · Updated every 3 hours

arXiv:2604.09621v1 Announce Type: new Abstract: We present an agent-driven approach to the construction of parameter inference pipelines for scientific data ana

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2d ago

How LLMs Might Think

arXiv:2604.09674v1 Announce Type: new Abstract: Do large language models (LLMs) think? Daniel Stoljar and Zhihe Vincent Zhang have recently developed an argumen

ArXiv cs.AI 📄 Paper 2d ago

Belief-Aware VLM Model for Human-like Reasoning

arXiv:2604.09686v1 Announce Type: new Abstract: Traditional neural network models for intent inference rely heavily on observable states and struggle to general

ArXiv cs.AI 📄 Paper 2d ago

Tipiano: Cascaded Piano Hand Motion Synthesis via Fingertip Priors

arXiv:2604.09692v1 Announce Type: new Abstract: Synthesizing realistic piano hand motions requires both precision and naturalness. Physics-based methods achieve

ArXiv cs.AI 📄 Paper 2d ago

The Myth of Expert Specialization in MoEs: Why Routing Reflects Geometry, Not Necessarily Domain Expertise

arXiv:2604.09780v1 Announce Type: new Abstract: Mixture of Experts (MoEs) are now ubiquitous in large language models, yet the mechanisms behind their "expert s

ArXiv cs.AI 📄 Paper 2d ago

Pioneer Agent: Continual Improvement of Small Language Models in Production

arXiv:2604.09791v1 Announce Type: new Abstract: Small language models are attractive for production deployment due to their low cost, fast inference, and ease o

ArXiv cs.AI 📄 Paper 2d ago

Controllable and Verifiable Tool-Use Data Synthesis for Agentic Reinforcement Learning

arXiv:2604.09813v1 Announce Type: new Abstract: Existing synthetic tool-use corpora are primarily designed for offline supervised fine-tuning, yet reinforcement

ArXiv cs.AI 📄 Paper 2d ago

EE-MCP: Self-Evolving MCP-GUI Agents via Automated Environment Generation and Experience Learning

arXiv:2604.09815v1 Announce Type: new Abstract: Computer-use agents that combine GUI interaction with structured API calls via the Model Context Protocol (MCP)

ArXiv cs.AI 📄 Paper 2d ago

COMPOSITE-Stem

arXiv:2604.09836v1 Announce Type: new Abstract: AI agents hold growing promise for accelerating scientific discovery; yet, a lack of frontier evaluations hinder

ArXiv cs.AI 📄 Paper 2d ago

Steered LLM Activations are Non-Surjective

arXiv:2604.09839v1 Announce Type: new Abstract: Activation steering is a popular white-box control technique that modifies model activations to elicit an abstra

ArXiv cs.AI 📄 Paper 2d ago

MEMENTO: Teaching LLMs to Manage Their Own Context

arXiv:2604.09852v1 Announce Type: new Abstract: Reasoning models think in long, unstructured streams with no mechanism for compressing or organizing their own i

ArXiv cs.AI 📄 Paper 2d ago

Instructing LLMs to Negotiate using Reinforcement Learning with Verifiable Rewards

arXiv:2604.09855v1 Announce Type: new Abstract: The recent advancement of Large Language Models (LLMs) has established their potential as autonomous interactive

ArXiv cs.AI 📄 Paper 2d ago

Evolutionary Token-Level Prompt Optimization for Diffusion Models

arXiv:2604.09861v1 Announce Type: new Abstract: Text-to-image diffusion models exhibit strong generative performance but remain highly sensitive to prompt formu

ArXiv cs.AI 📄 Paper 2d ago

What do your logits know? (The answer may surprise you!)

arXiv:2604.09885v1 Announce Type: new Abstract: Recent work has shown that probing model internals can reveal a wealth of information not apparent from the mode

ArXiv cs.AI 📄 Paper 2d ago

In-situ process monitoring for defect detection in wire-arc additive manufacturing: an agentic AI approach

arXiv:2604.09889v1 Announce Type: new Abstract: AI agents are being increasingly deployed across a wide range of real-world applications. In this paper, we prop

ArXiv cs.AI 📄 Paper 2d ago

GLEaN: A Text-to-image Bias Detection Approach for Public Comprehension

arXiv:2604.09923v1 Announce Type: new Abstract: Text-to-image (T2I) models, and their encoded biases, increasingly shape the visual media the public encounters.

ArXiv cs.AI 📄 Paper 2d ago

HealthAdminBench: Evaluating Computer-Use Agents on Healthcare Administration Tasks

arXiv:2604.09937v1 Announce Type: new Abstract: Healthcare administration accounts for over $1 trillion in annual spending, making it a promising target for LLM

ArXiv cs.AI 📄 Paper 2d ago

New Hybrid Fine-Tuning Paradigm for LLMs: Algorithm Design and Convergence Analysis Framework

arXiv:2604.09940v1 Announce Type: new Abstract: Fine-tuning Large Language Models (LLMs) typically involves either full fine-tuning, which updates all model par

ArXiv cs.AI 📄 Paper 2d ago

FinTrace: Holistic Trajectory-Level Evaluation of LLM Tool Calling for Long-Horizon Financial Tasks

arXiv:2604.10015v1 Announce Type: new Abstract: Recent studies demonstrate that tool-calling capability enables large language models (LLMs) to interact with ex

ArXiv cs.AI 📄 Paper 2d ago

AI Achieves a Perfect LSAT Score

arXiv:2604.10034v1 Announce Type: new Abstract: This paper reports the first documented instance of a language model achieving a perfect score on an officially

ArXiv cs.AI 📄 Paper 2d ago

LoopGuard: Breaking Self-Reinforcing Attention Loops via Dynamic KV Cache Intervention

arXiv:2604.10044v1 Announce Type: new Abstract: Through systematic experiments on long-context generation, we observe a damaging failure mode in which decoding

ArXiv cs.AI 📄 Paper 2d ago

Learning Hierarchical and Geometry-Aware Graph Representations for Text-to-CAD

arXiv:2604.10075v1 Announce Type: new Abstract: Text-to-CAD code generation is a long-horizon task that translates textual instructions into long sequences of i

ArXiv cs.AI 📄 Paper 2d ago

Ontological Trajectory Forecasting via Finite Semigroup Iteration and Lie Algebra Approximation in Geopolitical Knowledge Graphs

arXiv:2604.10087v1 Announce Type: new Abstract: We present EL-DRUIN, an ontological reasoning system for geopolitical intelligence analysis that combines formal

ArXiv cs.AI 📄 Paper 2d ago

Trust Your Memory: Verifiable Control of Smart Homes through Reinforcement Learning with Multi-dimensional Rewards

arXiv:2604.10110v1 Announce Type: new Abstract: Large Language Models (LLMs) have become a key foundation for enabling personalized smart home experiences. Whil