📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 7,014 articles · Updated every 3 hours

AgencyBench: Benchmarking the Frontiers of Autonomous Agents in 1M-Token Real-World Contexts

arXiv:2601.11044v3 Announce Type: replace Abstract: Large Language Models (LLMs) based autonomous agents demonstrate multifaceted capabilities to contribute sub

ArXiv cs.AI 📄 Paper 1w ago

Subargument Argumentation Frameworks: Separating Direct Conflict from Structural Dependency

arXiv:2601.12038v3 Announce Type: replace Abstract: Dung's abstract argumentation frameworks model acceptability solely in terms of an attack relation, thereby

ArXiv cs.AI 📄 Paper 1w ago

Risk Awareness Injection: Calibrating Vision-Language Models for Safety without Compromising Utility

arXiv:2602.03402v3 Announce Type: replace Abstract: Vision language models (VLMs) extend the reasoning capabilities of large language models (LLMs) to cross-mod

ArXiv cs.AI 📄 Paper 1w ago

ANCHOR: Branch-Point Data Generation for GUI Agents

arXiv:2602.07153v2 Announce Type: replace Abstract: End-to-end GUI agents for real desktop environments require large amounts of high-quality interaction data,

ArXiv cs.AI 📄 Paper 1w ago

X-SYS: A Reference Architecture for Interactive Explanation Systems

arXiv:2602.12748v3 Announce Type: replace Abstract: The explainable AI (XAI) research community has proposed numerous technical methods, yet deploying explainab

ArXiv cs.AI 📄 Paper 1w ago

Constrained Assumption-Based Argumentation Frameworks

arXiv:2602.13135v2 Announce Type: replace Abstract: Assumption-based Argumentation (ABA) is a well-established form of structured argumentation. ABA frameworks

ArXiv cs.AI 📄 Paper 1w ago

Hunt Globally: Wide Search AI Agents for Drug Asset Scouting in Investing, Business Development, and Competitive Intelligence

arXiv:2602.15019v3 Announce Type: replace Abstract: Bio-pharmaceutical innovation has shifted: many new drug assets now originate outside the United States and

ArXiv cs.AI 📄 Paper 1w ago

FlexMS is a flexible framework for benchmarking deep learning-based mass spectrum prediction tools in metabolomics

arXiv:2602.22822v2 Announce Type: replace Abstract: The identification and property prediction of chemical molecules is of central importance in the advancement

ArXiv cs.AI 📄 Paper 1w ago

Nano-EmoX: Unifying Multimodal Emotional Intelligence from Perception to Empathy

arXiv:2603.02123v3 Announce Type: replace Abstract: The development of affective multimodal language models (MLMs) has long been constrained by a gap between lo

ArXiv cs.AI 📄 Paper 1w ago

Diagnosing Retrieval vs. Utilization Bottlenecks in LLM Agent Memory

arXiv:2603.02473v2 Announce Type: replace Abstract: Memory-augmented LLM agents store and retrieve information from prior interactions, yet the relative importa

ArXiv cs.AI 📄 Paper 1w ago

Do Machines Fail Like Humans? A Human-Centred Out-of-Distribution Spectrum for Mapping Error Alignment

arXiv:2603.07462v2 Announce Type: replace Abstract: Determining whether AI systems process information similarly to humans is central to cognitive science and t

ArXiv cs.AI 📄 Paper 1w ago

Normative Common Ground Replication (NormCoRe): Replication-by-Translation for Studying Norms in Multi-Agent AI

arXiv:2603.11974v2 Announce Type: replace Abstract: In the late 2010s, the fashion trend NormCore framed sameness as a signal of belonging, illustrating how nor

ArXiv cs.AI 📄 Paper 1w ago

dTRPO: Trajectory Reduction in Policy Optimization of Diffusion Large Language Models

arXiv:2603.18806v2 Announce Type: replace Abstract: Diffusion Large Language Models (dLLMs) introduce a new paradigm for language generation, which in turn pres

ArXiv cs.AI 📄 Paper 1w ago

Quantitative Introspection in Language Models: Tracking Emotive States Across Conversation

arXiv:2603.18893v2 Announce Type: replace Abstract: Tracking the internal states of large language models across conversations is important for safety, interpre

ArXiv cs.AI 📄 Paper 1w ago

CoEvoSkills: Self-Evolving Agent Skills via Co-Evolutionary Verification

arXiv:2604.01687v2 Announce Type: replace Abstract: Anthropic proposes the concept of skills for LLM agents to tackle multi-step professional tasks that simple

ArXiv cs.AI 📄 Paper 1w ago

EmoMAS: Emotion-Aware Multi-Agent System for High-Stakes Edge-Deployable Negotiation with Bayesian Orchestration

arXiv:2604.07003v2 Announce Type: replace Abstract: Large language models (LLMs) have been widely used for automated negotiation, but their high computational co

ArXiv cs.AI 📄 Paper 1w ago

EVGeoQA: Benchmarking LLMs on Dynamic, Multi-Objective Geo-Spatial Exploration

arXiv:2604.07070v2 Announce Type: replace Abstract: While Large Language Models (LLMs) demonstrate remarkable reasoning capabilities, their potential for purpos

ArXiv cs.AI 📄 Paper 1w ago

Rhizome OS-1: Rhizome's Semi-Autonomous Operating System for Small Molecule Drug Discovery

arXiv:2604.07512v2 Announce Type: replace Abstract: We present Rhizome OS-1, a semi-autonomous operating system for small molecule drug discovery in which multi

ArXiv cs.AI 📄 Paper 1w ago

IatroBench: Pre-Registered Evidence of Iatrogenic Harm from AI Safety Measures

arXiv:2604.07709v2 Announce Type: replace Abstract: Ask a frontier model how to taper six milligrams of alprazolam (psychiatrist retired, ten days of pills left

ArXiv cs.AI 📄 Paper 1w ago

SEARL: Joint Optimization of Policy and Tool Graph Memory for Self-Evolving Agents

arXiv:2604.07791v2 Announce Type: replace Abstract: Recent advances in Reinforcement Learning with Verifiable Rewards (RLVR) have demonstrated significant poten

ArXiv cs.AI 📄 Paper 1w ago

AXIL: Exact Instance Attribution for Gradient Boosting

arXiv:2301.01864v2 Announce Type: replace-cross Abstract: We derive an exact, prediction-specific instance-attribution method for fitted gradient boosting machi

ArXiv cs.AI 📄 Paper 1w ago

Template-assisted Contrastive Learning of Task-oriented Dialogue Sentence Embeddings

arXiv:2305.14299v3 Announce Type: replace-cross Abstract: Learning high-quality sentence embeddings from dialogues has drawn increasing attention as it is esse

ArXiv cs.AI 📄 Paper 1w ago

SCITUNE: Aligning Large Language Models with Human-Curated Scientific Multimodal Instructions

arXiv:2307.01139v2 Announce Type: replace-cross Abstract: Instruction finetuning is a popular paradigm to align large language models (LLM) with human intent. D

ArXiv cs.AI 📄 Paper 1w ago

MM-LIMA: Less Is More for Alignment in Multi-Modal Datasets

arXiv:2308.12067v3 Announce Type: replace-cross Abstract: Multimodal large language models are typically trained in two stages: first pre-training on image-text