📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 5,060 articles · Updated every 3 hours

Do Machines Fail Like Humans? A Human-Centred Out-of-Distribution Spectrum for Mapping Error Alignment

arXiv:2603.07462v2 Announce Type: replace Abstract: Determining whether AI systems process information similarly to humans is central to cognitive science and t

ArXiv cs.AI 📄 Paper 4d ago

Normative Common Ground Replication (NormCoRe): Replication-by-Translation for Studying Norms in Multi-Agent AI

arXiv:2603.11974v2 Announce Type: replace Abstract: In the late 2010s, the fashion trend NormCore framed sameness as a signal of belonging, illustrating how nor

ArXiv cs.AI 📄 Paper 4d ago

dTRPO: Trajectory Reduction in Policy Optimization of Diffusion Large Language Models

arXiv:2603.18806v2 Announce Type: replace Abstract: Diffusion Large Language Models (dLLMs) introduce a new paradigm for language generation, which in turn pres

ArXiv cs.AI 📄 Paper 4d ago

Quantitative Introspection in Language Models: Tracking Emotive States Across Conversation

arXiv:2603.18893v2 Announce Type: replace Abstract: Tracking the internal states of large language models across conversations is important for safety, interpre

ArXiv cs.AI 📄 Paper 4d ago

CoEvoSkills: Self-Evolving Agent Skills via Co-Evolutionary Verification

arXiv:2604.01687v2 Announce Type: replace Abstract: Anthropic proposes the concept of skills for LLM agents to tackle multi-step professional tasks that simple

ArXiv cs.AI 📄 Paper 4d ago

EmoMAS: Emotion-Aware Multi-Agent System for High-Stakes Edge-Deployable Negotiation with Bayesian Orchestration

arXiv:2604.07003v2 Announce Type: replace Abstract: Large language models (LLMs) have been widely used for automated negotiation, but their high computational co

ArXiv cs.AI 📄 Paper 4d ago

EVGeoQA: Benchmarking LLMs on Dynamic, Multi-Objective Geo-Spatial Exploration

arXiv:2604.07070v2 Announce Type: replace Abstract: While Large Language Models (LLMs) demonstrate remarkable reasoning capabilities, their potential for purpos

ArXiv cs.AI 📄 Paper 4d ago

Rhizome OS-1: Rhizome's Semi-Autonomous Operating System for Small Molecule Drug Discovery

arXiv:2604.07512v2 Announce Type: replace Abstract: We present Rhizome OS-1, a semi-autonomous operating system for small molecule drug discovery in which multi

ArXiv cs.AI 📄 Paper 4d ago

IatroBench: Pre-Registered Evidence of Iatrogenic Harm from AI Safety Measures

arXiv:2604.07709v2 Announce Type: replace Abstract: Ask a frontier model how to taper six milligrams of alprazolam (psychiatrist retired, ten days of pills left

ArXiv cs.AI 📄 Paper 4d ago

SEARL: Joint Optimization of Policy and Tool Graph Memory for Self-Evolving Agents

arXiv:2604.07791v2 Announce Type: replace Abstract: Recent advances in Reinforcement Learning with Verifiable Rewards (RLVR) have demonstrated significant poten

ArXiv cs.AI 📄 Paper 4d ago

AXIL: Exact Instance Attribution for Gradient Boosting

arXiv:2301.01864v2 Announce Type: replace-cross Abstract: We derive an exact, prediction-specific instance-attribution method for fitted gradient boosting machi

ArXiv cs.AI 📄 Paper 4d ago

Template-assisted Contrastive Learning of Task-oriented Dialogue Sentence Embeddings

arXiv:2305.14299v3 Announce Type: replace-cross Abstract: Learning high-quality sentence embeddings from dialogues has drawn increasing attention as it is esse

ArXiv cs.AI 📄 Paper 4d ago

SCITUNE: Aligning Large Language Models with Human-Curated Scientific Multimodal Instructions

arXiv:2307.01139v2 Announce Type: replace-cross Abstract: Instruction finetuning is a popular paradigm to align large language models (LLM) with human intent. D

ArXiv cs.AI 📄 Paper 4d ago

MM-LIMA: Less Is More for Alignment in Multi-Modal Datasets

arXiv:2308.12067v3 Announce Type: replace-cross Abstract: Multimodal large language models are typically trained in two stages: first pre-training on image-text

ArXiv cs.AI 📄 Paper 4d ago

CROP: Conservative Reward for Model-based Offline Policy Optimization

arXiv:2310.17245v2 Announce Type: replace-cross Abstract: Offline reinforcement learning (RL) aims to optimize a policy using collected data without online inte

ArXiv cs.AI 📄 Paper 4d ago

Language Reconstruction with Brain Predictive Coding from fMRI Data

arXiv:2405.11597v2 Announce Type: replace-cross Abstract: Many recent studies have shown that the perception of speech can be decoded from brain signals and sub

ArXiv cs.AI 📄 Paper 4d ago

An Iterative Utility Judgment Framework Inspired by Philosophical Relevance via LLMs

arXiv:2406.11290v3 Announce Type: replace-cross Abstract: Relevance and utility are two frequently used measures to evaluate the effectiveness of an information

ArXiv cs.AI 📄 Paper 4d ago

Linear Attention Based Deep Nonlocal Means Filtering for Multiplicative Noise Removal

arXiv:2407.05087v2 Announce Type: replace-cross Abstract: Multiplicative noise widely exists in radar images, medical images and other important fields' images.

ArXiv cs.AI 📄 Paper 4d ago

Deep deterministic policy gradient with symmetric data augmentation for lateral attitude tracking control of a fixed-wing aircraft

arXiv:2407.11077v3 Announce Type: replace-cross Abstract: The symmetry of dynamical systems can be exploited for state-transition prediction and to facilitate c

ArXiv cs.AI 📄 Paper 4d ago

Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading

arXiv:2410.21316v2 Announce Type: replace-cross Abstract: Transformers and large language models (LLMs) have seen rapid adoption in all domains. Their sizes hav

ArXiv cs.AI 📄 Paper 4d ago

The Phantom of PCIe: Constraining Generative Artificial Intelligences for Practical Peripherals Trace Synthesizing

arXiv:2411.06376v3 Announce Type: replace-cross Abstract: Peripheral Component Interconnect Express (PCIe) is the de facto interconnect standard for high-speed

ArXiv cs.AI 📄 Paper 4d ago

PoTable: Towards Systematic Thinking via Plan-then-Execute Stage Reasoning on Tables

arXiv:2412.04272v5 Announce Type: replace-cross Abstract: In recent years, table reasoning has garnered substantial research interest, particularly regarding it

ArXiv cs.AI 📄 Paper 4d ago

WebLLM: A High-Performance In-Browser LLM Inference Engine

arXiv:2412.15803v2 Announce Type: replace-cross Abstract: Advancements in large language models (LLMs) have unlocked remarkable capabilities. While deploying th

ArXiv cs.AI 📄 Paper 4d ago

HumanVBench: Probing Human-Centric Video Understanding in MLLMs with Automatically Synthesized Benchmarks

arXiv:2412.17574v3 Announce Type: replace-cross Abstract: Evaluating the nuanced human-centric video understanding capabilities of Multimodal Large Language Mod