📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 8,253 articles · Updated every 3 hours · View all reads

All ⚡ AI Lessons (21843) ArXiv cs.AI Dev.to AI Medium · AI Medium · Programming Forbes Innovation Medium · Machine Learning

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Brittlebench: Quantifying LLM robustness via prompt sensitivity

arXiv:2603.13285v2 Announce Type: replace-cross Abstract: Existing evaluation methods largely rely on clean, static benchmarks, which can overestimate true mode

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3w ago

AI-Driven Predictive Maintenance with Environmental Context Integration for Connected Vehicles: Simulation, Benchmarking, and Field Validation

arXiv:2603.13343v2 Announce Type: replace-cross Abstract: Predictive maintenance for connected vehicles offers the potential to reduce unexpected breakdowns and

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Generate Then Correct: Single Shot Global Correction for Aspect Sentiment Quad Prediction

arXiv:2603.13777v2 Announce Type: replace-cross Abstract: Aspect-based sentiment analysis (ABSA) extracts aspect-level sentiment signals from user-generated tex

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3w ago

Fine-tuning is Not Enough: A Parallel Framework for Collaborative Imitation and Reinforcement Learning in End-to-end Autonomous Driving

arXiv:2603.13842v2 Announce Type: replace-cross Abstract: End-to-end autonomous driving is typically built upon imitation learning (IL), yet its performance is

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Adaptive Stopping for Multi-Turn LLM Reasoning

arXiv:2604.01413v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) increasingly rely on multi-turn reasoning and interaction, such as adapti

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3w ago

GPA: Learning GUI Process Automation from Demonstrations

arXiv:2604.01676v2 Announce Type: replace-cross Abstract: GUI Process Automation (GPA) is a lightweight but general vision-based Robotic Process Automation (RPA

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Holos: A Web-Scale LLM-Based Multi-Agent System for the Agentic Web

arXiv:2604.02334v1 Announce Type: new Abstract: As large language models (LLM)-driven agents transition from isolated task solvers to persistent digital entitie

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Xpertbench: Expert Level Tasks with Rubrics-Based Evaluation

arXiv:2604.02368v1 Announce Type: new Abstract: As Large Language Models (LLMs) exhibit plateauing performance on conventional benchmarks, a pivotal challenge p

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Compositional Neuro-Symbolic Reasoning

arXiv:2604.02434v1 Announce Type: new Abstract: We study structured abstraction-based reasoning for the Abstraction and Reasoning Corpus (ARC) and compare its g

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Understanding the Nature of Generative AI as Threshold Logic in High-Dimensional Space

arXiv:2604.02476v1 Announce Type: new Abstract: This paper examines the role of threshold logic in understanding generative artificial intelligence. Threshold f

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

AIVV: Neuro-Symbolic LLM Agent-Integrated Verification and Validation for Trustworthy Autonomous Systems

arXiv:2604.02478v1 Announce Type: new Abstract: Deep learning models excel at detecting anomaly patterns in normal data. However, they do not provide a direct s

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

I must delete the evidence: AI Agents Explicitly Cover up Fraud and Violent Crime

arXiv:2604.02500v1 Announce Type: new Abstract: As ongoing research explores the ability of AI agents to be insider threats and act against company interests, w

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3w ago

A Comprehensive Framework for Long-Term Resiliency Investment Planning under Extreme Weather Uncertainty for Electric Utilities

arXiv:2604.02504v1 Announce Type: new Abstract: Electric utilities must make massive capital investments in the coming years to respond to explosive growth in d

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Interpretable Deep Reinforcement Learning for Element-level Bridge Life-cycle Optimization

arXiv:2604.02528v1 Announce Type: new Abstract: The new Specifications for the National Bridge Inventory (SNBI), in effect from 2022, emphasize the use of eleme

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Competency Questions as Executable Plans: a Controlled RAG Architecture for Cultural Heritage Storytelling

arXiv:2604.02545v1 Announce Type: new Abstract: The preservation of intangible cultural heritage is a critical challenge as collective memory fades over time. W

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Mitigating LLM biases toward spurious social contexts using direct preference optimization

arXiv:2604.02585v1 Announce Type: new Abstract: LLMs are increasingly used for high-stakes decision-making, yet their sensitivity to spurious contextual informa

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Do Audio-Visual Large Language Models Really See and Hear?

arXiv:2604.02605v1 Announce Type: new Abstract: Audio-Visual Large Language Models (AVLLMs) are emerging as unified interfaces to multimodal perception. We pres

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

AutoVerifier: An Agentic Automated Verification Framework Using Large Language Models

arXiv:2604.02617v1 Announce Type: new Abstract: Scientific and Technical Intelligence (S&TI) analysis requires verifying complex technical claims across rapidly

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

OntoKG: Ontology-Oriented Knowledge Graph Construction with Intrinsic-Relational Routing

arXiv:2604.02618v1 Announce Type: new Abstract: Organizing a large-scale knowledge graph into a typed property graph requires structural decisions -- which enti

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Let's Have a Conversation: Designing and Evaluating LLM Agents for Interactive Optimization

arXiv:2604.02666v1 Announce Type: new Abstract: Optimization is as much about modeling the right problem as solving it. Identifying the right objectives, constr

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3w ago

GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning

arXiv:2604.02721v1 Announce Type: new Abstract: Competitive programming remains one of the last few human strongholds in coding against AI. The best AI system t

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

DeltaLogic: Minimal Premise Edits Reveal Belief-Revision Failures in Logical Reasoning Models

arXiv:2604.02733v1 Announce Type: new Abstract: Reasoning benchmarks typically evaluate whether a model derives the correct answer from a fixed premise set, but

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Aligning Progress and Feasibility: A Neuro-Symbolic Dual Memory Framework for Long-Horizon LLM Agents

arXiv:2604.02734v1 Announce Type: new Abstract: Large language models (LLMs) have demonstrated strong potential in long-horizon decision-making tasks, such as e

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3w ago

Improving Role Consistency in Multi-Agent Collaboration via Quantitative Role Clarity

arXiv:2604.02770v1 Announce Type: new Abstract: In large language model (LLM)-driven multi-agent systems, disobey role specification (failure to adhere to the d