📰 ArXiv cs.AI

20,523 articles · Updated every 3 hours

arXiv:2606.13682v1 Announce Type: new Abstract: The open shop scheduling problem (OSSP) arises in many industrial and service settings but remains computational

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5h ago

UP-NRPA: User Portrait based Nested Rollout Policy Adaptation for Planning with Large Language Models in Goal-oriented Dialogue Systems

arXiv:2606.13683v1 Announce Type: new Abstract: To address the challenge that current dialogue policy planning methods struggle to dynamically adapt to diverse

ArXiv cs.AI 📄 Paper 5h ago

History of the Muddy Children Puzzle

arXiv:2606.13703v1 Announce Type: new Abstract: The Muddy Children Puzzle is a puzzle about knowledge and ignorance that has been inspiring for the development

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 5h ago

Orchestra-o1: Omnimodal Agent Orchestration

arXiv:2606.13707v1 Announce Type: new Abstract: The recent success of agent swarms has shifted the paradigm of large language model (LLM)-based agents from sing

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 5h ago

Hybrid Open-Ended Tri-Evolution Makes Better Deep Researcher

arXiv:2606.13710v1 Announce Type: new Abstract: Deep research and agent evolution serve as de-facto tasks for AI agents in real-world applications toward artifi

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 5h ago

WorkBench Revisited: Workplace Agents Two Years On

arXiv:2606.13715v1 Announce Type: new Abstract: The best agent on WorkBench in March 2024, GPT-4, completed 43% of tasks and took an unintended harmful action,

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 5h ago

Refusal Beyond a Single Direction: A Preliminary Comparison of Diff-in-Means and INLP

arXiv:2606.13720v1 Announce Type: new Abstract: Arditi et al. (2024) have shown that refusal in safety fine-tuned chat models is mediated by a single linear dire

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 5h ago

YeasierAgent: Agentic Social Sandbox as a Canvas for Intent-Driven Creation of Platform-Agnostic Symbiotic Agent-Native Applications

arXiv:2606.13722v1 Announce Type: new Abstract: This paper introduces YeasierAgent, an application-building paradigm based on symbiotic agents, narrative worlds

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5h ago

TwinBI: An Agentic Digital Twin for Efficient Augmented Interactions with Business Intelligence Dashboards

arXiv:2606.13731v1 Announce Type: new Abstract: Business intelligence (BI) increasingly combines dashboard interaction with LLM-based assistance, but these two

ArXiv cs.AI 📐 ML Fundamentals 📄 Paper ⚡ AI Lesson 5h ago

When Sample Selection Bias Precipitates Model Collapse

arXiv:2606.13732v1 Announce Type: new Abstract: The proliferation of recursive training on synthetic data can alleviate data scarcity but risks model collapse,

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 5h ago

AI Receptivity or AI Adoption Breadth? A Tool-Specific Reanalysis of the Lower-Literacy/Higher-Usage Link

arXiv:2606.13734v1 Announce Type: new Abstract: Recent evidence reported by Tully, Longoni, and Appel (2025) suggests that lower artificial intelligence (AI) li

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5h ago

MA-ProofBench: A Two-Tiered Evaluation of LLMs for Theorem Proving in Mathematical Analysis

arXiv:2606.13782v1 Announce Type: new Abstract: Large Language Models (LLMs) have made notable progress in automated theorem proving, yet existing formal benchm

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5h ago

Poker Arena: Multi-Axis Profiling of Strategic Reasoning and Memory in LLMs

arXiv:2606.13815v1 Announce Type: new Abstract: Strategic reasoning under uncertainty underpins consequential decisions in negotiation, finance, and policy, but

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5h ago

Hyperdimensional computing for structured querying on tabular data embeddings

arXiv:2606.13871v1 Announce Type: new Abstract: Tabular data embeddings have become a cornerstone of data profiling and data integration pipelines, enabling tas

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5h ago

Capability Minimization as a Safety Primitive: Risk-Aware Causal Gating for Least-Privilege LLM Agents

arXiv:2606.13884v1 Announce Type: new Abstract: Modern decision systems increasingly rely on learned components whose outputs may be confident yet wrong, exposi

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 5h ago

A Multi-Agent AI System for Automated High School Transcript Processing: Collaborative Document Analysis at Scale

arXiv:2606.13916v1 Announce Type: new Abstract: Each year, college admissions offices face an overwhelming challenge: processing millions of high school transcr

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5h ago

Sorries Are Not the Hard Part: An Expert-Review Case Study of a Semi-Autonomous Formalization

arXiv:2606.13925v1 Announce Type: new Abstract: Large language models can often close proof gaps in interactive theorem provers, but a verified theorem is not t

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5h ago

Adversarial Concept Search: Predicting Compositional Errors From Feature Geometry

arXiv:2606.13934v1 Announce Type: new Abstract: Humans cannot always intuit what scenarios are most challenging to LLMs. Hoping to capture challenging edge case

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5h ago

Minim: Privacy-Aware Minimal View for Agents via Trusted Local Sanitization

arXiv:2606.13949v1 Announce Type: new Abstract: Modern LLM-powered autonomous agents increasingly rely on rich user interface (UI) state observations to achieve

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 5h ago

Formalizing Numerical Analysis: An Agent Pipeline and Quality Audit Beyond Kernel Acceptance

arXiv:2606.14000v1 Announce Type: new Abstract: Recent work has demonstrated that coding agents can formalize entire advanced mathematics textbooks in Lean 4, y

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5h ago

Applicability Condition Extraction for Therapeutic Drug-Disease Relations

arXiv:2606.14031v1 Announce Type: new Abstract: Identifying the conditions under which a certain drug has a therapeutic effect on a target disease is crucial for clinical

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5h ago

FactoryLLM: A Safe and Open-Source AI Playground for Evaluating LLMs in Smart Factories

arXiv:2606.14119v1 Announce Type: new Abstract: Fault diagnostics and recovery in smart factories is challenging because critical information is dispersed acros

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 5h ago

VeriGeo: Controllable Geometry Question Generation with Numerical and Analytical Verification

arXiv:2606.14176v1 Announce Type: new Abstract: Geometry problem generation is useful for AI-assisted education and multimodal mathematical reasoning, but relia

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 5h ago

When Should Agent Trust Be Conditional? Characterizing and Attacking Skill-Conditional Reputation in Agent Swarms

arXiv:2606.14200v1 Announce Type: new Abstract: Open platforms increasingly route tasks among heterogeneous LLM agents--differing in base model, scaffold, and t