📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 4,506 articles · Updated every 3 hours

ArXiv cs.AI 📄 Paper 2d ago
Collaborative Multi-Agent Scripts Generation for Enhancing Imperfect-Information Reasoning in Murder Mystery Games
arXiv:2604.11741v1 Announce Type: new Abstract: Vision-language models (VLMs) have shown impressive capabilities in perceptual tasks, yet they degrade in comple
ArXiv cs.AI 📄 Paper 2d ago
Retrieval Is Not Enough: Why Organizational AI Needs Epistemic Infrastructure
arXiv:2604.11759v1 Announce Type: new Abstract: Organizational knowledge used by AI agents typically lacks epistemic structure: retrieval systems surface semant
ArXiv cs.AI 📄 Paper 2d ago
GenTac: Generative Modeling and Forecasting of Soccer Tactics
arXiv:2604.11786v1 Announce Type: new Abstract: Modeling open-play soccer tactics is a formidable challenge due to the stochastic, multi-agent nature of the gam
ArXiv cs.AI 📄 Paper 2d ago
Detecting Safety Violations Across Many Agent Traces
arXiv:2604.11806v1 Announce Type: new Abstract: To identify safety violations, auditors often search over large sets of agent traces. This search is difficult b
ArXiv cs.AI 📄 Paper 2d ago
The Paradox of Professional Input: How Expert Collaboration with AI Systems Shapes Their Future Value
arXiv:2504.12654v1 Announce Type: cross Abstract: This perspective paper examines a fundamental paradox in the relationship between professional expertise and a
ArXiv cs.AI 📄 Paper 2d ago
Retrieval-Augmented Large Language Models for Evidence-Informed Guidance on Cannabidiol Use in Older Adults
arXiv:2604.09548v1 Announce Type: cross Abstract: Older adults commonly experience chronic conditions such as pain and sleep disturbances and may consider canna
ArXiv cs.AI 📄 Paper 2d ago
Beyond Offline A/B Testing: Context-Aware Agent Simulation for Recommender System Evaluation
arXiv:2604.09549v1 Announce Type: cross Abstract: Recommender systems are central to online services, enabling users to navigate through massive amounts of cont
ArXiv cs.AI 📄 Paper 2d ago
SemaCDR: LLM-Powered Transferable Semantics for Cross-Domain Sequential Recommendation
arXiv:2604.09551v1 Announce Type: cross Abstract: Cross-domain recommendation (CDR) addresses the data sparsity and cold-start problems in the target domain by
ArXiv cs.AI 📄 Paper 2d ago
MCERF: Advancing Multimodal LLM Evaluation of Engineering Documentation with Enhanced Retrieval
arXiv:2604.09552v1 Announce Type: cross Abstract: Engineering rulebooks and technical standards contain multimodal information like dense text, tables, and illu
ArXiv cs.AI 📄 Paper 2d ago
SRBench: A Comprehensive Benchmark for Sequential Recommendation with Large Language Models
arXiv:2604.09553v1 Announce Type: cross Abstract: LLM development has aroused great interest in Sequential Recommendation (SR) applications. However, comprehens
ArXiv cs.AI 📄 Paper 2d ago
Para-B&B: Load-Balanced Deterministic Parallelization of Solving MIP
arXiv:2604.09556v1 Announce Type: cross Abstract: Mixed-integer programming (MIP) extends linear programming by incorporating both continuous and integer decisi
ArXiv cs.AI 📄 Paper 2d ago
SPEED-Bench: A Unified and Diverse Benchmark for Speculative Decoding
arXiv:2604.09557v1 Announce Type: cross Abstract: Speculative Decoding (SD) has emerged as a critical technique for accelerating Large Language Model (LLM) infe
ArXiv cs.AI 📄 Paper 2d ago
Emergent Social Structures in Autonomous AI Agent Networks: A Metadata Analysis of 626 Agents on the Pilot Protocol
arXiv:2604.09561v1 Announce Type: cross Abstract: We present the first empirical analysis of social structure formation among autonomous AI agents on a live net
ArXiv cs.AI 📄 Paper 2d ago
StreamServe: Adaptive Speculative Flows for Low-Latency Disaggregated LLM Serving
arXiv:2604.09562v1 Announce Type: cross Abstract: Efficient LLM serving must balance throughput and latency across diverse, bursty workloads. We introduce Strea
ArXiv cs.AI 📄 Paper 2d ago
ACE-Bench: A Lightweight Benchmark for Evaluating Azure SDK Usage Correctness
arXiv:2604.09564v1 Announce Type: cross Abstract: We present ACE-Bench (Azure SDK Coding Evaluation Benchmark), an execution-free benchmark that provides fast,
ArXiv cs.AI 📄 Paper 2d ago
AEG: A Baremetal Framework for AI Acceleration via Direct Hardware Access in Heterogeneous Accelerators
arXiv:2604.09565v1 Announce Type: cross Abstract: This paper introduces a unified, hardware-independent baremetal runtime architecture designed to enable high-p
ArXiv cs.AI 📄 Paper 2d ago
LETGAMES: An LLM-Powered Gamified Approach to Cognitive Training for Patients with Cognitive Impairment
arXiv:2604.09566v1 Announce Type: cross Abstract: The application of games as a therapeutic tool for cognitive training is beneficial for patients with cognitiv
ArXiv cs.AI 📄 Paper 2d ago
Neuro-Symbolic Strong-AI Robots with Closed Knowledge Assumption: Learning and Deductions
arXiv:2604.09567v1 Announce Type: cross Abstract: Knowledge representation formalisms are aimed to represent general conceptual information and are typically us
ArXiv cs.AI 📄 Paper 2d ago
Tuning Qwen2.5-VL to Improve Its Web Interaction Skills
arXiv:2604.09571v1 Announce Type: cross Abstract: Recent advances in vision-language models (VLMs) have sparked growing interest in using them to automate web t
ArXiv cs.AI 📄 Paper 2d ago
ACE-TA: An Agentic Teaching Assistant for Grounded Q&A, Quiz Generation, and Code Tutoring
arXiv:2604.09572v1 Announce Type: cross Abstract: We introduce ACE-TA, the Agentic Coding and Explanations Teaching Assistant framework, that autonomously route