📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 8,253 articles · Updated every 3 hours

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
TimeSeek: Temporal Reliability of Agentic Forecasters
arXiv:2604.04220v1 Announce Type: new Abstract: We introduce TimeSeek, a benchmark for studying how the reliability of agentic LLM forecasters changes over a pr
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Pedagogical Safety in Educational Reinforcement Learning: Formalizing and Detecting Reward Hacking in AI Tutoring Systems
arXiv:2604.04237v1 Announce Type: new Abstract: Reinforcement learning (RL) is increasingly used to personalize instruction in intelligent tutoring systems, yet
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Combee: Scaling Prompt Learning for Self-Improving Language Model Agents
arXiv:2604.04247v1 Announce Type: new Abstract: Recent advances in prompt learning allow large language model agents to acquire task-relevant knowledge from inf
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3w ago
MC-CPO: Mastery-Conditioned Constrained Policy Optimization
arXiv:2604.04251v1 Announce Type: new Abstract: Engagement-optimized adaptive tutoring systems may prioritize short-term behavioral signals over sustained learn
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Context Engineering: A Practitioner Methodology for Structured Human-AI Collaboration
arXiv:2604.04258v1 Announce Type: new Abstract: The quality of AI-generated output is often attributed to prompting technique, but extensive empirical observati
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3w ago
Beyond Fluency: Toward Reliable Trajectories in Agentic IR
arXiv:2604.04269v1 Announce Type: new Abstract: Information Retrieval is shifting from passive document ranking toward autonomous agentic workflows that operate
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
InferenceEvolve: Towards Automated Causal Effect Estimators through Self-Evolving AI
arXiv:2604.04274v1 Announce Type: new Abstract: Causal inference is central to scientific discovery, yet choosing appropriate methods remains challenging becaus
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Preservation Is Not Enough for Width Growth: Regime-Sensitive Selection of Dense LM Warm Starts
arXiv:2604.04281v1 Announce Type: new Abstract: Width expansion offers a practical route to reuse smaller causal-language-model checkpoints, but selecting a wid
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
PanLUNA: An Efficient and Robust Query-Unified Multimodal Model for Edge Biosignal Intelligence
arXiv:2604.04297v1 Announce Type: new Abstract: Physiological foundation models (FMs) have shown promise for biosignal representation learning, yet most remain
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
RESCORE: LLM-Driven Simulation Recovery in Control Systems Research Papers
arXiv:2604.04324v1 Announce Type: new Abstract: Reconstructing numerical simulations from control systems research papers is often hindered by underspecified pa
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3w ago
Soft Tournament Equilibrium
arXiv:2604.04328v1 Announce Type: new Abstract: The evaluation of general-purpose artificial agents, particularly those based on large language models, presents
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Thermodynamic-Inspired Explainable GeoAI: Uncovering Regime-Dependent Mechanisms in Heterogeneous Spatial Systems
arXiv:2604.04339v1 Announce Type: new Abstract: Modeling spatial heterogeneity and associated critical transitions remains a fundamental challenge in geography
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Implementing surrogate goals for safer bargaining in LLM-based agents
arXiv:2604.04341v1 Announce Type: new Abstract: Surrogate goals have been proposed as a strategy for reducing risks from bargaining failures. A surrogate goal i
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Domain-Contextualized Inference: A Computable Graph Architecture for Explicit-Domain Reasoning
arXiv:2604.04344v1 Announce Type: new Abstract: We establish a computation-substrate-agnostic inference architecture in which domain is an explicit first-class
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3w ago
RoboPhD: Evolving Diverse Complex Agents Under Tight Evaluation Budgets
arXiv:2604.04347v1 Announce Type: new Abstract: 2026 has brought an explosion of interest in LLM-guided evolution of agentic artifacts, with systems like GEPA a
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
REAM: Merging Improves Pruning of Experts in LLMs
arXiv:2604.04356v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) large language models (LLMs) are among the top-performing architectures. The largest mo
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Decocted Experience Improves Test-Time Inference in LLM Agents
arXiv:2604.04373v1 Announce Type: new Abstract: There is growing interest in improving LLMs without updating model parameters. One well-established direction is
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Optimizing Service Operations via LLM-Powered Multi-Agent Simulation
arXiv:2604.04383v1 Announce Type: new Abstract: Service system performance depends on how participants respond to design choices, but modeling these responses i
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Automatically Generating Hard Math Problems from Hypothesis-Driven Error Analysis
arXiv:2604.04386v1 Announce Type: new Abstract: Numerous math benchmarks exist to evaluate LLMs' mathematical capabilities. However, most involve extensive manu
ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3w ago
Gradual Cognitive Externalization: A Framework for Understanding How Ambient Intelligence Externalizes Human Cognition
arXiv:2604.04387v1 Announce Type: new Abstract: Developers are publishing AI agent skills that replicate a colleague's communication style, encode a supervisor'