📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 8,253 articles · Updated every 3 hours

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
Make Geometry Matter for Spatial Reasoning
arXiv:2603.26639v1 Announce Type: cross Abstract: Empowered by large-scale training, vision-language models (VLMs) achieve strong image and video understanding,
ArXiv cs.AI 📄 Paper 1mo ago
Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification
arXiv:2603.26648v1 Announce Type: cross Abstract: Recent advances in large language models have improved the capabilities of coding agents, yet systematic evalu
ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 1mo ago
PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoning
arXiv:2603.26653v1 Announce Type: cross Abstract: We introduce PerceptionComp, a manually annotated benchmark for complex, long-horizon, perception-centric vide
ArXiv cs.AI 📄 Paper 1mo ago
Ruka-v2: Tendon Driven Open-Source Dexterous Hand with Wrist and Abduction for Robot Learning
arXiv:2603.26660v1 Announce Type: cross Abstract: Lack of accessible and dexterous robot hardware has been a significant bottleneck to achieving human-level dex
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
Scale-Adaptive Balancing of Exploration and Exploitation in Classical Planning
arXiv:2305.09840v4 Announce Type: replace Abstract: Balancing exploration and exploitation has been an important problem in both game tree search and automated
ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1mo ago
Extreme Value Monte Carlo Tree Search for Classical Planning
arXiv:2405.18248v3 Announce Type: replace Abstract: Despite being successful in board games and reinforcement learning (RL), Monte Carlo Tree Search (MCTS) comb
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
ReMe: Scaffolding Personalized Cognitive Training via Controllable LLM-Mediated Conversations
arXiv:2410.19733v2 Announce Type: replace Abstract: Global aging calls for scalable and engaging cognitive interventions. Computerized cognitive training (CCT)
ArXiv cs.AI 📄 Paper 1mo ago
Efficient Energy-Optimal Path Planning for Electric Vehicles Considering Vehicle Dynamics
arXiv:2411.12964v2 Announce Type: replace Abstract: The rapid adoption of electric vehicles (EVs) in modern transport systems has made energy-aware routing a cr
ArXiv cs.AI 📄 Paper 1mo ago
Deontic Temporal Logic for Formal Verification of AI Ethics
arXiv:2501.05765v4 Announce Type: replace Abstract: Ensuring ethical behavior in Artificial Intelligence (AI) systems amidst their increasing ubiquity and influ
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
ProbGuard: Probabilistic Runtime Monitoring for LLM Agent Safety
arXiv:2508.00500v3 Announce Type: replace Abstract: Large Language Model (LLM) agents increasingly operate across domains such as robotics, virtual assistants,
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
Humanline: Online Alignment as Perceptual Loss
arXiv:2509.24207v2 Announce Type: replace Abstract: Online alignment (e.g., GRPO) is generally more performant than offline alignment (e.g., DPO) -- but why? Dr
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
Selection, Reflection and Self-Refinement: Revisit Reasoning Tasks via a Causal Lens
arXiv:2510.08222v2 Announce Type: replace Abstract: Due to their inherent complexity, reasoning tasks have long been regarded as rigorous benchmarks for assessi
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
Shared Spatial Memory Through Predictive Coding
arXiv:2511.04235v4 Announce Type: replace Abstract: Constructing a consistent shared spatial memory is a critical challenge in multi-agent systems, where partia
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
HeaRT: A Hierarchical Circuit Reasoning Tree-Based Agentic Framework for AMS Design Optimization
arXiv:2511.19669v2 Announce Type: replace Abstract: Conventional AI-driven AMS design automation algorithms remain constrained by their reliance on high-quality
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
Before We Trust Them: Decision-Making Failures in Navigation of Foundation Models
arXiv:2601.05529v4 Announce Type: replace Abstract: High success rates on navigation-related tasks do not necessarily translate into reliable decision making by
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
AtomMem: Learnable Dynamic Agentic Memory with Atomic Memory Operation
arXiv:2601.08323v3 Announce Type: replace Abstract: Equipping agents with memory is essential for solving real-world long-horizon problems. However, most existi
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago
See, Symbolize, Act: Grounding VLMs with Spatial Representations for Better Gameplay
arXiv:2603.11601v2 Announce Type: replace Abstract: Vision-Language Models (VLMs) excel at describing visual scenes, yet struggle to translate perception into p
ArXiv cs.AI 📄 Paper 1mo ago
AIDABench: AI Data Analytics Benchmark
arXiv:2603.15636v2 Announce Type: replace Abstract: As AI-driven document understanding and processing tools become increasingly prevalent in real-world applica
ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1mo ago
Draft-and-Prune: Improving the Reliability of Auto-formalization for Logical Reasoning
arXiv:2603.17233v2 Announce Type: replace Abstract: Auto-formalization (AF) translates natural-language reasoning problems into solver-executable programs, enab
ArXiv cs.AI 📄 Paper 1mo ago
Large-Scale Analysis of Persuasive Content on Moltbook
arXiv:2603.18349v2 Announce Type: replace Abstract: We present an NLP-based study of political propaganda on Moltbook, a Reddit-style platform for AI agents. To