📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 8,253 articles · Updated every 3 hours

arXiv:2603.26592v1 Announce Type: cross Abstract: Reliable machine-learning models in biomedical settings depend on accurate labels, yet annotating biomedical t

ArXiv cs.AI 📄 Paper 1mo ago

Sustainability Is Not Linear: Quantifying Performance, Energy, and Privacy Trade-offs in On-Device Intelligence

arXiv:2603.26603v1 Announce Type: cross Abstract: The migration of Large Language Models (LLMs) from cloud clusters to edge devices promises enhanced privacy an

ArXiv cs.AI 📄 Paper 1mo ago

Think over Trajectories: Leveraging Video Generation to Reconstruct GPS Trajectories from Cellular Signaling

arXiv:2603.26610v1 Announce Type: cross Abstract: Mobile devices continuously interact with cellular base stations, generating massive volumes of signaling reco

ArXiv cs.AI 📄 Paper 1mo ago

Machine Learning Transferability for Malware Detection

arXiv:2603.26632v1 Announce Type: cross Abstract: Malware continues to be a predominant operational risk for organizations, especially when obfuscation techniqu

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Make Geometry Matter for Spatial Reasoning

arXiv:2603.26639v1 Announce Type: cross Abstract: Empowered by large-scale training, vision-language models (VLMs) achieve strong image and video understanding,

ArXiv cs.AI 📄 Paper 1mo ago

Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification

arXiv:2603.26648v1 Announce Type: cross Abstract: Recent advances in large language models have improved the capabilities of coding agents, yet systematic evalu

ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 1mo ago

PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoning

arXiv:2603.26653v1 Announce Type: cross Abstract: We introduce PerceptionComp, a manually annotated benchmark for complex, long-horizon, perception-centric vide

ArXiv cs.AI 📄 Paper 1mo ago

Ruka-v2: Tendon Driven Open-Source Dexterous Hand with Wrist and Abduction for Robot Learning

arXiv:2603.26660v1 Announce Type: cross Abstract: Lack of accessible and dexterous robot hardware has been a significant bottleneck to achieving human-level dex

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Scale-Adaptive Balancing of Exploration and Exploitation in Classical Planning

arXiv:2305.09840v4 Announce Type: replace Abstract: Balancing exploration and exploitation has been an important problem in both game tree search and automated

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1mo ago

Extreme Value Monte Carlo Tree Search for Classical Planning

arXiv:2405.18248v3 Announce Type: replace Abstract: Despite being successful in board games and reinforcement learning (RL), Monte Carlo Tree Search (MCTS) comb

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

ReMe: Scaffolding Personalized Cognitive Training via Controllable LLM-Mediated Conversations

arXiv:2410.19733v2 Announce Type: replace Abstract: Global aging calls for scalable and engaging cognitive interventions. Computerized cognitive training (CCT)

ArXiv cs.AI 📄 Paper 1mo ago

Efficient Energy-Optimal Path Planning for Electric Vehicles Considering Vehicle Dynamics

arXiv:2411.12964v2 Announce Type: replace Abstract: The rapid adoption of electric vehicles (EVs) in modern transport systems has made energy-aware routing a cr

ArXiv cs.AI 📄 Paper 1mo ago

Deontic Temporal Logic for Formal Verification of AI Ethics

arXiv:2501.05765v4 Announce Type: replace Abstract: Ensuring ethical behavior in Artificial Intelligence (AI) systems amidst their increasing ubiquity and influ

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

ProbGuard: Probabilistic Runtime Monitoring for LLM Agent Safety

arXiv:2508.00500v3 Announce Type: replace Abstract: Large Language Model (LLM) agents increasingly operate across domains such as robotics, virtual assistants,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Humanline: Online Alignment as Perceptual Loss

arXiv:2509.24207v2 Announce Type: replace Abstract: Online alignment (e.g., GRPO) is generally more performant than offline alignment (e.g., DPO) -- but why? Dr

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Selection, Reflection and Self-Refinement: Revisit Reasoning Tasks via a Causal Lens

arXiv:2510.08222v2 Announce Type: replace Abstract: Due to their inherent complexity, reasoning tasks have long been regarded as rigorous benchmarks for assessi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Shared Spatial Memory Through Predictive Coding

arXiv:2511.04235v4 Announce Type: replace Abstract: Constructing a consistent shared spatial memory is a critical challenge in multi-agent systems, where partia

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

HeaRT: A Hierarchical Circuit Reasoning Tree-Based Agentic Framework for AMS Design Optimization

arXiv:2511.19669v2 Announce Type: replace Abstract: Conventional AI-driven AMS design automation algorithms remain constrained by their reliance on high-quality

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Before We Trust Them: Decision-Making Failures in Navigation of Foundation Models

arXiv:2601.05529v4 Announce Type: replace Abstract: High success rates on navigation-related tasks do not necessarily translate into reliable decision making by

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

AtomMem: Learnable Dynamic Agentic Memory with Atomic Memory Operation

arXiv:2601.08323v3 Announce Type: replace Abstract: Equipping agents with memory is essential for solving real-world long-horizon problems. However, most existi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

See, Symbolize, Act: Grounding VLMs with Spatial Representations for Better Gameplay

arXiv:2603.11601v2 Announce Type: replace Abstract: Vision-Language Models (VLMs) excel at describing visual scenes, yet struggle to translate perception into p

ArXiv cs.AI 📄 Paper 1mo ago

AIDABench: AI Data Analytics Benchmark

arXiv:2603.15636v2 Announce Type: replace Abstract: As AI-driven document understanding and processing tools become increasingly prevalent in real-world applica

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1mo ago

Draft-and-Prune: Improving the Reliability of Auto-formalization for Logical Reasoning

arXiv:2603.17233v2 Announce Type: replace Abstract: Auto-formalization (AF) translates natural-language reasoning problems into solver-executable programs, enab

ArXiv cs.AI 📄 Paper 1mo ago

Large-Scale Analysis of Persuasive Content on Moltbook

arXiv:2603.18349v2 Announce Type: replace Abstract: We present an NLP-based study of political propaganda on Moltbook, a Reddit-style platform for AI agents. To