📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 8,253 articles · Updated every 3 hours · View all reads

All ⚡ AI Lessons (21843) ArXiv cs.AI Dev.to AI Medium · AI Medium · Programming Forbes Innovation Medium · Machine Learning

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Teaching an Agent to Sketch One Part at a Time

arXiv:2603.19500v1 Announce Type: new Abstract: We develop a method for producing vector sketches one part at a time. To do this, we train a multi-modal languag

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Learning to Disprove: Formal Counterexample Generation with Large Language Models

arXiv:2603.19514v1 Announce Type: new Abstract: Mathematical reasoning demands two critical, complementary skills: constructing rigorous proofs for true stateme

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

ItinBench: Benchmarking Planning Across Multiple Cognitive Dimensions with Large Language Models

arXiv:2603.19515v1 Announce Type: new Abstract: Large language models (LLMs) with advanced cognitive capabilities are emerging as agents for various reasoning a

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

PA2D-MORL: Pareto Ascent Directional Decomposition based Multi-Objective Reinforcement Learning

arXiv:2603.19579v1 Announce Type: new Abstract: Multi-objective reinforcement learning (MORL) provides an effective solution for decision-making problems involv

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

PowerLens: Taming LLM Agents for Safe and Personalized Mobile Power Management

arXiv:2603.19584v1 Announce Type: new Abstract: Battery life remains a critical challenge for mobile devices, yet existing power management mechanisms rely on s

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

HyEvo: Self-Evolving Hybrid Agentic Workflows for Efficient Reasoning

arXiv:2603.19639v1 Announce Type: new Abstract: Although agentic workflows have demonstrated strong potential for solving complex tasks, existing automated gene

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

A Subgoal-driven Framework for Improving Long-Horizon LLM Agents

arXiv:2603.19685v1 Announce Type: new Abstract: Large language model (LLM)-based agents have emerged as powerful autonomous controllers for digital environments

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Stepwise: Neuro-Symbolic Proof Search for Automated Systems Verification

arXiv:2603.19715v1 Announce Type: new Abstract: Formal verification via interactive theorem proving is increasingly used to ensure the correctness of critical s

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Embodied Science: Closing the Discovery Loop with Agentic Embodied AI

arXiv:2603.19782v1 Announce Type: new Abstract: Artificial intelligence has demonstrated remarkable capability in predicting scientific properties, yet scientif

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

FormalEvolve: Neuro-Symbolic Evolutionary Search for Diverse and Prover-Effective Autoformalization

arXiv:2603.19828v1 Announce Type: new Abstract: Autoformalization aims to translate natural-language mathematics into compilable, machine-checkable statements.

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Utility-Guided Agent Orchestration for Efficient LLM Tool Use

arXiv:2603.19896v1 Announce Type: new Abstract: Tool-using large language model (LLM) agents often face a fundamental tension between answer quality and executi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

On the Ability of Transformers to Verify Plans

arXiv:2603.19954v1 Announce Type: new Abstract: Transformers have shown inconsistent success in AI planning tasks, and theoretical understanding of when general

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Experience is the Best Teacher: Motivating Effective Exploration in Reinforcement Learning for LLMs

arXiv:2603.20046v1 Announce Type: new Abstract: Reinforcement Learning (RL) with rubric-based rewards has recently shown remarkable progress in enhancing genera

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

DIAL-KG: Schema-Free Incremental Knowledge Graph Construction via Dynamic Schema Induction and Evolution-Intent Assessment

arXiv:2603.20059v1 Announce Type: new Abstract: Knowledge Graphs (KGs) are foundational to applications such as search, question answering, and recommendation.

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Pitfalls in Evaluating Interpretability Agents

arXiv:2603.20101v1 Announce Type: new Abstract: Automated interpretability systems aim to reduce the need for human labor and scale analysis to increasingly lar

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Learning Dynamic Belief Graphs for Theory-of-mind Reasoning

arXiv:2603.20170v1 Announce Type: new Abstract: Theory of Mind (ToM) reasoning with Large Language Models (LLMs) requires inferring how people's implicit, evolv

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Grounded Multimodal Retrieval-Augmented Drafting of Radiology Impressions Using Case-Based Similarity Search

arXiv:2603.17765v1 Announce Type: cross Abstract: Automated radiology report generation has gained increasing attention with the rise of deep learning and large

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

L-PRISMA: An Extension of PRISMA in the Era of Generative Artificial Intelligence (GenAI)

arXiv:2603.19236v1 Announce Type: cross Abstract: The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework provides a rigorous

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

When Prompt Optimization Becomes Jailbreaking: Adaptive Red-Teaming of Large Language Models

arXiv:2603.19247v1 Announce Type: cross Abstract: Large Language Models (LLMs) are increasingly integrated into high-stakes applications, making robust safety g

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

DuCCAE: A Hybrid Engine for Immersive Conversation via Collaboration, Augmentation, and Evolution

arXiv:2603.19248v1 Announce Type: cross Abstract: Immersive conversational systems in production face a persistent trade-off between responsiveness and long-hor

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

GeoChallenge: A Multi-Answer Multiple-Choice Benchmark for Geometric Reasoning with Diagrams

arXiv:2603.19252v1 Announce Type: cross Abstract: Evaluating the symbolic reasoning of large language models (LLMs) calls for geometry benchmarks that require m

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

A comprehensive study of LLM-based argument classification: from Llama through DeepSeek to GPT-5.2

arXiv:2603.19253v1 Announce Type: cross Abstract: Argument mining (AM) is an interdisciplinary research field focused on the automatic identification and classi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

LARFT: Closing the Cognition-Action Gap for Length Instruction Following in Large Language Models

arXiv:2603.19255v1 Announce Type: cross Abstract: Despite the strong performance of Large Language Models (LLMs) on complex instruction-following tasks, precise

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

MAPLE: Metadata Augmented Private Language Evolution

arXiv:2603.19258v1 Announce Type: cross Abstract: While differentially private (DP) fine-tuning of large language models (LLMs) is a powerful tool, it is often