📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 6,347 articles · Updated every 3 hours

ArXiv cs.AI 📄 Paper 1w ago
Mantis: A Foundation Model for Mechanistic Disease Forecasting
arXiv:2508.12260v5 Announce Type: replace Abstract: Infectious disease forecasting in novel outbreaks or low-resource settings is hampered by the need for large
ArXiv cs.AI 📄 Paper 1w ago
Thinking Sparks!: Emergent Attention Heads in Reasoning Models During Post Training
arXiv:2509.25758v2 Announce Type: replace Abstract: The remarkable capabilities of modern large reasoning models are largely unlocked through post-training tech
ArXiv cs.AI 📄 Paper 1w ago
ASGuard: Activation-Scaling Guard to Mitigate Targeted Jailbreaking Attack
arXiv:2509.25843v2 Announce Type: replace Abstract: Large language models (LLMs), despite being safety-aligned, exhibit brittle refusal behaviors that can be ci
ArXiv cs.AI 📄 Paper 1w ago
The Stackelberg Speaker: Optimizing Persuasive Communication in Social Deduction Games
arXiv:2510.09087v2 Announce Type: replace Abstract: Large language model (LLM) agents have shown remarkable progress in social deduction games (SDGs). However,
ArXiv cs.AI 📄 Paper 1w ago
Mixed-Density Diffuser: Efficient Planning with Non-Uniform Temporal Resolution
arXiv:2510.23026v5 Announce Type: replace Abstract: Recent studies demonstrate that diffusion planners benefit from sparse-step planning over single-step planni
ArXiv cs.AI 📄 Paper 1w ago
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence
arXiv:2510.23538v2 Announce Type: replace Abstract: The scope of neural code intelligence is rapidly expanding beyond text-based source code to encompass the ri
ArXiv cs.AI 📄 Paper 1w ago
Does RLVR Extend Reasoning Boundaries? Investigating Capability Expansion in Vision-Language Models
arXiv:2511.00710v4 Announce Type: replace Abstract: Recent studies posit that Reinforcement Learning with Verifiable Rewards (RLVR) primarily amplifies behavior
ArXiv cs.AI 📄 Paper 1w ago
DecompSR: A dataset for decomposed analyses of compositional multihop spatial reasoning
arXiv:2511.02627v2 Announce Type: replace Abstract: We introduce DecompSR, decomposed spatial reasoning, a large benchmark dataset (over 5m datapoints) and gene
ArXiv cs.AI 📄 Paper 1w ago
Dataset Safety in Autonomous Driving: Requirements, Risks, and Assurance
arXiv:2511.08439v2 Announce Type: replace Abstract: Dataset integrity is fundamental to the safety and reliability of AI systems, especially in autonomous drivi
ArXiv cs.AI 📄 Paper 1w ago
Learning the Value of Value Learning
arXiv:2511.17714v5 Announce Type: replace Abstract: Standard decision frameworks address uncertainty about facts but assume fixed options and values. We extend
ArXiv cs.AI 📄 Paper 1w ago
A Benchmark for Evaluating Outcome-Driven Constraint Violations in Autonomous AI Agents
arXiv:2512.20798v4 Announce Type: replace Abstract: As autonomous AI agents are deployed in high-stakes environments, ensuring their safety has become a paramou
ArXiv cs.AI 📄 Paper 1w ago
No More Stale Feedback: Co-Evolving Critics for Open-World Agent Learning
arXiv:2601.06794v2 Announce Type: replace Abstract: Critique-guided reinforcement learning (RL) has emerged as a powerful paradigm for training LLM agents by au
ArXiv cs.AI 📄 Paper 1w ago
PrivacyReasoner: Can LLM Emulate a Human-like Privacy Mind?
arXiv:2601.09152v2 Announce Type: replace Abstract: Prior work on LLM-based privacy focuses on norm judgment over synthetic vignettes, rather than how people th
ArXiv cs.AI 📄 Paper 1w ago
LatentRefusal: Latent-Signal Refusal for Unanswerable Text-to-SQL Queries
arXiv:2601.10398v3 Announce Type: replace Abstract: In LLM-based text-to-SQL systems, unanswerable and underspecified user queries may generate not only incorre
ArXiv cs.AI 📄 Paper 1w ago
WebFactory: Automated Compression of Foundational Language Intelligence into Grounded Web Agents
arXiv:2603.05044v2 Announce Type: replace Abstract: Current paradigms for training GUI agents are fundamentally limited by a reliance on either unsafe, non-repr
ArXiv cs.AI 📄 Paper 1w ago
WebChain: A Large-Scale Human-Annotated Dataset of Real-World Web Interaction Traces
arXiv:2603.05295v3 Announce Type: replace Abstract: We introduce WebChain, the largest open-source dataset of human-annotated trajectories on real-world website
ArXiv cs.AI 📄 Paper 1w ago
A Survey of Multimodal Mathematical Reasoning: From Perception, Alignment to Reasoning
arXiv:2603.08291v3 Announce Type: replace Abstract: Multimodal Mathematical Reasoning (MMR) has recently attracted increasing attention for its capability to so
ArXiv cs.AI 📄 Paper 1w ago
Reasoning Graphs: Self-Improving, Deterministic RAG through Evidence-Centric Feedback
arXiv:2604.07595v2 Announce Type: replace Abstract: Language model agents reason from scratch on every query, discarding their chain of thought after each run.
ArXiv cs.AI 📄 Paper 1w ago
Pictorial and apictorial polygonal jigsaw puzzles from arbitrary number of crossing cuts
arXiv:2008.07644v3 Announce Type: replace-cross Abstract: Jigsaw puzzle solving, the problem of constructing a coherent whole from a set of non-overlapping unor
ArXiv cs.AI 📄 Paper 1w ago
Prompt Evolution for Generative AI: A Classifier-Guided Approach
arXiv:2305.16347v2 Announce Type: replace-cross Abstract: Synthesis of digital artifacts conditioned on user prompts has become an important paradigm facilitati