📰 ArXiv cs.AI
Articles from ArXiv cs.AI · 4,742 articles · Updated every 3 hours
ArXiv cs.AI
📄 Paper
2d ago
When More Thinking Hurts: Overthinking in LLM Test-Time Compute Scaling
arXiv:2604.10739v1 Announce Type: new Abstract: Scaling test-time compute through extended chains of thought has become a dominant paradigm for improving large
ArXiv cs.AI
📄 Paper
2d ago
Learning Preference-Based Objectives from Clinical Narratives for Sequential Treatment Decision-Making
arXiv:2604.10783v1 Announce Type: new Abstract: Designing reward functions remains a central challenge in reinforcement learning (RL) for healthcare, where outc
ArXiv cs.AI
📄 Paper
2d ago
TorchUMM: A Unified Multimodal Model Codebase for Evaluation, Analysis, and Post-training
arXiv:2604.10784v1 Announce Type: new Abstract: Recent advances in unified multimodal models (UMMs) have led to a proliferation of architectures capable of unde
ArXiv cs.AI
📄 Paper
2d ago
CheeseBench: Evaluating Large Language Models on Rodent Behavioral Neuroscience Paradigms
arXiv:2604.10825v1 Announce Type: new Abstract: We introduce CheeseBench, a benchmark that evaluates large language models (LLMs) on nine classical behavioral n
ArXiv cs.AI
📄 Paper
2d ago
Your Model Diversity, Not Method, Determines Reasoning Strategy
arXiv:2604.10827v1 Announce Type: new Abstract: Compute scaling for LLM reasoning requires allocating budget between exploring solution approaches ($breadth$) a
ArXiv cs.AI
📄 Paper
2d ago
A Benchmark for Gap and Overlap Analysis as a Test of KG Task Readiness
arXiv:2604.10853v1 Announce Type: new Abstract: Task-oriented evaluation of knowledge graph (KG) quality increasingly asks whether an ontology-based representat
ArXiv cs.AI
📄 Paper
2d ago
Beyond Statistical Co-occurrence: Unlocking Intrinsic Semantics for Tabular Data Clustering
arXiv:2604.10865v1 Announce Type: new Abstract: Deep Clustering (DC) has emerged as a powerful tool for tabular data analysis in real-world domains like finance
ArXiv cs.AI
📄 Paper
2d ago
A Quantitative Definition of Intelligence
arXiv:2604.10873v1 Announce Type: new Abstract: We propose an operational, quantitative definition of intelligence for arbitrary physical systems. The intellige
ArXiv cs.AI
📄 Paper
2d ago
ZoomR: Memory Efficient Reasoning through Multi-Granularity Key Value Retrieval
arXiv:2604.10898v1 Announce Type: new Abstract: Large language models (LLMs) have shown great performance on complex reasoning tasks but often require generatin
ArXiv cs.AI
📄 Paper
2d ago
CASK: Core-Aware Selective KV Compression for Reasoning Traces
arXiv:2604.10900v1 Announce Type: new Abstract: In large language models performing long-form reasoning, the KV cache grows rapidly with decode length, creating
ArXiv cs.AI
📄 Paper
2d ago
Reasoning as Data: Representation-Computation Unity and Its Implementation in a Domain-Algebraic Inference Engine
arXiv:2604.10908v1 Announce Type: new Abstract: Every existing knowledge system separates storage from computation. We show this separation is unnecessary and e
ArXiv cs.AI
📄 Paper
2d ago
EvoNash-MARL: A Closed-Loop Multi-Agent Reinforcement Learning Framework for Medium-Horizon Equity Allocation
arXiv:2604.10911v1 Announce Type: new Abstract: Medium-to-long-horizon stock allocation presents significant challenges due to weak predictive structures, non-st
ArXiv cs.AI
📄 Paper
2d ago
CSPO: Alleviating Reward Ambiguity for Structured Table-to-LaTeX Generation
arXiv:2604.10918v1 Announce Type: new Abstract: Tables contain rich structured information, yet when stored as images their contents remain "locked" within pixe
ArXiv cs.AI
📄 Paper
2d ago
RAG-KT: Cross-platform Explainable Knowledge Tracing with Multi-view Fusion Retrieval Generation
arXiv:2604.10960v1 Announce Type: new Abstract: Knowledge Tracing (KT) infers a student's knowledge state from past interactions to predict future performance.
ArXiv cs.AI
📄 Paper
2d ago
Delving Aleatoric Uncertainty in Medical Image Segmentation via Vision Foundation Models
arXiv:2604.10963v1 Announce Type: new Abstract: Medical image segmentation supports clinical workflows by precisely delineating anatomical structures and lesion
ArXiv cs.AI
📄 Paper
2d ago
CFMS: A Coarse-to-Fine Multimodal Synthesis Framework for Enhanced Tabular Reasoning
arXiv:2604.10973v1 Announce Type: new Abstract: Reasoning over tabular data is a crucial capability for tasks like question answering and fact verification, as
ArXiv cs.AI
📄 Paper
2d ago
ATANT v1.1: Positioning Continuity Evaluation Against Memory, Long-Context, and Agentic-Memory Benchmarks
arXiv:2604.10981v1 Announce Type: new Abstract: ATANT v1.0 (arXiv:2604.06710) defined continuity as a system property with 7 required properties and introduced
ArXiv cs.AI
📄 Paper
2d ago
Back to the Barn with LLAMAs: Evolving Pretrained LLM Backbones in Finetuning Vision Language Models
arXiv:2604.10985v1 Announce Type: new Abstract: Vision-Language Models (VLMs) have rapidly advanced by leveraging powerful pre-trained Large Language Models (LL
ArXiv cs.AI
📄 Paper
2d ago
WebForge: Breaking the Realism-Reproducibility-Scalability Trilemma in Browser Agent Benchmark
arXiv:2604.10988v1 Announce Type: new Abstract: Existing browser agent benchmarks face a fundamental trilemma: real-website benchmarks lack reproducibility due
ArXiv cs.AI
📄 Paper
2d ago
MAFIG: Multi-agent Driven Formal Instruction Generation Framework
arXiv:2604.10989v1 Announce Type: new Abstract: Emergency situations in scheduling systems often trigger local functional failures that undermine system stabili
ArXiv cs.AI
📄 Paper
2d ago
Sanity Checks for Agentic Data Science
arXiv:2604.11003v1 Announce Type: new Abstract: Agentic data science (ADS) pipelines have grown rapidly in both capability and adoption, with systems such as Op
ArXiv cs.AI
📄 Paper
2d ago
Diffusion-CAM: Faithful Visual Explanations for dMLLMs
arXiv:2604.11005v1 Announce Type: new Abstract: While diffusion Multimodal Large Language Models (dMLLMs) have recently achieved remarkable strides in multimoda
ArXiv cs.AI
📄 Paper
2d ago
Min-$k$ Sampling: Decoupling Truncation from Temperature Scaling via Relative Logit Dynamics
arXiv:2604.11012v1 Announce Type: new Abstract: The quality of text generated by large language models depends critically on the decoding sampling strategy. Whi
ArXiv cs.AI
📄 Paper
2d ago
Introspective Diffusion Language Models
arXiv:2604.11035v1 Announce Type: new Abstract: Diffusion language models promise parallel generation, yet still lag behind autoregressive (AR) models in qualit