📰 Reads

114,258 articles · Updated every 3 hours

MIRROR: A Hierarchical Benchmark for Metacognitive Calibration in Large Language Models

arXiv:2604.19809v1 Announce Type: new Abstract: We introduce MIRROR, a benchmark comprising eight experiments across four metacognitive levels that evaluates wh

ArXiv cs.AI 📄 Paper 1d ago

The Existential Theory of Research: Why Discovery Is Hard

arXiv:2604.19810v1 Announce Type: new Abstract: Can scientific discovery be made arbitrarily easy by choosing the right representation, collecting enough data,

ArXiv cs.AI 📄 Paper 1d ago

Large Language Models Meet Biomedical Knowledge Graphs for Mechanistically Grounded Therapeutic Prioritization

arXiv:2604.19815v1 Announce Type: new Abstract: Drug repurposing is often framed as a candidate identification task, but existing approaches provide limited gui

ArXiv cs.AI 📄 Paper 1d ago

Emergence Transformer: Dynamical Temporal Attention Matters

arXiv:2604.19816v1 Announce Type: new Abstract: The Transformer, a breakthrough architecture in artificial intelligence, owes its success to the attention mecha

ArXiv cs.AI 📄 Paper 1d ago

JTPRO: A Joint Tool-Prompt Reflective Optimization Framework for Language Agents

arXiv:2604.19821v1 Announce Type: new Abstract: Large language model (LLM) agents augmented with external tools often struggle as the number of tools grows large and

ArXiv cs.AI 📄 Paper 1d ago

Forage V2: Knowledge Evolution and Transfer in Autonomous Agent Organizations

arXiv:2604.19837v1 Announce Type: new Abstract: Autonomous agents operating in open-world tasks -- where the completion boundary is not given in advance -- face

ArXiv cs.AI 📄 Paper 1d ago

Resolving space-sharing conflicts in road user interactions through uncertainty reduction: An active inference-based computational model

arXiv:2604.19838v1 Announce Type: new Abstract: Understanding how road users resolve space-sharing conflicts is important both for traffic safety and the safe d

ArXiv cs.AI 📄 Paper 1d ago

Deconstructing Superintelligence: Identity, Self-Modification and Différance

arXiv:2604.19845v2 Announce Type: new Abstract: Self-modification is often taken as constitutive of artificial superintelligence (SI), yet modification is a rel

ArXiv cs.AI 📄 Paper 1d ago

Learning When Not to Decide: A Framework for Overcoming Factual Presumptuousness in AI Adjudication

arXiv:2604.19895v1 Announce Type: new Abstract: A well-known limitation of AI systems is presumptuousness: the tendency of AI systems to provide confident answe

ArXiv cs.AI 📄 Paper 1d ago

CreativeGame: Toward Mechanic-Aware Creative Game Generation

arXiv:2604.19926v1 Announce Type: new Abstract: Large language models can generate plausible game code, but turning this capability into iterative creativ

ArXiv cs.AI 📄 Paper 1d ago

What Makes a Good AI Review? Concern-Level Diagnostics for AI Peer Review

arXiv:2604.19998v1 Announce Type: new Abstract: Evaluating AI-generated reviews by verdict agreement is widely recognized as insufficient, yet current alternati

ArXiv cs.AI 📄 Paper 1d ago

Separable Pathways for Causal Reasoning: How Architectural Scaffolding Enables Hypothesis-Space Restructuring in LLM Agents

arXiv:2604.20039v1 Announce Type: new Abstract: Causal discovery through experimentation and intervention is fundamental to robust problem solving. It requires

ArXiv cs.AI 📄 Paper 1d ago

From Fuzzy to Formal: Scaling Hospital Quality Improvement with AI

arXiv:2604.20055v1 Announce Type: new Abstract: Hospital Quality Improvement (QI) plays a critical role in optimizing healthcare delivery by translating high-le

ArXiv cs.AI 📄 Paper 1d ago

EvoAgent: An Evolvable Agent Framework with Skill Learning and Multi-Agent Delegation

arXiv:2604.20133v1 Announce Type: new Abstract: This paper proposes EvoAgent - an evolvable large language model (LLM) agent framework that integrates structure

ArXiv cs.AI 📄 Paper 1d ago

HiPO: Hierarchical Preference Optimization for Adaptive Reasoning in LLMs

arXiv:2604.20140v1 Announce Type: new Abstract: Direct Preference Optimization (DPO) is an effective framework for aligning large language models with human pre

ArXiv cs.AI 📄 Paper 1d ago

Stateless Decision Memory for Enterprise AI Agents

arXiv:2604.20158v1 Announce Type: new Abstract: Enterprise deployment of long-horizon decision agents in regulated domains (underwriting, claims adjudication, t

ArXiv cs.AI 📄 Paper 1d ago

Mol-Debate: Multi-Agent Debate Improves Structural Reasoning in Molecular Design

arXiv:2604.20254v1 Announce Type: new Abstract: Text-guided molecular design is a key capability for AI-driven drug discovery, yet it remains challenging to map

ArXiv cs.AI 📄 Paper 1d ago

Memory-Augmented LLM-based Multi-Agent System for Automated Feature Generation on Tabular Data

arXiv:2604.20261v1 Announce Type: new Abstract: Automated feature generation extracts informative features from raw tabular data without manual intervention and

ArXiv cs.AI 📄 Paper 1d ago

ActuBench: A Multi-Agent LLM Pipeline for Generation and Evaluation of Actuarial Reasoning Tasks

arXiv:2604.20273v1 Announce Type: new Abstract: We present ActuBench, a multi-agent LLM pipeline for the automated generation and evaluation of advanced actuari

ArXiv cs.AI 📄 Paper 1d ago

FSFM: A Biologically-Inspired Framework for Selective Forgetting of Agent Memory

arXiv:2604.20300v2 Announce Type: new Abstract: For LLM agents, memory management critically impacts efficiency, quality, and security. While much research focu

ArXiv cs.AI 📄 Paper 1d ago

Self-Awareness before Action: Mitigating Logical Inertia via Proactive Cognitive Awareness

arXiv:2604.20413v1 Announce Type: new Abstract: Large language models perform well on many reasoning tasks, yet they often lack awareness of whether their curre

ArXiv cs.AI 📄 Paper 1d ago

MedSkillAudit: A Domain-Specific Audit Framework for Medical Research Agent Skills

arXiv:2604.20441v1 Announce Type: new Abstract: Background: Agent skills are increasingly deployed as modular, reusable capability units in AI agent systems. Me

ArXiv cs.AI 📄 Paper 1d ago

Measuring the Machine: Evaluating Generative AI as Pluralist Sociotechnical Systems

arXiv:2604.20545v1 Announce Type: new Abstract: In measurement theory, instruments do not simply record reality; they help constitute what is observed. The same

ArXiv cs.AI 📄 Paper 1d ago

Self-Guided Plan Extraction for Instruction-Following Tasks with Goal-Conditional Reinforcement Learning

arXiv:2604.20601v1 Announce Type: new Abstract: We introduce SuperIgor, a framework for instruction-following tasks. Unlike prior methods that rely on predefine