📰 Reads

110,784 articles · Updated every 3 hours

FASTER: Value-Guided Sampling for Fast RL

arXiv:2604.19730v1 Announce Type: cross Abstract: Some of the most performant reinforcement learning algorithms today can be prohibitively expensive as they use

ArXiv cs.AI 📄 Paper 10h ago

UniT: Toward a Unified Physical Language for Human-to-Humanoid Policy Learning and World Modeling

arXiv:2604.19734v1 Announce Type: cross Abstract: Scaling humanoid foundation models is bottlenecked by the scarcity of robotic data. While massive egocentric h

ArXiv cs.AI 📄 Paper 10h ago

Generalization at the Edge of Stability

arXiv:2604.19740v1 Announce Type: cross Abstract: Training modern neural networks often relies on large learning rates, operating at the edge of stability, wher

ArXiv cs.AI 📄 Paper 10h ago

Conjuring Semantic Similarity

arXiv:2410.16431v4 Announce Type: replace Abstract: The semantic similarity between sample expressions measures the distance between their latent 'meaning'. The

ArXiv cs.AI 📄 Paper 10h ago

User Simulation in the Era of Generative AI: User Modeling, Synthetic Data Generation, and System Evaluation

arXiv:2501.04410v2 Announce Type: replace Abstract: User simulation is an emerging interdisciplinary topic with multiple critical applications in the era of Gen

ArXiv cs.AI 📄 Paper 10h ago

Epistemic Skills: Reasoning about Knowledge and Oblivion

arXiv:2504.01733v4 Announce Type: replace Abstract: This paper presents a class of epistemic logics that captures the dynamics of acquiring knowledge and descen

ArXiv cs.AI 📄 Paper 10h ago

Memory Assignment for Finite-Memory Strategies in Adversarial Patrolling Games

arXiv:2505.14137v2 Announce Type: replace Abstract: Adversarial Patrolling games form a subclass of Security games where a Defender moves between locations, gua

ArXiv cs.AI 📄 Paper 10h ago

MRS: Multi-Resolution Skills for HRL Agents

arXiv:2505.21410v2 Announce Type: replace Abstract: Hierarchical reinforcement learning (HRL) decomposes the policy into a manager and a worker, enabling long-h

ArXiv cs.AI 📄 Paper 10h ago

SEAT: Sparse Entity-Aware Tuning for Knowledge Adaptation while Preserving Epistemic Abstention

arXiv:2506.14387v3 Announce Type: replace Abstract: Adapting LLMs with new knowledge is increasingly important, but standard fine-tuning often erodes aligned ep

ArXiv cs.AI 📄 Paper 10h ago

GRAIL: Learning to Interact with Large Knowledge Graphs for Retrieval Augmented Reasoning

arXiv:2508.05498v2 Announce Type: replace Abstract: Large Language Models (LLMs) integrated with Retrieval-Augmented Generation (RAG) techniques have exhibited

ArXiv cs.AI 📄 Paper 10h ago

GeoLaux: A Benchmark for Evaluating MLLMs' Geometry Performance on Long-Step Problems Requiring Auxiliary Lines

arXiv:2508.06226v2 Announce Type: replace Abstract: Geometry problem solving (GPS) poses significant challenges for Multimodal Large Language Models (MLLMs) in

ArXiv cs.AI 📄 Paper 10h ago

VideoAgent: Personalized Synthesis of Scientific Videos

arXiv:2509.11253v2 Announce Type: replace Abstract: The technical complexity of research papers often limits their reach, necessitating more accessible formats

ArXiv cs.AI 📄 Paper 10h ago

RepIt: Steering Language Models with Concept-Specific Refusal Vectors

arXiv:2509.13281v5 Announce Type: replace Abstract: Current safety evaluations of language models rely on benchmark-based assessments that may miss localized vu

ArXiv cs.AI 📄 Paper 10h ago

How to Teach Large Multimodal Models New Skills

arXiv:2510.08564v2 Announce Type: replace Abstract: How can we teach large multimodal models (LMMs) new skills without erasing prior abilities? We study sequent

ArXiv cs.AI 📄 Paper 10h ago

StepFly: Agentic Troubleshooting Guide Automation for Incident Diagnosis

arXiv:2510.10074v2 Announce Type: replace Abstract: Effective incident management in large-scale IT systems relies on troubleshooting guides (TSGs), but their m

ArXiv cs.AI 📄 Paper 10h ago

Chain-of-Thought as a Lens: Evaluating Structured Reasoning Alignment between Human Preferences and Large Language Models

arXiv:2511.06168v3 Announce Type: replace Abstract: This paper primarily demonstrates a method to quantitatively assess the alignment between multi-step, struct

ArXiv cs.AI 📄 Paper 10h ago

TROJail: Trajectory-Level Optimization for Multi-Turn Large Language Model Jailbreaks with Process Rewards

arXiv:2512.07761v3 Announce Type: replace Abstract: Large language models have seen widespread adoption, yet they remain vulnerable to multi-turn jailbreak atta

ArXiv cs.AI 📄 Paper 10h ago

Beyond Itinerary Planning-A Real-World Benchmark for Multi-Turn and Tool-Using Travel Tasks

arXiv:2512.22673v3 Announce Type: replace Abstract: Travel planning is a natural real-world task to test large language models' (LLMs) planning and tool-use abi

ArXiv cs.AI 📄 Paper 10h ago

SAGE-32B: Agentic Reasoning via Iterative Distillation

arXiv:2601.04237v2 Announce Type: replace Abstract: We demonstrate SAGE-32B, a 32 billion parameter language model that focuses on agentic reasoning and long ra

ArXiv cs.AI 📄 Paper 10h ago

Reasoning Over Space: Enabling Geographic Reasoning for LLM-Based Generative Next POI Recommendation

arXiv:2601.04562v2 Announce Type: replace Abstract: Generative recommendation with large language models (LLMs) reframes prediction as sequence generation, yet

ArXiv cs.AI 📄 Paper 10h ago

ViDoRe V3: A Comprehensive Evaluation of Retrieval Augmented Generation in Complex Real-World Scenarios

arXiv:2601.08620v2 Announce Type: replace Abstract: Retrieval-Augmented Generation (RAG) pipelines must address challenges beyond simple single-document retriev

ArXiv cs.AI 📄 Paper 10h ago

BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search

arXiv:2601.11037v2 Announce Type: replace Abstract: RL-based agentic search enables LLMs to solve complex questions via dynamic planning and external search. Wh

ArXiv cs.AI 📄 Paper 10h ago

Failure Modes in Multi-Hop QA: The Weakest Link Effect and the Recognition Bottleneck

arXiv:2601.12499v2 Announce Type: replace Abstract: Despite scaling to massive context windows, Large Language Models (LLMs) struggle with multi-hop reasoning d

ArXiv cs.AI 📄 Paper 10h ago

Sentipolis: Emotion-Aware Agents for Social Simulations

arXiv:2601.18027v2 Announce Type: replace Abstract: LLM agents are increasingly used for social simulation, yet emotion is often treated as a transient cue, cau