📰 Reads

arXiv:2604.25860v1 Announce Type: cross Abstract: Machine-generated text (MGT) detection requires identifying structurally invariant signals across generation m

ArXiv cs.AI 📄 Paper 5d ago

RESTestBench: A Benchmark for Evaluating the Effectiveness of LLM-Generated REST API Test Cases from NL Requirements

arXiv:2604.25862v1 Announce Type: cross Abstract: Existing REST API testing tools are typically evaluated using code coverage and crash-based fault metrics. How

ArXiv cs.AI 📄 Paper 5d ago

When Errors Can Be Beneficial: A Categorization of Imperfect Rewards for Policy Gradient

arXiv:2604.25872v1 Announce Type: cross Abstract: Training language models via reinforcement learning often relies on imperfect proxy rewards, since ground trut

ArXiv cs.AI 📄 Paper 5d ago

No Pedestrian Left Behind: Real-Time Detection and Tracking of Vulnerable Road Users for Adaptive Traffic Signal Control

arXiv:2604.25887v1 Announce Type: cross Abstract: Current pedestrian crossing signals operate on fixed timing without adjustment to pedestrian behavior, which c

ArXiv cs.AI 📄 Paper 5d ago

Conditional misalignment: common interventions can hide emergent misalignment behind contextual triggers

arXiv:2604.25891v1 Announce Type: cross Abstract: Finetuning a language model can lead to emergent misalignment (EM) [Betley et al., 2025b]. Models trained on a

ArXiv cs.AI 📄 Paper 5d ago

Three Models of RLHF Annotation: Extension, Evidence, and Authority

arXiv:2604.25895v1 Announce Type: cross Abstract: Preference-based alignment methods, most prominently Reinforcement Learning with Human Feedback (RLHF), use th

ArXiv cs.AI 📄 Paper 5d ago

TSN-Affinity: Similarity-Driven Parameter Reuse for Continual Offline Reinforcement Learning

arXiv:2604.25898v1 Announce Type: cross Abstract: Continual offline reinforcement learning (CORL) aims to learn a sequence of tasks from datasets collected over

ArXiv cs.AI 📄 Paper 5d ago

Toward a Functional Geometric Algebra for Natural Language Semantics

arXiv:2604.25902v1 Announce Type: cross Abstract: Distributional and neural approaches to natural language semantics have been built almost exclusively on conve

ArXiv cs.AI 📄 Paper 5d ago

How Fast Should a Model Commit to Supervision? Training Reasoning Models on the Tsallis Loss Continuum

arXiv:2604.25907v1 Announce Type: cross Abstract: Adapting reasoning models to new tasks during post-training with only output-level supervision stalls under re

ArXiv cs.AI 📄 Paper 5d ago

Generative AI Carries Non-Democratic Biases and Stereotypes: Representation of Women, Black Individuals, Age Groups, and People with Disability in AI-Generated Images across Occupations

arXiv:2409.13869v2 Announce Type: replace Abstract: In this study, I investigate how generative artificial intelligence (AI) systems reproduce and reinforce soc

ArXiv cs.AI 📄 Paper 5d ago

BayesL: a Logical Framework for the Verification of Bayesian Networks

arXiv:2506.23773v2 Announce Type: replace Abstract: Modern explainable AI still struggles with a fundamental gap: although Bayesian networks (BNs) provide trans

ArXiv cs.AI 📄 Paper 5d ago

AInstein: Can LLMs Solve Research Problems From Parametric Memory Alone?

arXiv:2510.05432v2 Announce Type: replace Abstract: Can large language models solve AI research problems using only their parametric knowledge, without fine-tun

ArXiv cs.AI 📄 Paper 5d ago

Aligning Deep Implicit Preferences by Learning to Reason Defensively

arXiv:2510.11194v2 Announce Type: replace Abstract: Personalized alignment is crucial for enabling Large Language Models (LLMs) to engage effectively in user-ce

ArXiv cs.AI 📄 Paper 5d ago

MPR-GUI: Benchmarking and Enhancing Multilingual Perception and Reasoning in GUI Agents

arXiv:2512.00756v2 Announce Type: replace Abstract: Large Vision-Language Models (LVLMs) have shown strong potential as multilingual Graphical User Interface (G

ArXiv cs.AI 📄 Paper 5d ago

GlimpRouter: Efficient Collaborative Inference by Glimpsing One Token of Thoughts

arXiv:2601.05110v3 Announce Type: replace Abstract: Large Reasoning Models (LRMs) achieve remarkable performance by explicitly generating multi-step chains of t

ArXiv cs.AI 📄 Paper 5d ago

ReCreate: Reasoning and Creating Domain Agents Driven by Experience

arXiv:2601.11100v2 Announce Type: replace Abstract: Large Language Model agents are reshaping the industrial landscape. However, most practical agents remain hu

ArXiv cs.AI 📄 Paper 5d ago

Exploring Reasoning Reward Model for Agents

arXiv:2601.22154v2 Announce Type: replace Abstract: Agentic Reinforcement Learning (Agentic RL) has achieved notable success in enabling agents to perform compl

ArXiv cs.AI 📄 Paper 5d ago

DockSmith: Scaling Reliable Coding Environments via an Agentic Docker Builder

arXiv:2602.00592v2 Announce Type: replace Abstract: Reliable Docker-based environment construction is a dominant bottleneck for scaling execution-grounded train

ArXiv cs.AI 📄 Paper 5d ago

NeuroHex: A Brain-Inspired Hex Coordinate System to Enable Highly Computationally-Efficient World Models for Continuous Online-Adaptive Learning

arXiv:2603.00376v3 Announce Type: replace Abstract: NeuroHex is a brain-inspired hexagonal coordinate system designed to support highly efficient world models a

ArXiv cs.AI 📄 Paper 5d ago

SciDER: Scientific Data-centric End-to-end Researcher

arXiv:2603.01421v2 Announce Type: replace Abstract: Automated scientific discovery with large language models is transforming the research lifecycle from ideati

ArXiv cs.AI 📄 Paper 5d ago

Why Do LLM-based Web Agents Fail? A Hierarchical Planning Perspective

arXiv:2603.14248v2 Announce Type: replace Abstract: Large language model (LLM) web agents are increasingly used for web navigation but remain far from human rel

ArXiv cs.AI 📄 Paper 5d ago

Agent Lifecycle Toolkit (ALTK): Reusable Middleware Components for Robust AI Agents

arXiv:2603.15473v2 Announce Type: replace Abstract: As AI agents move from demos into enterprise deployments, their failure modes become consequential: a misint

ArXiv cs.AI 📄 Paper 5d ago

Domain-Independent Dynamic Programming with Constraint Propagation

arXiv:2603.16648v2 Announce Type: replace Abstract: There are two prevalent model-based paradigms for combinatorial problems: 1) state-based representations, su

ArXiv cs.AI 📄 Paper 5d ago

Contrast-Enhanced Gating in GRUs for Robust Low-Data Sequence Learning

arXiv:2402.09034v3 Announce Type: replace-cross Abstract: Activation functions govern how recurrent networks regulate and transmit information across temporal d