📰 Reads

134,258 articles · Updated every 3 hours

ArXiv cs.AI 📄 Paper 5d ago
Conditional misalignment: common interventions can hide emergent misalignment behind contextual triggers
arXiv:2604.25891v1 Announce Type: cross Abstract: Finetuning a language model can lead to emergent misalignment (EM) [Betley et al., 2025b]. Models trained on a
ArXiv cs.AI 📄 Paper 5d ago
Three Models of RLHF Annotation: Extension, Evidence, and Authority
arXiv:2604.25895v1 Announce Type: cross Abstract: Preference-based alignment methods, most prominently Reinforcement Learning with Human Feedback (RLHF), use th
ArXiv cs.AI 📄 Paper 5d ago
TSN-Affinity: Similarity-Driven Parameter Reuse for Continual Offline Reinforcement Learning
arXiv:2604.25898v1 Announce Type: cross Abstract: Continual offline reinforcement learning (CORL) aims to learn a sequence of tasks from datasets collected over
ArXiv cs.AI 📄 Paper 5d ago
Toward a Functional Geometric Algebra for Natural Language Semantics
arXiv:2604.25902v1 Announce Type: cross Abstract: Distributional and neural approaches to natural language semantics have been built almost exclusively on conve
ArXiv cs.AI 📄 Paper 5d ago
How Fast Should a Model Commit to Supervision? Training Reasoning Models on the Tsallis Loss Continuum
arXiv:2604.25907v1 Announce Type: cross Abstract: Adapting reasoning models to new tasks during post-training with only output-level supervision stalls under re
ArXiv cs.AI 📄 Paper 5d ago
Generative AI Carries Non-Democratic Biases and Stereotypes: Representation of Women, Black Individuals, Age Groups, and People with Disability in AI-Generated Images across Occupations
arXiv:2409.13869v2 Announce Type: replace Abstract: In this study, I investigate how generative artificial intelligence (AI) systems reproduce and reinforce soc
ArXiv cs.AI 📄 Paper 5d ago
BayesL: a Logical Framework for the Verification of Bayesian Networks
arXiv:2506.23773v2 Announce Type: replace Abstract: Modern explainable AI still struggles with a fundamental gap: although Bayesian networks (BNs) provide trans
ArXiv cs.AI 📄 Paper 5d ago
AInstein: Can LLMs Solve Research Problems From Parametric Memory Alone?
arXiv:2510.05432v2 Announce Type: replace Abstract: Can large language models solve AI research problems using only their parametric knowledge, without fine-tun
ArXiv cs.AI 📄 Paper 5d ago
Aligning Deep Implicit Preferences by Learning to Reason Defensively
arXiv:2510.11194v2 Announce Type: replace Abstract: Personalized alignment is crucial for enabling Large Language Models (LLMs) to engage effectively in user-ce
ArXiv cs.AI 📄 Paper 5d ago
MPR-GUI: Benchmarking and Enhancing Multilingual Perception and Reasoning in GUI Agents
arXiv:2512.00756v2 Announce Type: replace Abstract: Large Vision-Language Models (LVLMs) have shown strong potential as multilingual Graphical User Interface (G
ArXiv cs.AI 📄 Paper 5d ago
GlimpRouter: Efficient Collaborative Inference by Glimpsing One Token of Thoughts
arXiv:2601.05110v3 Announce Type: replace Abstract: Large Reasoning Models (LRMs) achieve remarkable performance by explicitly generating multi-step chains of t
ArXiv cs.AI 📄 Paper 5d ago
ReCreate: Reasoning and Creating Domain Agents Driven by Experience
arXiv:2601.11100v2 Announce Type: replace Abstract: Large Language Model agents are reshaping the industrial landscape. However, most practical agents remain hu
ArXiv cs.AI 📄 Paper 5d ago
Exploring Reasoning Reward Model for Agents
arXiv:2601.22154v2 Announce Type: replace Abstract: Agentic Reinforcement Learning (Agentic RL) has achieved notable success in enabling agents to perform compl
ArXiv cs.AI 📄 Paper 5d ago
DockSmith: Scaling Reliable Coding Environments via an Agentic Docker Builder
arXiv:2602.00592v2 Announce Type: replace Abstract: Reliable Docker-based environment construction is a dominant bottleneck for scaling execution-grounded train
ArXiv cs.AI 📄 Paper 5d ago
NeuroHex: A Brain-Inspired Hex Coordinate System to Enable Highly Computationally-Efficient World Models for Continuous Online-Adaptive Learning
arXiv:2603.00376v3 Announce Type: replace Abstract: NeuroHex is a brain-inspired hexagonal coordinate system designed to support highly efficient world models a
ArXiv cs.AI 📄 Paper 5d ago
SciDER: Scientific Data-centric End-to-end Researcher
arXiv:2603.01421v2 Announce Type: replace Abstract: Automated scientific discovery with large language models is transforming the research lifecycle from ideati
ArXiv cs.AI 📄 Paper 5d ago
Why Do LLM-based Web Agents Fail? A Hierarchical Planning Perspective
arXiv:2603.14248v2 Announce Type: replace Abstract: Large language model (LLM) web agents are increasingly used for web navigation but remain far from human rel
ArXiv cs.AI 📄 Paper 5d ago
Agent Lifecycle Toolkit (ALTK): Reusable Middleware Components for Robust AI Agents
arXiv:2603.15473v2 Announce Type: replace Abstract: As AI agents move from demos into enterprise deployments, their failure modes become consequential: a misint
ArXiv cs.AI 📄 Paper 5d ago
Domain-Independent Dynamic Programming with Constraint Propagation
arXiv:2603.16648v2 Announce Type: replace Abstract: There are two prevalent model-based paradigms for combinatorial problems: 1) state-based representations, su
ArXiv cs.AI 📄 Paper 5d ago
Contrast-Enhanced Gating in GRUs for Robust Low-Data Sequence Learning
arXiv:2402.09034v3 Announce Type: replace-cross Abstract: Activation functions govern how recurrent networks regulate and transmit information across temporal d