📰 Reads

134,258 articles · Updated every 3 hours

ArXiv cs.AI 📄 Paper 5d ago
Sparse Personalized Text Generation with Multi-Trajectory Reasoning
arXiv:2604.24996v1 Announce Type: new Abstract: As Large Language Models (LLMs) advance, personalization has become a key mechanism for tailoring outputs to ind
ArXiv cs.AI 📄 Paper 5d ago
Toward a Science of Intent: Closure Gaps and Delegation Envelopes for Open-World AI Agents
arXiv:2604.25000v1 Announce Type: new Abstract: Recent work has framed intelligence in verifiable tasks as reducing time-to-solution through learned structure a
ArXiv cs.AI 📄 Paper 5d ago
Leverage Laws: A Per-Task Framework for Human-Agent Collaboration
arXiv:2604.25040v1 Announce Type: new Abstract: We propose a per-task leverage ratio for human-agent collaboration: human work displaced by an agent, divided by
ArXiv cs.AI 📄 Paper 5d ago
Evaluating Risks in Weak-to-Strong Alignment: A Bias-Variance Perspective
arXiv:2604.25077v1 Announce Type: new Abstract: Weak-to-strong alignment offers a promising route to scalable supervision, but it can fail when a strong model b
ArXiv cs.AI 📄 Paper 5d ago
Agentic Architect: An Agentic AI Framework for Architecture Design Exploration and Optimization
arXiv:2604.25083v1 Announce Type: new Abstract: Rapid advances in Large Language Models (LLMs) create new opportunities by enabling efficient exploration of bro
ArXiv cs.AI 📄 Paper 5d ago
Cooperate to Compete: Strategic Coordination in Multi-Agent Conquest
arXiv:2604.25088v1 Announce Type: new Abstract: Language Model (LM)-based agents remain largely untested in mixed-motive settings where agents must leverage sho
ArXiv cs.AI 📄 Paper 5d ago
Doing More With Less: Revisiting the Effectiveness of LLM Pruning for Test-Time Scaling
arXiv:2604.25098v1 Announce Type: new Abstract: While current Large Language Models (LLMs) exhibit remarkable reasoning capabilities through test-time compute s
ArXiv cs.AI 📄 Paper 5d ago
Semantic Layers for Reliable LLM-Powered Data Analytics: A Paired Benchmark of Accuracy and Hallucination Across Three Frontier Models
arXiv:2604.25149v1 Announce Type: new Abstract: LLMs deployed for natural-language querying of analytical databases suffer from two intertwined failures - incor
ArXiv cs.AI 📄 Paper 5d ago
Training Transformers as a Universal Computer
arXiv:2604.25166v1 Announce Type: new Abstract: We demonstrate that a small transformer can learn to execute programs in MicroPy, a simplified yet computational
ArXiv cs.AI 📄 Paper 5d ago
From Insight to Action: A Novel Framework for Interpretability-Guided Data Selection in Large Language Models
arXiv:2604.25167v1 Announce Type: new Abstract: While mechanistic interpretability tools like Sparse Autoencoders (SAEs) can uncover meaningful features within
ArXiv cs.AI 📄 Paper 5d ago
DATAREEL: Automated Data-Driven Video Story Generation with Animations
arXiv:2604.25220v1 Announce Type: new Abstract: Data videos are a powerful medium for visual data based storytelling, combining animated, chart-centric visualiz
ArXiv cs.AI 📄 Paper 5d ago
ValueAlpha: Agreement-Gated Stress Testing of LLM-Judged Investment Rationales Before Returns Are Observable
arXiv:2604.25224v1 Announce Type: new Abstract: Long-horizon investment decisions create a pre-realization evaluation problem: realized returns are the eventual
ArXiv cs.AI 📄 Paper 5d ago
AutoResearchBench: Benchmarking AI Agents on Complex Scientific Literature Discovery
arXiv:2604.25256v1 Announce Type: new Abstract: Autonomous scientific research is significantly advanced thanks to the development of AI agents. One key step in
ArXiv cs.AI 📄 Paper 5d ago
Plausible but Wrong: A case study on Agentic Failures in Astrophysical Workflows
arXiv:2604.25345v1 Announce Type: new Abstract: Agentic AI systems are increasingly being integrated into scientific workflows, yet their behavior under realist
ArXiv cs.AI 📄 Paper 5d ago
Multi-action Tangled Program Graphs for Multi-task Reinforcement Learning with Continuous Control
arXiv:2604.25369v1 Announce Type: new Abstract: Over the past few decades, machine learning has been widely used to learn complex tasks. Reinforcement Learning
ArXiv cs.AI 📄 Paper 5d ago
JURY-RL: Votes Propose, Proofs Dispose for Label-Free RLVR
arXiv:2604.25419v1 Announce Type: new Abstract: Reinforcement learning with verifiable rewards (RLVR) enhances the reasoning of large language models (LLMs), bu
ArXiv cs.AI 📄 Paper 5d ago
PI-TTA: Physics-Informed Source-Free Test-Time Adaptation for Robust Human Activity Recognition on Mobile Devices
arXiv:2604.25435v1 Announce Type: new Abstract: Source-free test-time adaptation (TTA) is appealing for mobile and wearable sensing because it enables on-device
ArXiv cs.AI 📄 Paper 5d ago
SciEval: A Benchmark for Automatic Evaluation of K-12 Science Instructional Materials
arXiv:2604.25472v1 Announce Type: new Abstract: The need to evaluate instructional materials for K-12 science education has become increasingly important, as mo
ArXiv cs.AI 📄 Paper 5d ago
Improving Zero-Shot Offline RL via Behavioral Task Sampling
arXiv:2604.25496v1 Announce Type: new Abstract: Offline zero-shot reinforcement learning (RL) aims to learn agents that optimize unseen reward functions without
ArXiv cs.AI 📄 Paper 5d ago
PHISHREV: A Hybrid Machine Learning and Post-Hoc Non-monotonic Reasoning Framework for Context-Aware Phishing Website Classification
arXiv:2604.25512v1 Announce Type: new Abstract: Phishing detection systems predominantly rely on statistical machine learning models, which often lack conte