📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 7,966 articles · Updated every 3 hours

arXiv:2509.18633v4 Announce Type: replace Abstract: We present an open-source Python framework for modelling cascading physical climate risk in a spatial supply

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Multiplayer Nash Preference Optimization

arXiv:2509.23102v3 Announce Type: replace Abstract: Reinforcement learning from human feedback (RLHF) has emerged as the standard paradigm for aligning large la

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

arXiv:2509.25454v4 Announce Type: replace Abstract: Although RLVR has become an essential component for developing advanced reasoning skills in language models,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Hypothesis-Driven Feature Manifold Analysis in LLMs via Supervised Multi-Dimensional Scaling

arXiv:2510.01025v2 Announce Type: replace Abstract: The linear representation hypothesis states that language models (LMs) encode concepts as directions in thei

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

TS-Agent: Understanding and Reasoning Over Raw Time Series via Iterative Insight Gathering

arXiv:2510.07432v2 Announce Type: replace Abstract: Large language models (LLMs) exhibit strong symbolic and compositional reasoning, yet they struggle with tim

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

DRIFT: Decompose, Retrieve, Illustrate, then Formalize Theorems

arXiv:2510.10815v4 Announce Type: replace Abstract: Automating the formalization of mathematical statements for theorem proving remains a major challenge for La

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Toward Virtuous Reinforcement Learning: A Critique and Roadmap

arXiv:2512.04246v2 Announce Type: replace Abstract: This paper critiques common patterns in machine ethics for Reinforcement Learning (RL) and argues for a virt

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 3w ago

Robust AI Security and Alignment: A Sisyphean Endeavor?

arXiv:2512.10100v2 Announce Type: replace Abstract: This manuscript establishes information-theoretic limitations for robustness of AI security and alignment by

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3w ago

EchoTrail-GUI: Building Actionable Memory for GUI Agents via Critic-Guided Self-Exploration

arXiv:2512.19396v2 Announce Type: replace Abstract: Contemporary GUI agents, while increasingly capable due to advances in Large Vision-Language Models (VLMs),

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

RL-VLA$^3$: A Flexible and Asynchronous Reinforcement Learning Framework for VLA Training

arXiv:2602.05765v2 Announce Type: replace Abstract: Reinforcement learning (RL) has emerged as a critical paradigm for post-training Vision-Language-Action (VLA

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Emergent Introspection in AI is Content-Agnostic

arXiv:2603.05414v2 Announce Type: replace Abstract: Introspection is a foundational cognitive ability, but its mechanism is not well understood. Recent work has

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

AgentHER: Hindsight Experience Replay for LLM Agent Trajectory Relabeling

arXiv:2603.21357v2 Announce Type: replace Abstract: LLM agents fail on the majority of real-world tasks -- GPT-4o succeeds on fewer than 15% of WebArena navigat

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

ThinkTwice: Jointly Optimizing Large Language Models for Reasoning and Self-Refinement

arXiv:2604.01591v2 Announce Type: replace Abstract: We introduce ThinkTwice, a simple two-phase framework that jointly optimizes LLMs to solve reasoning problem

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Sim-CLIP: Unsupervised Siamese Adversarial Fine-Tuning for Robust and Semantically-Rich Vision-Language Models

arXiv:2407.14971v3 Announce Type: replace-cross Abstract: Vision-Language Models (VLMs) rely heavily on pretrained vision encoders to support downstream tasks s

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

TransAgent: Enhancing LLM-Based Code Translation via Fine-Grained Execution Alignment

arXiv:2409.19894v5 Announce Type: replace-cross Abstract: Code translation transforms code between programming languages while preserving functionality, which i

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Cobblestone: A Divide-and-Conquer Approach for Automating Formal Verification

arXiv:2410.19940v4 Announce Type: replace-cross Abstract: Formal verification using proof assistants, such as Coq, is an effective way of improving software qua

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap

arXiv:2410.20791v3 Announce Type: replace-cross Abstract: The rapid expansion of foundation models (FMs), such as large language models (LLMs), has given rise t

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Aligned Vector Quantization for Edge-Cloud Collabrative Vision-Language Models

arXiv:2411.05961v2 Announce Type: replace-cross Abstract: Vision Language Models (VLMs) are central to Visual Question Answering (VQA) systems and are typically

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Retrieval Augmented Time Series Forecasting

arXiv:2411.08249v2 Announce Type: replace-cross Abstract: Retrieval-augmented generation (RAG) is a central component of modern LLM systems, particularly in sce

ArXiv cs.AI 📐 ML Fundamentals 📄 Paper ⚡ AI Lesson 3w ago

VarDrop: Enhancing Training Efficiency by Reducing Variate Redundancy in Periodic Time Series Forecasting

arXiv:2501.14183v3 Announce Type: replace-cross Abstract: Variate tokenization, which independently embeds each variate as separate tokens, has achieved remarka

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

ENTER: Event Based Interpretable Reasoning for VideoQA

arXiv:2501.14194v2 Announce Type: replace-cross Abstract: In this paper, we present ENTER, an interpretable Video Question Answering (VideoQA) system based on e

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 3w ago

An Innovative Next Activity Prediction Using Process Entropy and Dynamic Attribute-Wise-Transformer in Predictive Business Process Monitoring

arXiv:2502.10573v2 Announce Type: replace-cross Abstract: Next activity prediction in predictive business process monitoring is crucial for operational efficien

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification

arXiv:2502.17421v3 Announce Type: replace-cross Abstract: As Large Language Models (LLMs) can now process extremely long contexts, efficient inference over thes

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Hedging and Non-Affirmation: Quantifying LLM Alignment on Questions of Human Rights

arXiv:2502.19463v2 Announce Type: replace-cross Abstract: Hedging and non-affirmation are behaviors exhibited by large language models (LLMs) that limit the cle