📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 3,317 articles · Updated every 3 hours · View all reads

All ⚡ AI Lessons (13747) ArXiv cs.AI Dev.to · FORUM WEB Dev.to AI Forbes Innovation OpenAI News Medium · Programming

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2w ago

arXiv:2604.00788v1 Announce Type: new Abstract: This technical report presents methods developed by the UK AI Security Institute for assessing whether advanced

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

RefineRL: Advancing Competitive Programming with Self-Refinement Reinforcement Learning

arXiv:2604.00790v1 Announce Type: new Abstract: While large language models (LLMs) have demonstrated strong performance on complex reasoning tasks such as compe

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2w ago

Preference Guided Iterated Pareto Referent Optimisation for Accessible Route Planning

arXiv:2604.00795v1 Announce Type: new Abstract: We propose the Preference Guided Iterated Pareto Referent Optimisation (PG-IPRO) for urban route planning for pe

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2w ago

Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants

arXiv:2604.00842v1 Announce Type: new Abstract: Proactive agents that anticipate user needs and autonomously execute tasks hold great promise as digital assista

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Beyond Symbolic Solving: Multi Chain-of-Thought Voting for Geometric Reasoning in Large Language Models

arXiv:2604.00890v1 Announce Type: new Abstract: Geometric Problem Solving (GPS) remains at the heart of enhancing mathematical reasoning in large language model

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Experience as a Compass: Multi-agent RAG with Evolving Orchestration and Agent Prompts

arXiv:2604.00901v1 Announce Type: new Abstract: Multi-agent Retrieval-Augmented Generation (RAG), wherein each agent takes on a specific role, supports hard que

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

PsychAgent: An Experience-Driven Lifelong Learning Agent for Self-Evolving Psychological Counselor

arXiv:2604.00931v1 Announce Type: new Abstract: Existing methods for AI psychological counselors predominantly rely on supervised fine-tuning using static dialo

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

OmniMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory

arXiv:2604.01007v1 Announce Type: new Abstract: AI agents increasingly operate over extended time horizons, yet their ability to retain, organize, and recall mu

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Adversarial Moral Stress Testing of Large Language Models

arXiv:2604.01108v1 Announce Type: new Abstract: Evaluating the ethical robustness of large language models (LLMs) deployed in software systems remains challengi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Detecting Multi-Agent Collusion Through Multi-Agent Interpretability

arXiv:2604.01151v1 Announce Type: new Abstract: As LLM agents are increasingly deployed in multi-agent systems, they introduce risks of covert coordination that

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Therefore I am. I Think

arXiv:2604.01202v1 Announce Type: new Abstract: We consider the question: when a large language reasoning model makes a choice, did it think first and then deci

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2w ago

HippoCamp: Benchmarking Contextual Agents on Personal Computers

arXiv:2604.01221v1 Announce Type: new Abstract: We present HippoCamp, a new benchmark designed to evaluate agents' capabilities on multimodal file management. U

ArXiv cs.AI 🤖 AI Agents & Automation 📄 Paper ⚡ AI Lesson 2w ago

Agentic AI -- Physicist Collaboration in Experimental Particle Physics: A Proof-of-Concept Measurement with LEP Open Data

arXiv:2603.05735v2 Announce Type: cross Abstract: We present an AI agentic measurement of the thrust distribution in $e^{+}e^{-}$ collisions at $\sqrt{s}=91.2$~

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Two-Stage Optimizer-Aware Online Data Selection for Large Language Models

arXiv:2604.00001v1 Announce Type: cross Abstract: Gradient-based data selection offers a principled framework for estimating sample utility in large language mo

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Benchmark for Assessing Olfactory Perception of Large Language Models

arXiv:2604.00002v1 Announce Type: cross Abstract: Here we introduce the Olfactory Perception (OP) benchmark, designed to assess the capability of large language

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

A Reliability Evaluation of Hybrid Deterministic-LLM Based Approaches for Academic Course Registration PDF Information Extraction

arXiv:2604.00003v1 Announce Type: cross Abstract: This study evaluates the reliability of information extraction approaches from KRS documents using three strat

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

LinearARD: Linear-Memory Attention Distillation for RoPE Restoration

arXiv:2604.00004v1 Announce Type: cross Abstract: The extension of context windows in Large Language Models is typically facilitated by scaling positional encod

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Dynin-Omni: Omnimodal Unified Large Diffusion Language Model

arXiv:2604.00007v1 Announce Type: cross Abstract: We present Dynin-Omni, the first masked-diffusion-based omnimodal foundation model that unifies text, image, a

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

How Trustworthy Are LLM-as-Judge Ratings for Interpretive Responses? Implications for Qualitative Research Workflows

arXiv:2604.00008v1 Announce Type: cross Abstract: As qualitative researchers show growing interest in using automated tools to support interpretive analysis, a

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Eyla: Toward an Identity-Anchored LLM Architecture with Integrated Biological Priors -- Vision, Implementation Attempt, and Lessons from AI-Assisted Development

arXiv:2604.00009v1 Announce Type: cross Abstract: We present the design rationale, implementation attempt, and failure analysis of Eyla, a proposed identity-anc

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Can LLMs Perceive Time? An Empirical Investigation

arXiv:2604.00010v1 Announce Type: cross Abstract: Large language models cannot estimate how long their own tasks take. We investigate this limitation through fo

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Quantifying Gender Bias in Large Language Models: When ChatGPT Becomes a Hiring Manager

arXiv:2604.00011v1 Announce Type: cross Abstract: The growing prominence of large language models (LLMs) in daily life has heightened concerns that LLMs exhibit

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

Finding and Reactivating Post-Trained LLMs' Hidden Safety Mechanisms

arXiv:2604.00012v1 Announce Type: cross Abstract: Despite the impressive performance of general-purpose large language models (LLMs), they often require fine-tu

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago

MSA-Thinker: Discrimination-Calibration Reasoning with Hint-Guided Reinforcement Learning for Multimodal Sentiment Analysis

arXiv:2604.00013v1 Announce Type: cross Abstract: Multimodal sentiment analysis aims to understand human emotions by integrating textual, auditory, and visual m