📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 5,060 articles · Updated every 3 hours


Identity-Aware U-Net: Fine-grained Cell Segmentation via Identity-Aware Representation Learning

arXiv:2604.09702v1 Announce Type: cross Abstract: Precise segmentation of objects with highly similar shapes remains a challenging problem in dense prediction,

ArXiv cs.AI 📄 Paper 4d ago

The Deployment Gap in AI Media Detection: Platform-Aware and Visually Constrained Adversarial Evaluation

arXiv:2604.09706v1 Announce Type: cross Abstract: Recent AI media detectors report near-perfect performance under clean laboratory evaluation, yet their robustn

ArXiv cs.AI 📄 Paper 4d ago

Orthogonal Quadratic Complements for Vision Transformer Feed-Forward Networks

arXiv:2604.09709v1 Announce Type: cross Abstract: Recent bilinear feed-forward replacements for vision transformers can substantially improve accuracy, but they

ArXiv cs.AI 📄 Paper 4d ago

LAST: Leveraging Tools as Hints to Enhance Spatial Reasoning for Multimodal Large Language Models

arXiv:2604.09712v1 Announce Type: cross Abstract: Spatial reasoning is a cornerstone capability for intelligent systems to perceive and interact with the physic

ArXiv cs.AI 📄 Paper 4d ago

Training Deep Visual Networks Beyond Loss and Accuracy Through a Dynamical Systems Approach

arXiv:2604.09716v1 Announce Type: cross Abstract: Deep visual recognition models are usually trained and evaluated using metrics such as loss and accuracy. Whil

ArXiv cs.AI 📄 Paper 4d ago

ConfigSpec: Profiling-Based Configuration Selection for Distributed Edge–Cloud Speculative LLM Serving

arXiv:2604.09722v1 Announce Type: cross Abstract: Speculative decoding enables collaborative Large Language Model (LLM) inference across cloud and edge by separ

ArXiv cs.AI 📄 Paper 4d ago

LOLGORITHM: Funny Comment Generation Agent For Short Videos

arXiv:2604.09729v1 Announce Type: cross Abstract: Short-form video platforms have become central to multimedia information dissemination, where comments play a

ArXiv cs.AI 📄 Paper 4d ago

SMART: When is it Actually Worth Expanding a Speculative Tree?

arXiv:2604.09731v1 Announce Type: cross Abstract: Tree-based speculative decoding accelerates autoregressive generation by verifying a branching tree of draft t

ArXiv cs.AI 📄 Paper 4d ago

Multi-Frequency Local Plasticity for Visual Representation Learning

arXiv:2604.09734v1 Announce Type: cross Abstract: We study how far structured architectural bias can compensate for the absence of end-to-end gradient-based rep

ArXiv cs.AI 📄 Paper 4d ago

STaR-DRO: Stateful Tsallis Reweighting for Group-Robust Structured Prediction

arXiv:2604.09737v1 Announce Type: cross Abstract: Structured prediction requires models to generate ontology-constrained labels, grounded evidence, and valid st

ArXiv cs.AI 📄 Paper 4d ago

ExecTune: Effective Steering of Black-Box LLMs with Guide Models

arXiv:2604.09741v1 Announce Type: cross Abstract: For large language models deployed through black-box APIs, recurring inference costs often exceed one-time tra

ArXiv cs.AI 📄 Paper 4d ago

MPAC: A Multi-Principal Agent Coordination Protocol for Interoperable Multi-Agent Collaboration

arXiv:2604.09744v1 Announce Type: cross Abstract: The AI agent ecosystem has converged on two protocols: the Model Context Protocol (MCP) for tool invocation an

ArXiv cs.AI 📄 Paper 4d ago

CONSCIENTIA: Can LLM Agents Learn to Strategize? Emergent Deception and Trust in a Multi-Agent NYC Simulation

arXiv:2604.09746v1 Announce Type: cross Abstract: As large language models (LLMs) are increasingly deployed as autonomous agents, understanding how strategic be

ArXiv cs.AI 📄 Paper 4d ago

ADAM: A Systematic Data Extraction Attack on Agent Memory via Adaptive Querying

arXiv:2604.09747v1 Announce Type: cross Abstract: Large Language Model (LLM) agents have achieved rapid adoption and demonstrated remarkable capabilities across

ArXiv cs.AI 📄 Paper 4d ago

Backdoors in RLVR: Jailbreak Backdoors in LLMs From Verifiable Reward

arXiv:2604.09748v1 Announce Type: cross Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) is an emerging paradigm that significantly boosts a Larg

ArXiv cs.AI 📄 Paper 4d ago

Conflicts Make Large Reasoning Models Vulnerable to Attacks

arXiv:2604.09750v1 Announce Type: cross Abstract: Large Reasoning Models (LRMs) have achieved remarkable performance across diverse domains, yet their decision-

ArXiv cs.AI 📄 Paper 4d ago

A-IO: Adaptive Inference Orchestration for Memory-Bound NPUs

arXiv:2604.09752v1 Announce Type: cross Abstract: During the deployment of Large Language Models (LLMs), the autoregressive decoding phase on heterogeneous NPU

ArXiv cs.AI 📄 Paper 4d ago

MedLVR: Latent Visual Reasoning for Reliable Medical Visual Question Answering

arXiv:2604.09757v1 Announce Type: cross Abstract: Medical vision–language models (VLMs) have shown strong potential for medical visual question answering (VQA)

ArXiv cs.AI 📄 Paper 4d ago

GIANTS: Generative Insight Anticipation from Scientific Literature

arXiv:2604.09793v1 Announce Type: cross Abstract: Scientific breakthroughs often emerge from synthesizing prior ideas into novel contributions. While language m

ArXiv cs.AI 📄 Paper 4d ago

Explainable Human Activity Recognition: A Unified Review of Concepts and Mechanisms

arXiv:2604.09799v1 Announce Type: cross Abstract: Human activity recognition (HAR) has become a key component of intelligent systems for healthcare monitoring,

ArXiv cs.AI 📄 Paper 4d ago

ACCIDENT: A Benchmark Dataset for Vehicle Accident Detection from Traffic Surveillance Videos

arXiv:2604.09819v1 Announce Type: cross Abstract: We introduce ACCIDENT, a benchmark dataset for traffic accident detection in CCTV footage, designed to evaluat

ArXiv cs.AI 📄 Paper 4d ago

F3G-Avatar: Face Focused Full-body Gaussian Avatar

arXiv:2604.09835v1 Announce Type: cross Abstract: Existing full-body Gaussian avatar methods primarily optimize global reconstruction quality and often fail to

ArXiv cs.AI 📄 Paper 4d ago

Is There Knowledge Left to Extract? Evidence of Fragility in Medically Fine-Tuned Vision-Language Models

arXiv:2604.09841v1 Announce Type: cross Abstract: Vision-language models (VLMs) are increasingly adapted through domain-specific fine-tuning, yet it remains unc

ArXiv cs.AI 📄 Paper 4d ago

RoboLab: A High-Fidelity Simulation Benchmark for Analysis of Task Generalist Policies

arXiv:2604.09860v1 Announce Type: cross Abstract: The pursuit of general-purpose robotics has yielded impressive foundation models, yet simulation-based benchma