AI News — Latest Developments & Breakthroughs

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Biased Error Attribution in Multi-Agent Human-AI Systems Under Delayed Feedback

arXiv:2603.23419v1 Announce Type: cross Abstract: Human decision-making is strongly influenced by cognitive biases, particularly under conditions of uncertainty

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 6d ago

Targeted Adversarial Traffic Generation: Black-box Approach to Evade Intrusion Detection Systems in IoT Networks

arXiv:2603.23438v1 Announce Type: cross Abstract: The integration of machine learning (ML) algorithms into Internet of Things (IoT) applications has introduced

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Evaluating LLM-Based Test Generation Under Software Evolution

arXiv:2603.23443v1 Announce Type: cross Abstract: Large Language Models (LLMs) are increasingly used for automated unit test generation. However, it remains unc

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

3DCity-LLM: Empowering Multi-modality Large Language Models for 3D City-scale Perception and Understanding

arXiv:2603.23447v1 Announce Type: cross Abstract: While multi-modality large language models excel in object-centric or indoor scenarios, scaling them to 3D cit

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 6d ago

Code Review Agent Benchmark

arXiv:2603.23448v1 Announce Type: cross Abstract: Software engineering agents have shown significant promise in writing code. As AI agents permeate code writing

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 6d ago

InverFill: One-Step Inversion for Enhanced Few-Step Diffusion Inpainting

arXiv:2603.23463v1 Announce Type: cross Abstract: Recent diffusion-based models achieve photorealism in image inpainting but require many sampling steps, limiti

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

VTAM: Video-Tactile-Action Models for Complex Physical Interaction Beyond VLAs

arXiv:2603.23481v1 Announce Type: cross Abstract: Video-Action Models (VAMs) have emerged as a promising framework for embodied intelligence, learning implicit

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

ReqFusion: A Multi-Provider Framework for Automated PEGS Analysis Across Software Domains

arXiv:2603.23482v1 Announce Type: cross Abstract: Requirements engineering is a vital, yet labor-intensive, stage in the software development process. This arti

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Failure of contextual invariance in gender inference with large language models

arXiv:2603.23485v1 Announce Type: cross Abstract: Standard evaluation practices assume that large language model (LLM) outputs are stable under contextually equ

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

VISion On Request: Enhanced VLLM efficiency with sparse, dynamically selected, vision-language interactions

arXiv:2603.23495v1 Announce Type: cross Abstract: Existing approaches for improving the efficiency of Large Vision-Language Models (LVLMs) are largely based on

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

MedObvious: Exposing the Medical Moravec's Paradox in VLMs via Clinical Triage

arXiv:2603.23501v1 Announce Type: cross Abstract: Vision Language Models (VLMs) are increasingly used for tasks like medical report generation and visual questi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

An Accurate and Interpretable Framework for Trustworthy Process Monitoring

arXiv:2302.10426v3 Announce Type: replace Abstract: Trustworthy process monitoring seeks to build an accurate and interpretable monitoring framework, which is c

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 6d ago

RealCQA-V2: A Diagnostic Benchmark for Structured Visual Entailment over Scientific Charts

arXiv:2410.22492v3 Announce Type: replace Abstract: Multimodal reasoning models often produce fluent answers supported by seemingly coherent rationales. Existin

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 6d ago

Toward Data Systems That Are Business Semantic Centric and AI Agents Assisted

arXiv:2506.05520v3 Announce Type: replace Abstract: Contemporary businesses operate in dynamic environments requiring rapid adaptation to achieve goals and main

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Towards Self-Evolving Benchmarks: Synthesizing Agent Trajectories via Test-Time Exploration under Validate-by-Reproduce Paradigm

arXiv:2510.00415v3 Announce Type: replace Abstract: Recent advances in large language models (LLMs) and agent system designs have empowered agents with unpreced

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynamic Interactions

arXiv:2510.05318v3 Announce Type: replace Abstract: Large language models (LLMs) have demonstrated remarkable performance on single-turn text-to-SQL tasks, but

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

BuilderBench: The Building Blocks of Intelligent Agents

arXiv:2510.06288v3 Announce Type: replace Abstract: Today's AI models learn primarily through mimicry and refining, so it is not surprising that they struggle t

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 6d ago

Operational machine learning for remote spectroscopic detection of CH₄ point sources

arXiv:2511.07719v2 Announce Type: replace Abstract: Mitigating anthropogenic methane sources is one of the most cost-effective levers to slow down global warmin

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Hybrid Stackelberg Game and Diffusion-based Auction for Two-tier Agentic AI Task Offloading in Internet of Agents

arXiv:2511.22076v2 Announce Type: replace Abstract: The Internet of Agents (IoA) is rapidly gaining prominence as a foundational architecture for interconnected

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

DriveSafe: A Hierarchical Risk Taxonomy for Safety-Critical LLM-Based Driving Assistants

arXiv:2601.12138v3 Announce Type: replace Abstract: Large Language Models (LLMs) are increasingly integrated into vehicle-based digital assistants, where unsafe

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Rethinking the Role of Entropy in Optimizing Tool-Use Behaviors for Large Language Model Agents

arXiv:2602.02050v3 Announce Type: replace Abstract: Tool-using agents based on Large Language Models (LLMs) excel in tasks such as mathematical reasoning and mu

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Obscure but Effective: Classical Chinese Jailbreak Prompt Optimization via Bio-Inspired Search

arXiv:2602.22983v3 Announce Type: replace Abstract: As Large Language Models (LLMs) are increasingly used, their security risks have drawn increasing attention.

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

CXReasonAgent: Evidence-Grounded Diagnostic Reasoning Agent for Chest X-rays

arXiv:2602.23276v2 Announce Type: replace Abstract: Chest X-ray plays a central role in thoracic diagnosis, and its interpretation inherently requires multi-ste

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Agentic AI-based Coverage Closure for Formal Verification

arXiv:2603.03147v2 Announce Type: replace Abstract: Coverage closure is a critical requirement in Integrated Chip (IC) development process and key metric for ve
