AI News — Latest Developments & Breakthroughs

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Chain-of-Adaptation: Surgical Vision-Language Adaptation with Reinforcement Learning

arXiv:2603.20116v1 Announce Type: cross Abstract: Conventional fine-tuning on domain-specific datasets can inadvertently alter a model's pretrained multimodal p

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Evolving Jailbreaks: Automated Multi-Objective Long-Tail Attacks on Large Language Models

arXiv:2603.20122v1 Announce Type: cross Abstract: Large Language Models (LLMs) have been widely deployed, especially through free Web-based applications that ex

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1w ago

An Agentic Multi-Agent Architecture for Cybersecurity Risk Management

arXiv:2603.20131v1 Announce Type: cross Abstract: Getting a real cybersecurity risk assessment for a small organization is expensive -- a NIST CSF-aligned engag

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1w ago

Enhancing Hyperspace Analogue to Language (HAL) Representations via Attention-Based Pooling for Text Classification

arXiv:2603.20149v1 Announce Type: cross Abstract: The Hyperspace Analogue to Language (HAL) model relies on global word co-occurrence matrices to construct dist

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1w ago

Design-OS: A Specification-Driven Framework for Engineering System Design with a Control-Systems Design Case

arXiv:2603.20151v1 Announce Type: cross Abstract: Engineering system design -- whether mechatronic, control, or embedded -- often proceeds in an ad hoc manner,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Semantic Token Clustering for Efficient Uncertainty Quantification in Large Language Models

arXiv:2603.20161v1 Announce Type: cross Abstract: Large language models (LLMs) have demonstrated remarkable capabilities across diverse tasks. However, the trut

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

The Robot's Inner Critic: Self-Refinement of Social Behaviors through VLM-based Replanning

arXiv:2603.20164v1 Announce Type: cross Abstract: Conventional robot social behavior generation has been limited in flexibility and autonomy, relying on predefi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Measuring Faithfulness Depends on How You Measure: Classifier Sensitivity in LLM Chain-of-Thought Evaluation

arXiv:2603.20172v1 Announce Type: cross Abstract: Recent work on chain-of-thought (CoT) faithfulness reports single aggregate numbers (e.g., DeepSeek-R1 acknowl

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

AI Agents Can Already Autonomously Perform Experimental High Energy Physics

arXiv:2603.20179v1 Announce Type: cross Abstract: Large language model-based AI agents are now able to autonomously execute substantial portions of a high energ

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Adaptive Greedy Frame Selection for Long Video Understanding

arXiv:2603.20180v1 Announce Type: cross Abstract: Large vision-language models (VLMs) are increasingly applied to long-video question answering, yet inference

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Improving Generalization on Cybersecurity Tasks with Multi-Modal Contrastive Learning

arXiv:2603.20181v1 Announce Type: cross Abstract: The use of ML in cybersecurity has long been impaired by generalization issues: Models that work well in contr

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

VideoSeek: Long-Horizon Video Agent with Tool-Guided Seeking

arXiv:2603.20185v1 Announce Type: cross Abstract: Video agentic models have advanced challenging video-language tasks. However, most agentic approaches still he

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1w ago

LumosX: Relate Any Identities with Their Attributes for Personalized Video Generation

arXiv:2603.20192v1 Announce Type: cross Abstract: Recent advances in diffusion models have significantly improved text-to-video generation, enabling personalize

ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 1w ago

From Masks to Pixels and Meaning: A New Taxonomy, Benchmark, and Metrics for VLM Image Tampering

arXiv:2603.20193v1 Announce Type: cross Abstract: Existing tampering detection benchmarks largely rely on object masks, which severely misalign with the true ed

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

HPS: Hard Preference Sampling for Human Preference Alignment

arXiv:2502.14400v5 Announce Type: replace Abstract: Aligning Large Language Model (LLM) responses with human preferences is vital for building safe and controll

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Average Reward Reinforcement Learning for Omega-Regular and Mean-Payoff Objectives

arXiv:2505.15693v3 Announce Type: replace Abstract: Recent advances in reinforcement learning (RL) have renewed interest in reward design for shaping agent beha

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Preference-Driven Multi-Objective Combinatorial Optimization with Conditional Computation

arXiv:2506.08898v4 Announce Type: replace Abstract: Recent deep reinforcement learning methods have achieved remarkable success in solving multi-objective combi

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1w ago

Multimodal Fused Learning for Solving the Generalized Traveling Salesman Problem in Robotic Task Planning

arXiv:2506.16931v3 Announce Type: replace Abstract: Effective and efficient task planning is essential for mobile robots, especially in applications like wareho

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Improved Generalized Planning with LLMs through Strategy Refinement and Reflection

arXiv:2508.13876v2 Announce Type: replace Abstract: LLMs have recently been used to generate Python programs representing generalized plans in PDDL planning, i.

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Evaluation-Aware Reinforcement Learning

arXiv:2509.19464v3 Announce Type: replace Abstract: Policy evaluation is a core component of many reinforcement learning (RL) algorithms and a critical tool for

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark

arXiv:2509.24897v2 Announce Type: replace Abstract: The integration of visual understanding and generation into unified multimodal models represents a significa

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

PDDL Axioms Are Equivalent to Least Fixed Point Logic (Extended Version)

arXiv:2510.14412v2 Announce Type: replace Abstract: Axioms are a feature of the Planning Domain Definition Language PDDL that can be considered as a generalizat

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

DAPS++: Rethinking Diffusion Inverse Problems with Decoupled Posterior Annealing

arXiv:2511.17038v2 Announce Type: replace Abstract: From a Bayesian perspective, score-based diffusion solves inverse problems through joint inference, embeddin

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 1w ago

VIRO: Robust and Efficient Neuro-Symbolic Reasoning with Verification for Referring Expression Comprehension

arXiv:2601.12781v2 Announce Type: replace Abstract: Referring Expression Comprehension (REC) aims to localize the image region corresponding to a natural langua
