AI News — Latest Developments & Breakthroughs

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Evaluating Chunking Strategies For Retrieval-Augmented Generation in Oil and Gas Enterprise Documents

arXiv:2603.24556v1 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) has emerged as a framework to address the constraints of Large Language M

ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 6d ago

LensWalk: Agentic Video Understanding by Planning How You See in Videos

arXiv:2603.24558v1 Announce Type: cross Abstract: The dense, temporal nature of video presents a profound challenge for automated analysis. Despite the use of p

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

The Free-Market Algorithm: Self-Organizing Optimization for Open-Ended Complex Systems

arXiv:2603.24559v1 Announce Type: cross Abstract: We introduce the Free-Market Algorithm (FMA), a novel metaheuristic inspired by free-market economics. Unlike

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Anti-I2V: Safeguarding your photos from malicious image-to-video generation

arXiv:2603.24570v1 Announce Type: cross Abstract: Advances in diffusion-based video generation models, while significantly improving human animation, pose thre

ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 6d ago

VFIG: Vectorizing Complex Figures in SVG with Vision-Language Models

arXiv:2603.24575v1 Announce Type: cross Abstract: Scalable Vector Graphics (SVG) are an essential format for technical illustration and digital design, offering

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Chameleon: Episodic Memory for Long-Horizon Robotic Manipulation

arXiv:2603.24576v1 Announce Type: cross Abstract: Robotic manipulation often requires memory: occlusion and state changes can make decision-time observations pe

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

EndoVGGT: GNN-Enhanced Depth Estimation for Surgical 3D Reconstruction

arXiv:2603.24577v1 Announce Type: cross Abstract: Accurate 3D reconstruction of deformable soft tissues is essential for surgical robotic perception. However, l

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Retrieval Improvements Do Not Guarantee Better Answers: A Study of RAG for AI Policy QA

arXiv:2603.24580v1 Announce Type: cross Abstract: Retrieval-augmented generation (RAG) systems are increasingly used to analyze complex policy documents, but ac

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Learning To Guide Human Decision Makers With Vision-Language Models

arXiv:2403.16501v4 Announce Type: replace Abstract: There is growing interest in AI systems that support human decision-making in high-stakes domains (e.g., med

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

The Collaboration Paradox: Why Generative AI Requires Both Strategic Intelligence and Operational Stability in Supply Chain Management

arXiv:2508.13942v2 Announce Type: replace Abstract: The rise of autonomous, AI-driven agents in economic settings raises critical questions about their emergent

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

From Guidelines to Guarantees: A Graph-Based Evaluation Harness for Domain-Specific Evaluation of LLMs

arXiv:2508.20810v2 Announce Type: replace Abstract: Rigorous evaluation of domain-specific language models requires benchmarks that are comprehensive, contamina

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

GeoSketch: A Neural-Symbolic Approach to Geometric Multimodal Reasoning with Auxiliary Line Construction and Affine Transformation

arXiv:2509.22460v3 Announce Type: replace Abstract: Geometric Problem Solving (GPS) poses a unique challenge for Multimodal Large Language Models (MLLMs), requi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

SAG-Agent: Enabling Long-Horizon Reasoning in Strategy Games via Dynamic Knowledge Graphs

arXiv:2510.15259v3 Announce Type: replace Abstract: Most commodity software lacks accessible Application Programming Interfaces (APIs), requiring autonomous age

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 6d ago

CastMind: An Interaction-Driven Agentic Reasoning Framework for Cognition-Inspired Time Series Forecasting

arXiv:2511.08947v3 Announce Type: replace Abstract: Time series forecasting plays a crucial role in decision-making across many real-world applications. Despite

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 6d ago

Pharos-ESG: A Framework for Multimodal Parsing, Contextual Narration, and Hierarchical Labeling of ESG Report

arXiv:2511.16417v2 Announce Type: replace Abstract: Environmental, Social, and Governance (ESG) principles are reshaping the foundations of global financial gov

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Generative Adversarial Reasoner: Enhancing LLM Reasoning with Adversarial Reinforcement Learning

arXiv:2512.16917v3 Announce Type: replace Abstract: Large language models (LLMs) with explicit reasoning capabilities excel at mathematical reasoning yet still

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Toward Ultra-Long-Horizon Agentic Science: Cognitive Accumulation for Machine Learning Engineering

arXiv:2601.10402v5 Announce Type: replace Abstract: The advancement of artificial intelligence toward agentic science is currently bottlenecked by the challenge

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Are LLMs Smarter Than Chimpanzees? An Evaluation on Perspective Taking and Knowledge State Estimation

arXiv:2601.12410v2 Announce Type: replace Abstract: Cognitive anthropology suggests that the distinction of human intelligence lies in the ability to infer othe

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

CollectiveKV: Decoupling and Sharing Collaborative Information in Sequential Recommendation

arXiv:2601.19178v2 Announce Type: replace Abstract: Sequential recommendation models are widely used in applications, yet they face stringent latency requiremen

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 6d ago

CIRCLE: A Framework for Evaluating AI from a Real-World Lens

arXiv:2602.24055v4 Announce Type: replace Abstract: This paper proposes CIRCLE, a six-stage, lifecycle-based framework to bridge the reality gap between model-c

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Agentified Assessment of Logical Reasoning Agents

arXiv:2603.02788v3 Announce Type: replace Abstract: We present a framework for evaluating and benchmarking logical reasoning agents when assessment itself must

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

TikZilla: Scaling Text-to-TikZ with High-Quality Data and Reinforcement Learning

arXiv:2603.03072v2 Announce Type: replace Abstract: Large language models (LLMs) are increasingly used to assist scientists across diverse workflows. A key chal

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

GPT4o-Receipt: A Dataset and Human Study for AI-Generated Document Forensics

arXiv:2603.11442v2 Announce Type: replace Abstract: Can humans detect AI-generated financial documents better than machines? We present GPT4o-Receipt, a benchma

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Relationship-Aware Safety Unlearning for Multimodal LLMs

arXiv:2603.14185v3 Announce Type: replace Abstract: Generative multimodal models can exhibit safety failures that are inherently relational: two benign concepts
