AI News — Latest Developments & Breakthroughs

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

Visuospatial Perspective Taking in Multimodal Language Models

arXiv:2603.23510v1 Announce Type: cross Abstract: As multimodal language models (MLMs) are increasingly used in social and collaborative settings, it is crucial

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 5d ago

DISCO: Document Intelligence Suite for COmparative Evaluation

arXiv:2603.23511v1 Announce Type: cross Abstract: Document intelligence requires accurate text extraction and reliable reasoning over document content. We intro

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

S-Path-RAG: Semantic-Aware Shortest-Path Retrieval Augmented Generation for Multi-Hop Knowledge Graph Question Answering

arXiv:2603.23512v1 Announce Type: cross Abstract: We present S-Path-RAG, a semantic-aware shortest-path Retrieval-Augmented Generation framework designed to imp

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 5d ago

Berta: an open-source, modular tool for AI-enabled clinical documentation

arXiv:2603.23513v1 Announce Type: cross Abstract: Commercial AI scribes cost $99-600 per physician per month, operate as opaque systems, and do not return data

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

DepthCharge: A Domain-Agnostic Framework for Measuring Depth-Dependent Knowledge in Large Language Models

arXiv:2603.23514v1 Announce Type: cross Abstract: Large Language Models appear competent when answering general questions but often fail when pushed into domain

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

Training a Large Language Model for Medical Coding Using Privacy-Preserving Synthetic Clinical Data

arXiv:2603.23515v1 Announce Type: cross Abstract: Improving the accuracy and reliability of medical coding reduces clinician burnout and supports revenue cycle

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens

arXiv:2603.23516v1 Announce Type: cross Abstract: Long-term memory is a cornerstone of human intelligence. Enabling AI to process lifetime-scale information rem

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

Beyond Accuracy: Introducing a Symbolic-Mechanistic Approach to Interpretable Evaluation

arXiv:2603.23517v1 Announce Type: cross Abstract: Accuracy-based evaluation cannot reliably distinguish genuine generalization from shortcuts like memorization,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

Cluster-R1: Large Reasoning Models Are Instruction-following Clustering Agents

arXiv:2603.23518v1 Announce Type: cross Abstract: General-purpose embedding models excel at recognizing semantic similarities but fail to capture the characteri

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

MedMT-Bench: Can LLMs Memorize and Understand Long Multi-Turn Conversations in Medical Scenarios?

arXiv:2603.23519v1 Announce Type: cross Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities across various specialist domains and h

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

From Physician Expertise to Clinical Agents: Preserving, Standardizing, and Scaling Physicians' Medical Expertise with Lightweight LLM

arXiv:2603.23520v1 Announce Type: cross Abstract: Medicine is an empirical discipline refined through long-term observation and the messy, high-variance reality

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

Chitrakshara: A Large Multilingual Multimodal Dataset for Indian languages

arXiv:2603.23521v1 Announce Type: cross Abstract: Multimodal research has predominantly focused on single-image reasoning, with limited exploration of multi-ima

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

Qworld: Question-Specific Evaluation Criteria for LLMs

arXiv:2603.23522v1 Announce Type: cross Abstract: Evaluating large language models (LLMs) on open-ended questions is difficult because response quality depends

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

Navigating the Concept Space of Language Models

arXiv:2603.23524v1 Announce Type: cross Abstract: Sparse autoencoders (SAEs) trained on large language model activations output thousands of features that enabl

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

Konkani LLM: Multi-Script Instruction Tuning and Evaluation for a Low-Resource Indian Language

arXiv:2603.23529v1 Announce Type: cross Abstract: Large Language Models (LLMs) consistently underperform in low-resource linguistic contexts such as Konkani. T

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

Did You Forget What I Asked? Prospective Memory Failures in Large Language Models

arXiv:2603.23530v1 Announce Type: cross Abstract: Large language models often fail to satisfy formatting instructions when they must simultaneously perform dema

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

Generating Hierarchical JSON Representations of Scientific Sentences Using LLMs

arXiv:2603.23532v1 Announce Type: cross Abstract: This paper investigates whether structured representations can preserve the meaning of scientific sentences. T

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

MDKeyChunker: Single-Call LLM Enrichment with Rolling Keys and Key-Based Restructuring for High-Accuracy RAG

arXiv:2603.23533v1 Announce Type: cross Abstract: RAG pipelines typically rely on fixed-size chunking, which ignores document structure, fragments semantic unit

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

Large Language Models and Scientific Discourse: Where's the Intelligence?

arXiv:2603.23543v1 Announce Type: cross Abstract: We explore the capabilities of Large Language Models (LLMs) by comparing the way they gather data with the way

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 5d ago

Mixture of Demonstrations for Textual Graph Understanding and Question Answering

arXiv:2603.23554v1 Announce Type: cross Abstract: Textual graph-based retrieval-augmented generation (GraphRAG) has emerged as a powerful paradigm for enhancing

ArXiv cs.AI 📄 Paper ⚡ AI Lesson 5d ago

Upper Entropy for 2-Monotone Lower Probabilities

arXiv:2603.23558v1 Announce Type: cross Abstract: Uncertainty quantification is a key aspect in many tasks such as model selection/regularization, or quantifyin

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

CAPTCHA Solving for Native GUI Agents: Automated Reasoning-Action Data Generation and Self-Corrective Training

arXiv:2603.23559v1 Announce Type: cross Abstract: GUI agents are rapidly shifting from multi-module pipelines to end-to-end, native vision-language models (VLMs

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

Synthetic Mixed Training: Scaling Parametric Knowledge Acquisition Beyond RAG

arXiv:2603.23562v1 Announce Type: cross Abstract: Synthetic data augmentation helps language models learn new knowledge in data-constrained domains. However, na

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 5d ago

Safe Reinforcement Learning with Preference-based Constraint Inference

arXiv:2603.23565v1 Announce Type: cross Abstract: Safe reinforcement learning (RL) is a standard paradigm for safety-critical decision making. However, real-wor
