📰 ArXiv cs.AI
Articles from ArXiv cs.AI · 1,258 articles · Updated every 3 hours
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Visuospatial Perspective Taking in Multimodal Language Models
arXiv:2603.23510v1 Announce Type: cross Abstract: As multimodal language models (MLMs) are increasingly used in social and collaborative settings, it is crucial
ArXiv cs.AI
📄 Paper
⚡ AI Lesson
5d ago
DISCO: Document Intelligence Suite for COmparative Evaluation
arXiv:2603.23511v1 Announce Type: cross Abstract: Document intelligence requires accurate text extraction and reliable reasoning over document content. We intro
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
S-Path-RAG: Semantic-Aware Shortest-Path Retrieval Augmented Generation for Multi-Hop Knowledge Graph Question Answering
arXiv:2603.23512v1 Announce Type: cross Abstract: We present S-Path-RAG, a semantic-aware shortest-path Retrieval-Augmented Generation framework designed to imp
ArXiv cs.AI
📄 Paper
⚡ AI Lesson
5d ago
Berta: an open-source, modular tool for AI-enabled clinical documentation
arXiv:2603.23513v1 Announce Type: cross Abstract: Commercial AI scribes cost $99-600 per physician per month, operate as opaque systems, and do not return data
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
DepthCharge: A Domain-Agnostic Framework for Measuring Depth-Dependent Knowledge in Large Language Models
arXiv:2603.23514v1 Announce Type: cross Abstract: Large Language Models appear competent when answering general questions but often fail when pushed into domain
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Training a Large Language Model for Medical Coding Using Privacy-Preserving Synthetic Clinical Data
arXiv:2603.23515v1 Announce Type: cross Abstract: Improving the accuracy and reliability of medical coding reduces clinician burnout and supports revenue cycle
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens
arXiv:2603.23516v1 Announce Type: cross Abstract: Long-term memory is a cornerstone of human intelligence. Enabling AI to process lifetime-scale information rem
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Beyond Accuracy: Introducing a Symbolic-Mechanistic Approach to Interpretable Evaluation
arXiv:2603.23517v1 Announce Type: cross Abstract: Accuracy-based evaluation cannot reliably distinguish genuine generalization from shortcuts like memorization,
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Cluster-R1: Large Reasoning Models Are Instruction-following Clustering Agents
arXiv:2603.23518v1 Announce Type: cross Abstract: General-purpose embedding models excel at recognizing semantic similarities but fail to capture the characteri
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
MedMT-Bench: Can LLMs Memorize and Understand Long Multi-Turn Conversations in Medical Scenarios?
arXiv:2603.23519v1 Announce Type: cross Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities across various specialist domains and h
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
From Physician Expertise to Clinical Agents: Preserving, Standardizing, and Scaling Physicians' Medical Expertise with Lightweight LLM
arXiv:2603.23520v1 Announce Type: cross Abstract: Medicine is an empirical discipline refined through long-term observation and the messy, high-variance reality
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Chitrakshara: A Large Multilingual Multimodal Dataset for Indian languages
arXiv:2603.23521v1 Announce Type: cross Abstract: Multimodal research has predominantly focused on single-image reasoning, with limited exploration of multi-ima
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Qworld: Question-Specific Evaluation Criteria for LLMs
arXiv:2603.23522v1 Announce Type: cross Abstract: Evaluating large language models (LLMs) on open-ended questions is difficult because response quality depends
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Navigating the Concept Space of Language Models
arXiv:2603.23524v1 Announce Type: cross Abstract: Sparse autoencoders (SAEs) trained on large language model activations output thousands of features that enabl
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Konkani LLM: Multi-Script Instruction Tuning and Evaluation for a Low-Resource Indian Language
arXiv:2603.23529v1 Announce Type: cross Abstract: Large Language Models (LLMs) consistently underperform in low-resource linguistic contexts such as Konkani. T
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Did You Forget What I Asked? Prospective Memory Failures in Large Language Models
arXiv:2603.23530v1 Announce Type: cross Abstract: Large language models often fail to satisfy formatting instructions when they must simultaneously perform dema
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Generating Hierarchical JSON Representations of Scientific Sentences Using LLMs
arXiv:2603.23532v1 Announce Type: cross Abstract: This paper investigates whether structured representations can preserve the meaning of scientific sentences. T
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
MDKeyChunker: Single-Call LLM Enrichment with Rolling Keys and Key-Based Restructuring for High-Accuracy RAG
arXiv:2603.23533v1 Announce Type: cross Abstract: RAG pipelines typically rely on fixed-size chunking, which ignores document structure, fragments semantic unit
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Large Language Models and Scientific Discourse: Where's the Intelligence?
arXiv:2603.23543v1 Announce Type: cross Abstract: We explore the capabilities of Large Language Models (LLMs) by comparing the way they gather data with the way
ArXiv cs.AI
📄 Paper
⚡ AI Lesson
5d ago
Mixture of Demonstrations for Textual Graph Understanding and Question Answering
arXiv:2603.23554v1 Announce Type: cross Abstract: Textual graph-based retrieval-augmented generation (GraphRAG) has emerged as a powerful paradigm for enhancing
ArXiv cs.AI
📄 Paper
⚡ AI Lesson
5d ago
Upper Entropy for 2-Monotone Lower Probabilities
arXiv:2603.23558v1 Announce Type: cross Abstract: Uncertainty quantification is a key aspect in many tasks such as model selection/regularization, or quantifyin
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
CAPTCHA Solving for Native GUI Agents: Automated Reasoning-Action Data Generation and Self-Corrective Training
arXiv:2603.23559v1 Announce Type: cross Abstract: GUI agents are rapidly shifting from multi-module pipelines to end-to-end, native vision-language models (VLMs
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Synthetic Mixed Training: Scaling Parametric Knowledge Acquisition Beyond RAG
arXiv:2603.23562v1 Announce Type: cross Abstract: Synthetic data augmentation helps language models learn new knowledge in data-constrained domains. However, na
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Safe Reinforcement Learning with Preference-based Constraint Inference
arXiv:2603.23565v1 Announce Type: cross Abstract: Safe reinforcement learning (RL) is a standard paradigm for safety-critical decision making. However, real-wor