Core AI

Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

24,875 lessons

Skills in this topic

5 skills

LLM Foundations

Explain how transformers generate text

Write zero-shot and few-shot prompts

LLM Engineering

Call LLM APIs with function/tool use

Fine-tuning LLMs

Prepare fine-tuning datasets

Multimodal LLMs

Use GPT-4V / Claude Vision for image understanding

Videos 19,455 Reads 5,420

Showing 5,420 reads from curated sources


ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

LLMs Do Not Grade Essays Like Humans

arXiv:2603.23714v1 Announce Type: new Abstract: Large language models have recently been proposed as tools for automated essay scoring, but their agreement with

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Learning-guided Prioritized Planning for Lifelong Multi-Agent Path Finding in Warehouse Automation

arXiv:2603.23838v1 Announce Type: new Abstract: Lifelong Multi-Agent Path Finding (MAPF) is critical for modern warehouse automation, which requires multiple ro

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

VehicleMemBench: An Executable Benchmark for Multi-User Long-Term Memory in In-Vehicle Agents

arXiv:2603.23840v1 Announce Type: new Abstract: With the growing demand for intelligent in-vehicle experiences, vehicle-based agents are evolving from simple as

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

SCoOP: Semantic Consistent Opinion Pooling for Uncertainty Quantification in Multiple Vision-Language Model Systems

arXiv:2603.23853v1 Announce Type: new Abstract: Combining multiple Vision-Language Models (VLMs) can enhance multimodal reasoning and robustness, but aggregatin

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

When AI output tips to bad but nobody notices: Legal implications of AI's mistakes

arXiv:2603.23857v1 Announce Type: new Abstract: The adoption of generative AI across commercial and legal professions offers dramatic efficiency gains -- yet fo

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

The DeepXube Software Package for Solving Pathfinding Problems with Learned Heuristic Functions and Search

arXiv:2603.23873v1 Announce Type: new Abstract: DeepXube is a free and open-source Python package and command-line tool that seeks to automate the solution of p

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

DUPLEX: Agentic Dual-System Planning via LLM-Driven Information Extraction

arXiv:2603.23909v1 Announce Type: new Abstract: While Large Language Models (LLMs) provide semantic flexibility for robotic task planning, their susceptibility

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

AnalogAgent: Self-Improving Analog Circuit Design Automation with LLM Agents

arXiv:2603.23910v1 Announce Type: new Abstract: Recent advances in large language models (LLMs) suggest strong potential for automating analog circuit design. Y

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

From Pixels to Digital Agents: An Empirical Study on the Taxonomy and Technological Trends of Reinforcement Learning Environments

arXiv:2603.23964v1 Announce Type: new Abstract: The remarkable progress of reinforcement learning (RL) is intrinsically tied to the environments used to train a

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Language-Grounded Multi-Agent Planning for Personalized and Fair Participatory Urban Sensing

arXiv:2603.24014v1 Announce Type: new Abstract: Participatory urban sensing leverages human mobility for large-scale urban data collection, yet existing methods

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

ELITE: Experiential Learning and Intent-Aware Transfer for Self-improving Embodied Agents

arXiv:2603.24018v1 Announce Type: new Abstract: Vision-language models (VLMs) have shown remarkable general capabilities, yet embodied agents built on them fail

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Enhanced Mycelium of Thought (EMoT): A Bio-Inspired Hierarchical Reasoning Architecture with Strategic Dormancy and Mnemonic Encoding

arXiv:2603.24065v1 Announce Type: new Abstract: Current prompting paradigms for large language models (LLMs), including Chain-of-Thought (CoT) and Tree-of-Thoug

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Bridging the Evaluation Gap: Standardized Benchmarks for Multi-Objective Search

arXiv:2603.24084v1 Announce Type: new Abstract: Empirical evaluation in multi-objective search (MOS) has historically suffered from fragmentation, relying on he

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

AI-Supervisor: Autonomous AI Research Supervision via a Persistent Research World Model

arXiv:2603.24402v2 Announce Type: new Abstract: Existing automated research systems operate as stateless, linear pipelines -- generating outputs without maintai

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Multi-Agent Reasoning with Consistency Verification Improves Uncertainty Calibration in Medical MCQA

arXiv:2603.24481v1 Announce Type: new Abstract: Miscalibrated confidence scores are a practical obstacle to deploying AI in clinical settings. A model that is a

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

From Liar Paradox to Incongruent Sets: A Normal Form for Self-Reference

arXiv:2603.24527v1 Announce Type: new Abstract: We introduce incongruent normal form (INF), a structural representation for self-referential semantic sentences.

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Completeness of Unbounded Best-First Minimax and Descent Minimax

arXiv:2603.24572v1 Announce Type: new Abstract: In this article, we focus on search algorithms for two-player perfect information games, whose objective is to d

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

The Stochastic Gap: A Markovian Framework for Pre-Deployment Reliability and Oversight-Cost Auditing in Agentic Artificial Intelligence

arXiv:2603.24582v1 Announce Type: new Abstract: Agentic artificial intelligence (AI) in organizations is a sequential decision problem constrained by reliabilit

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Inspection and Control of Self-Generated-Text Recognition Ability in Llama3-8b-Instruct

arXiv:2410.02064v3 Announce Type: cross Abstract: It has been reported that LLMs can recognize their own writing. As this has potential implications for AI safe

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Mitigating Many-Shot Jailbreaking

arXiv:2504.09604v3 Announce Type: cross Abstract: Many-shot jailbreaking (MSJ) is an adversarial technique that exploits the long context windows of modern LLMs

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Evidence for Limited Metacognition in LLMs

arXiv:2509.21545v2 Announce Type: cross Abstract: The possibility of LLM self-awareness and even sentience is gaining increasing public attention and has major

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Leveraging Computerized Adaptive Testing for Cost-effective Evaluation of Large Language Models in Medical Benchmarking

arXiv:2603.23506v1 Announce Type: cross Abstract: The rapid proliferation of large language models (LLMs) in healthcare creates an urgent need for scalable and

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Beyond Masks: Efficient, Flexible Diffusion Language Models via Deletion-Insertion Processes

arXiv:2603.23507v1 Announce Type: cross Abstract: While Masked Diffusion Language Models (MDLMs) relying on token masking and unmasking have shown promise in la

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Internal Safety Collapse in Frontier Large Language Models

arXiv:2603.23509v1 Announce Type: cross Abstract: This work identifies a critical failure mode in frontier large language models (LLMs), which we term Internal

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Visuospatial Perspective Taking in Multimodal Language Models

arXiv:2603.23510v1 Announce Type: cross Abstract: As multimodal language models (MLMs) are increasingly used in social and collaborative settings, it is crucial

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

S-Path-RAG: Semantic-Aware Shortest-Path Retrieval Augmented Generation for Multi-Hop Knowledge Graph Question Answering

arXiv:2603.23512v1 Announce Type: cross Abstract: We present S-Path-RAG, a semantic-aware shortest-path Retrieval-Augmented Generation framework designed to imp

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

DepthCharge: A Domain-Agnostic Framework for Measuring Depth-Dependent Knowledge in Large Language Models

arXiv:2603.23514v1 Announce Type: cross Abstract: Large Language Models appear competent when answering general questions but often fail when pushed into domain

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Training a Large Language Model for Medical Coding Using Privacy-Preserving Synthetic Clinical Data

arXiv:2603.23515v1 Announce Type: cross Abstract: Improving the accuracy and reliability of medical coding reduces clinician burnout and supports revenue cycle

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens

arXiv:2603.23516v1 Announce Type: cross Abstract: Long-term memory is a cornerstone of human intelligence. Enabling AI to process lifetime-scale information rem

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Beyond Accuracy: Introducing a Symbolic-Mechanistic Approach to Interpretable Evaluation

arXiv:2603.23517v1 Announce Type: cross Abstract: Accuracy-based evaluation cannot reliably distinguish genuine generalization from shortcuts like memorization,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Cluster-R1: Large Reasoning Models Are Instruction-following Clustering Agents

arXiv:2603.23518v1 Announce Type: cross Abstract: General-purpose embedding models excel at recognizing semantic similarities but fail to capture the characteri

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

MedMT-Bench: Can LLMs Memorize and Understand Long Multi-Turn Conversations in Medical Scenarios?

arXiv:2603.23519v1 Announce Type: cross Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities across various specialist domains and h

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

From Physician Expertise to Clinical Agents: Preserving, Standardizing, and Scaling Physicians' Medical Expertise with Lightweight LLM

arXiv:2603.23520v1 Announce Type: cross Abstract: Medicine is an empirical discipline refined through long-term observation and the messy, high-variance reality

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Chitrakshara: A Large Multilingual Multimodal Dataset for Indian languages

arXiv:2603.23521v1 Announce Type: cross Abstract: Multimodal research has predominantly focused on single-image reasoning, with limited exploration of multi-ima

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Qworld: Question-Specific Evaluation Criteria for LLMs

arXiv:2603.23522v1 Announce Type: cross Abstract: Evaluating large language models (LLMs) on open-ended questions is difficult because response quality depends

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Navigating the Concept Space of Language Models

arXiv:2603.23524v1 Announce Type: cross Abstract: Sparse autoencoders (SAEs) trained on large language model activations output thousands of features that enabl

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Konkani LLM: Multi-Script Instruction Tuning and Evaluation for a Low-Resource Indian Language

arXiv:2603.23529v1 Announce Type: cross Abstract: Large Language Models (LLMs) consistently underperform in low-resource linguistic contexts such as Konkani. T

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Did You Forget What I Asked? Prospective Memory Failures in Large Language Models

arXiv:2603.23530v1 Announce Type: cross Abstract: Large language models often fail to satisfy formatting instructions when they must simultaneously perform dema

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Generating Hierarchical JSON Representations of Scientific Sentences Using LLMs

arXiv:2603.23532v1 Announce Type: cross Abstract: This paper investigates whether structured representations can preserve the meaning of scientific sentences. T

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

MDKeyChunker: Single-Call LLM Enrichment with Rolling Keys and Key-Based Restructuring for High-Accuracy RAG

arXiv:2603.23533v1 Announce Type: cross Abstract: RAG pipelines typically rely on fixed-size chunking, which ignores document structure, fragments semantic unit

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Large Language Models and Scientific Discourse: Where's the Intelligence?

arXiv:2603.23543v1 Announce Type: cross Abstract: We explore the capabilities of Large Language Models (LLMs) by comparing the way they gather data with the way

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

CAPTCHA Solving for Native GUI Agents: Automated Reasoning-Action Data Generation and Self-Corrective Training

arXiv:2603.23559v1 Announce Type: cross Abstract: GUI agents are rapidly shifting from multi-module pipelines to end-to-end, native vision-language models (VLMs

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Synthetic Mixed Training: Scaling Parametric Knowledge Acquisition Beyond RAG

arXiv:2603.23562v1 Announce Type: cross Abstract: Synthetic data augmentation helps language models learn new knowledge in data-constrained domains. However, na

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Safe Reinforcement Learning with Preference-based Constraint Inference

arXiv:2603.23565v1 Announce Type: cross Abstract: Safe reinforcement learning (RL) is a standard paradigm for safety-critical decision making. However, real-wor

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

AscendOptimizer: Episodic Agent for Ascend NPU Operator Optimization

arXiv:2603.23566v1 Announce Type: cross Abstract: AscendC (Ascend C) operator optimization on Huawei Ascend neural processing units (NPUs) faces a two-fold know

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

StateLinFormer: Stateful Training Enhancing Long-term Memory in Navigation

arXiv:2603.23571v1 Announce Type: cross Abstract: Effective navigation intelligence relies on long-term memory to support both immediate generalization and sust

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Dual-Criterion Curriculum Learning: Application to Temporal Data

arXiv:2603.23573v1 Announce Type: cross Abstract: Curriculum Learning (CL) is a meta-learning paradigm that trains a model by feeding the data instances increme

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

PoiCGAN: A Targeted Poisoning Based on Feature-Label Joint Perturbation in Federated Learning

arXiv:2603.23574v1 Announce Type: cross Abstract: Federated Learning (FL), as a popular distributed learning paradigm, has shown outstanding performance in impr