Core AI

Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

51,346 lessons

Skills in this topic

5 skills

LLM Foundations

Explain how transformers generate text

Write zero-shot and few-shot prompts

LLM Engineering

Call LLM APIs with function/tool use

Fine-tuning LLMs

Prepare fine-tuning datasets

Multimodal LLMs

Use GPT-4V / Claude Vision for image understanding

Videos 21,479 Reads 29,867

All Reads (29,867) Articles (12,700) Blog Posts (5,651) Tutorials (2,409) Research Papers (8,232) News (875)

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Invariant Reasoning Directions in Latent Trajectories of Language Models

arXiv:2606.29164v1 Announce Type: cross Abstract: Latent reasoning models perform multi-step inference directly in hidden-state space, yet the structure of thes

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

A Multi-Dataset Benchmark for Evaluating LLM Agents in Microservice Failure Diagnosis

arXiv:2606.29193v1 Announce Type: cross Abstract: LLM-based agents are reshaping microservice operations into AgentOps, where benchmarks are key to evaluating f

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

A Hybrid Framework for Song Lyric Annotation Based on Human-LLM Alignment

arXiv:2606.29273v1 Announce Type: cross Abstract: Emotion recognition of song lyrics is a challenging task since lyrics may not necessarily align with the overa

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Manufactured Confidence: How Memory Consolidation Turns Hearsay into Confident Facts

arXiv:2606.29279v1 Announce Type: cross Abstract: LLM agents carry conclusions across steps and sessions in compressed memory, and memory products (e.g., mem0,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Deterministic Decisions for High-Stakes AI. A Zero-Egress Pipeline with the Deployability of RAG and the Accuracy of Machine Learning

arXiv:2606.29280v1 Announce Type: cross Abstract: We identify intervention bias as a previously unquantified failure mode of zero-shot large-language-model (LLM

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

AMR: Adaptive Modality Routing for Multimodal Polyglot Speaker Identification

arXiv:2606.29335v1 Announce Type: cross Abstract: Multimodal speaker identification systems face two key challenges in real-world deployment: missing modalities

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Fast Enough to Act: Spatio-Temporal Visual Token Merging for Low-Latency Robotic VLMs and VLAs

arXiv:2606.29350v1 Announce Type: cross Abstract: Vision-language models and vision-language action models endow the robot with unprecedented capabilities. Howe

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Solver-Verified Formulation Generation and Selection for Multi-Warehouse Inventory Allocation Using Large Language Models

arXiv:2606.29366v1 Announce Type: cross Abstract: Balance-oriented multi-warehouse inventory allocation is a recurring decision problem in large-scale e-commerc

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

LC-ICL: Label-Guided Contrastive In-Context Learning for Robust Information Extraction

arXiv:2606.29407v1 Announce Type: cross Abstract: There has been increasing interest in exploring the capabilities of advanced large language models (LLMs) in t

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

LLMography: Transforming Human-AI Conversations into Traceability, Oversight, and Auditability Indicators

arXiv:2606.29437v1 Announce Type: cross Abstract: The growing use of Large Language Models (LLMs) in education, software engineering, academic writing, and tech

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Closing the Activation-Cone Blind Spot: Response-Time Probing and Unified Defense

arXiv:2606.29441v1 Announce Type: cross Abstract: Inference-time safety methods for large language models have proliferated, yet no systematic comparison exists

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Bridging VideoQA and Video-Guided Agentic Tasks via Generalized Keyframe Extraction

arXiv:2606.29445v1 Announce Type: cross Abstract: Video understanding is a fundamental capability for multimodal intelligence, and recent Multimodal Large Langu

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Interpretable Inverse Design of Metal-Organic Frameworks with Large Language Model Agents

arXiv:2606.29459v1 Announce Type: cross Abstract: Inverse design of metal-organic frameworks (MOFs) requires searching a combinatorially vast space where proper

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Rank-Aware Hyperbolic Alignment for Vision-Language Dataset Distillation

arXiv:2606.29464v1 Announce Type: cross Abstract: Vision-language dataset distillation (VLDD) compresses a large image-text paired dataset into a small set of s

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

To Reason or to Fabricate: Reasoning Without Shortcuts via Hint-Anchored Pairwise Aggregation

arXiv:2606.29481v1 Announce Type: cross Abstract: While reinforcement learning (RL) significantly enhances LLM reasoning, its efficacy is severely undermined by

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Reported Confidence in LLMs Tracks Commitment More Than Correctness

arXiv:2606.29490v1 Announce Type: cross Abstract: Confidence is an estimate of the probability that a chosen answer is correct. Verbal confidence reports are wi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

The Verbose Context Problem in Medical Records

arXiv:2606.29503v1 Announce Type: cross Abstract: The verbose context problem occurs when structured concepts have token-inefficient textual representations. Th

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

SAKE: Software Architectural Knowledge Evaluation Benchmark for Large Language Models

arXiv:2606.29520v1 Announce Type: cross Abstract: Large Language Models (LLMs) are increasingly used as assistants across the software development lifecycle, ye

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

MotionAtlas: Detailed Region Captioning for Motion-Centric Videos

arXiv:2606.29531v1 Announce Type: cross Abstract: We propose MotionAtlas, a system for detailed captioning of motion-centric videos, comprising (1) a dedicated

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

SemJoin: Semantic Join Optimization

arXiv:2606.29532v1 Announce Type: cross Abstract: Integrating unstructured data into relational database systems is increasingly important as demand grows for n

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Em-ergence of the em-dash: a population-level rise in em-dash frequency in medRxiv preprints at the dawn of the large-language-model era

arXiv:2606.29540v1 Announce Type: cross Abstract: Large language models (LLMs) can leave subtle stylistic traces in assisted text; one of the most cited is the

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Coverage-Driven KV Cache Eviction for Efficient and Improved Inference of LLM

arXiv:2606.29563v1 Announce Type: cross Abstract: Large language models (LLMs) excel at complex tasks like question answering and summarization, thanks to their

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

ScAle: Attention Head Scaling as a Minimal Adapter for Spatial Reasoning in Vision Language Models

arXiv:2606.29579v1 Announce Type: cross Abstract: Spatial reasoning remains a persistent challenge for many vision language models (VLMs), and improving it typi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

The Joint Effect of Quantization and Sampling Temperature on LLM Safety Alignment: A Factorial Analysis

arXiv:2606.29581v1 Announce Type: cross Abstract: Modern LLM deployments routinely compress models and raise sampling temperature to reduce cost, latency, or re

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Bilevel Optimization for Neural Architecture Search

arXiv:2606.29582v1 Announce Type: cross Abstract: Bilevel optimization has become an influential and widely adopted framework for addressing hierarchical optimi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

SonoCLIP: Mask-Guided Region-Aware Vision-Language Pretraining for Fetal Ultrasound Analysis

arXiv:2606.29586v1 Announce Type: cross Abstract: Vision-language foundation models have shown strong potential in medical image analysis. Although foundation m

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Mechanistically Eliciting Latent Behaviors in Language Models

arXiv:2606.29604v1 Announce Type: cross Abstract: We aim to discover diverse, generalizable perturbations of LLM internals that can surface hidden behavioral mo

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Does Role Specialization Matter for Explanation Faithfulness in Mixture-of-Experts?

arXiv:2606.29613v1 Announce Type: cross Abstract: Mixture-of-Experts (MoE) architectures have recently been extended with role-based mechanisms for interpretabi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Do We Still Need Fine Tuning? Turkish Sentiment Analysis in the Era of Large Language Model

arXiv:2606.29614v1 Announce Type: cross Abstract: This study examines whether supervised fine-tuning remains necessary for Turkish sentiment analysis in the era

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Two-Stage Prompt Optimization for Few-Shot Relation Extraction: From Reasoning-Guided Search to Gradient-Guided Refinement

arXiv:2606.29639v1 Announce Type: cross Abstract: Automatic prompt optimization is still underexplored for episodic few-shot relation extraction with smaller la

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Fast Wireless Foundation Models with Early-Exits

arXiv:2606.29640v1 Announce Type: cross Abstract: While wireless foundation models (FMs) are demonstrating strong potential to enable AI-Native 6G networks, the

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Fuzzing Large Language Models to Elicit Hidden Behaviours

arXiv:2606.29646v1 Announce Type: cross Abstract: Sleeper agents are the canonical model organism of deception: models trained to behave normally but to emit an

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Hybrid Retriever Evolution for Multimodal Document Reasoning Agents

arXiv:2606.29648v1 Announce Type: cross Abstract: Different retrievers, including lexical, semantic, and multimodal approaches, provide highly complementary str

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

A Machine-Verified Proof of a Quantum-Optimization Conjecture

arXiv:2606.29687v1 Announce Type: cross Abstract: We report a machine-verified resolution of a problem open for over a decade in quantum optimization: the Farhi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Early Warning Signals for OpenVLA Failure under Visual Distribution Shift

arXiv:2606.29699v1 Announce Type: cross Abstract: Vision Language Action models combine perception, language grounding, and control in a single policy, but thei

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

ARMOR: Adaptive Retriever Optimization for Low-Resource Telecom Question Answering

arXiv:2606.29706v1 Announce Type: cross Abstract: Telecom question answering (QA) is a challenging setting for retrieval-augmented generation (RAG): evidence is

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

SEVA: Self-Evolving Verification Agent with Process Reward for Fact Attribution

arXiv:2606.29713v1 Announce Type: cross Abstract: Hallucination is the reliability bottleneck for LLM-based agents, and fact attribution verifiers are the last

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Optimizing Expert-Designed Crystal Graph Networks for Band-Gap Prediction with an Autonomous LLM Research Loop

arXiv:2606.29717v1 Announce Type: cross Abstract: Predicting a material's properties from its structure is a central, fast-advancing problem in computational ma

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Diagnosing and Mitigating Context Rot in Long-horizon Search

arXiv:2606.29718v1 Announce Type: cross Abstract: Extensive context has become the norm as Large Language Models (LLMs) are increasingly deployed in long-horizo

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

PS-PPO: Prefix-Sampling PPO for Critic-Free RLHF

arXiv:2606.29758v1 Announce Type: cross Abstract: Reinforcement Learning from Human Feedback (RLHF) for Large Language Models increasingly relies on critic-free

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Towards Generalizable and Evidential Nuclear Magnetic Resonance-Based Molecular Structure Elucidation via Large Language Model Agent

arXiv:2606.29776v1 Announce Type: cross Abstract: Nuclear Magnetic Resonance (NMR) spectroscopy is the gold standard for molecular structure elucidation, yet in

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Mandol: An Agglomerative Agent Memory System for Long-Term Conversations

arXiv:2606.29778v1 Announce Type: cross Abstract: Long-term conversational agents need to remember and query cross-session, multi-typed information with complex

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

HERO: Improving the Reliability and Sensitivity of Generative Model Evaluation Using Historical Data

arXiv:2606.29784v1 Announce Type: cross Abstract: Reliable generative AI models critically rely on expert human annotations to evaluate output quality, yet thes

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Making Multimodal LLMs Reliable Chart Data Extractors: A Benchmark and Training Framework

arXiv:2606.29808v1 Announce Type: cross Abstract: Chart data extraction, which reverse-engineers data tables from chart images, is essential for reproducibility

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

How Far Can You Get Without a GPU? A Systematic Benchmark of Lightweight Hallucination Detection Across Question Answering, Dialogue, and Summarisation

arXiv:2606.29809v1 Announce Type: cross Abstract: Hallucination detection has become a pressing requirement for trustworthy AI deployment at scale. The most acc

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Dual-Flow Reinforcement Learning with State-Aware Exploration

arXiv:2606.29820v1 Announce Type: cross Abstract: In complex continuous-control reinforcement learning tasks, multimodal optimal actions often coincide with unc

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

Neural Procedural Memory: Empowering LLM Agents with Implicit Activation Steering

arXiv:2606.29824v1 Announce Type: cross Abstract: While Large Language Models (LLMs) excel as static solvers, transforming them into autonomous agents remains c

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 6d ago

MATCH: Modulating Attention via In-Context Retrieval for Long-Context Transformers

arXiv:2606.29844v1 Announce Type: cross Abstract: The quadratic computational cost of traditional attention mechanisms poses a major bottleneck to the scalabili