Core AI

Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

51,404 lessons

Skills in this topic

5 skills

LLM Foundations

Explain how transformers generate text

Write zero-shot and few-shot prompts

LLM Engineering

Call LLM APIs with function/tool use

Fine-tuning LLMs

Prepare fine-tuning datasets

Multimodal LLMs

Use GPT-4V / Claude Vision for image understanding

Videos 21,486 Reads 29,918

All Reads (29,918): Articles (12,722), Blog Posts (5,658), Tutorials (2,424), Research Papers (8,239), News (875)


ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

DysLexLens: A Low-Resource LLM Framework for Analysing Dyslexic Learners' Insights from Online Forums

arXiv:2606.27619v1 Announce Type: new Abstract: Dyslexic learners increasingly use artificial intelligence (AI) tools to support reading, writing, organisation,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

MER-R1: Multimodal Emotion Reasoning via Slow-Fast Thinking Synergy

arXiv:2606.27652v1 Announce Type: new Abstract: We find that explicit reasoning does not necessarily translate into better multimodal emotion recognition (MER)

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

ToE: A Hierarchical and Explainable Claim Verification Framework with Dynamic Multi-source Evidence Retrieval and Aggregation

arXiv:2606.27736v1 Announce Type: new Abstract: The rapid spread of fake news poses increasing threats to information ecosystems, especially as AI-generated mis

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Grounded Iterative Language Planning: How Parameterized World Models Reduce Hallucination Propagation in LLM Agents

arXiv:2606.27806v1 Announce Type: new Abstract: World models for language agents come in two useful forms. An agent-based world model calls an LLM API and reaso

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

NormAct: A Benchmark for Hidden Social Norm Compliance in Embodied Planning

arXiv:2606.27826v1 Announce Type: new Abstract: Multimodal large language models (MLLMs) are increasingly deployed as embodied planners in egocentric environmen

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

RelBall: Relation Ball with Quaternion Rotation for Knowledge Graph Completion

arXiv:2606.27967v1 Announce Type: new Abstract: Real-world knowledge graphs are often incomplete, lacking many valid facts. Knowledge Graph Completion (KGC) aim

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

JD Oxygen AI Item Center (Oxygen AIIC) V1: An Industrial-Scale LLM/VLM-Centric Solution for Item Understanding, Management, and Applications

arXiv:2606.28070v1 Announce Type: new Abstract: JD.com, one of the world's largest e-commerce platforms, serves over 700 million active users and millions of me

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Ontology-Guided Evidence Path Inference for Multi-hop Knowledge Graph Question Answering

arXiv:2606.28076v1 Announce Type: new Abstract: Knowledge graph question answering (KGQA) aims to answer natural-language questions by reasoning over structured

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Position: The Term "Machine Unlearning" Is Overused in LLMs

arXiv:2606.27379v1 Announce Type: cross Abstract: Large language models increasingly face demands to "forget" training data, knowledge, or behaviors due to regu

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

CalBrief: A Pilot Diagnostic Benchmark for Evidence-Calibrated Scientific Briefing with Large Language Models

arXiv:2606.27383v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly used as research assistants, yet it remains unclear whether they

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Towards Evaluation of Implicit Software World Models in Coding LLMs

arXiv:2606.27406v1 Announce Type: cross Abstract: Software engineering, whether performed by humans or by AI agents, requires reasoning about how software behav

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Compression-Driven Anomaly Detection in Brain MRI Using an Interpretable Quantum Autoencoder

arXiv:2606.27411v1 Announce Type: cross Abstract: We study a quantum autoencoder (QAE) for compression-driven anomaly detection in brain MRI data. The approach

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Supersede: Diagnosing and Training the Memory-Update Gap in LLM Agents

arXiv:2606.27472v1 Announce Type: cross Abstract: Large language model (LLM) agents operate over long, multi-session interactions in which facts change: a user

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Speculative Refinement: A Hybrid Autoregressive Diffusion Decoding Strategy and Its Behavior Across Benchmarks

arXiv:2606.27474v1 Announce Type: cross Abstract: How should we evaluate generation systems that combine autoregressive (AR) and diffusion decoding? We study th

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

DMV-Bench: Diagnosing Long-Horizon Multimodal Agents' Visual Memory with Incidental Cue Injection

arXiv:2606.27499v1 Announce Type: cross Abstract: Research on agent memory has matured rapidly, but almost entirely on the text side: few existing benchmarks as

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Large Language Model Teaches Visual Students: Cross-Modality Transfer of Fine-Grained Conceptual Knowledge

arXiv:2606.27527v1 Announce Type: cross Abstract: Large Language Models (LLMs) possess broad conceptual knowledge acquired through large-scale text pretraining,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

The Context-Ready Transformer

arXiv:2606.27538v1 Announce Type: cross Abstract: We introduce the context-ready transformer, a new recurrent neural network architecture built from a D-layer t

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

On the Inseparability of Instructions and Data in Shared-Embedding Sequence Models

arXiv:2606.27567v1 Announce Type: cross Abstract: Prompt injection is the top security risk for LLM-integrated applications, yet every defense proposed so far h

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

PEBS: Per-rater Empirical-Bayes Shrinkage for RLHF Reward-Model Calibration

arXiv:2606.27578v1 Announce Type: cross Abstract: Reward models for Reinforcement Learning from Human Feedback (RLHF) pool preferences across thousands of annot

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Dismantling Pathological Shortcuts: A Causal Framework for Faithful LVLM Decoding

arXiv:2606.27596v1 Announce Type: cross Abstract: Large Vision-Language Models (LVLMs) exhibit sophisticated reasoning but remain susceptible to object hallucin

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Narrative-UFET: Narrative Generation for Ultra-Fine Entity Typing

arXiv:2606.27598v1 Announce Type: cross Abstract: Ultra-fine entity typing (UFET) assigns highly specific types to entity mentions, but current approaches strug

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

HybridCodec: Modeling Discrete and Continuous Representations for Efficient Speech Language Models

arXiv:2606.27627v1 Announce Type: cross Abstract: Discrete audio representations have become increasingly popular for building multimodal text-audio systems and

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Cross-Platform Chinese Offensive Comment Detection via Dual-Threshold Hard Example Mining

arXiv:2606.27629v1 Announce Type: cross Abstract: Cross-platform deployment of offensive comment detection for Chinese social media suffers performance degradat

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

From Signals to Transfer: A Factorised Study of Probe-Based Uncertainty Estimation in Large Language Models

arXiv:2606.27679v1 Announce Type: cross Abstract: Probe-based uncertainty estimation (UE) has emerged as a prominent approach to detect hallucinations in Large

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

CBD: API-Only LLM Black-Box Unlearning through Controlled Behavioral Divergence

arXiv:2606.27683v1 Announce Type: cross Abstract: Edge devices increasingly invoke large language models (LLMs) through API services for context aware edge inte

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Mitigating LLM-based p-Hacking by Preregistering for the Next LLM

arXiv:2606.27687v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly used to generate, classify, and annotate data whose outputs feed

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Halt Fast! Early Stopping for Certified Robustness

arXiv:2606.27694v1 Announce Type: cross Abstract: Randomized Smoothing (RS) provides rigorous robustness guarantees for neural networks without architectural co

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Class-frequency Guided Noise Schedule for Diffusion Models

arXiv:2606.27696v1 Announce Type: cross Abstract: In this paper, we are the first to examine the correlations between class frequency and the multi-scale noise

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Low-Agreeableness Persona Conditioning for Safe LLM Fine-Tuning

arXiv:2606.27709v1 Announce Type: cross Abstract: Recent work has shown that fine-tuning large language models (LLMs) for social warmth degrades factual reliabi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Do Speech Emphasis Models Generalize across Languages and Emotions?

arXiv:2606.27717v1 Announce Type: cross Abstract: Prosodic emphasis varies across languages, emotions, and speaking styles, yet existing emphasis detection mode

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Enhancing Numerical Prediction in LLMs via Smooth MMD Alignment

arXiv:2606.27731v1 Announce Type: cross Abstract: Despite their strong general capabilities, large language models (LLMs) often remain unreliable when outputs m

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Bifocal Diffusion Language Models: Asymmetric Bidirectional Context for Parallel Generation

arXiv:2606.27732v1 Announce Type: cross Abstract: Discrete diffusion language models (dLLMs) recover masked tokens in parallel, offering significant speedups ov

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

KG2Cypher: Data-Centric Pipeline for Building Enterprise Text-to-Cypher Systems

arXiv:2606.27742v1 Announce Type: cross Abstract: Enterprise Knowledge Graphs (KGs) are increasingly used for internal search, analytics, and question answering

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

End-to-End Dynamic Sparsity for Resource-Adaptive LLM Inference

arXiv:2606.27743v1 Announce Type: cross Abstract: Large Language Models (LLMs) inference is typically deployed under a static resource assumption, where models

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Flexformer: Flexible Linear Transformer with Learnable Attention Kernel

arXiv:2606.27748v1 Announce Type: cross Abstract: Transformer models rely on attention mechanism to capture long-range dependencies but suffer from quadratic co

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Drop-Then-Recovery: How Redundant Are Vision-Language-Action Models?

arXiv:2606.27755v1 Announce Type: cross Abstract: Vision-Language-Action (VLA) models enable instruction-driven robotic manipulation, but they inherit oversized

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Output-Space Allocation Costs for Calibration-Guided LLM Compression: An Empirical Study

arXiv:2606.27785v1 Announce Type: cross Abstract: Training-free compression methods for large language models (LLMs) often use calibration data to guide compres

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

SHIFT: Gate-Modulated Activation Steering for Knowledge Conflict Mitigation in Retrieval-Augmented Generation

arXiv:2606.27786v1 Announce Type: cross Abstract: Retrieval-augmented generation (RAG) enhances LLMs by incorporating external knowledge to support response gen

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

NLL-Guided Full-Attention Layer Selection for Training-Free Sliding-Window Adaptation

arXiv:2606.27791v1 Announce Type: cross Abstract: Hybrid attention models that mix full and sliding-window attention across layers offer a promising approach to

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Position Bias Correction is Insufficient for One-Pass Attention Sorting

arXiv:2606.27793v1 Announce Type: cross Abstract: Long-context language models suffer from position bias, where information in middle positions is underutilized

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Optimizing Teacher-Student Partitioning for Scalable Knowledge Distillation on HPC Systems

arXiv:2606.27797v1 Announce Type: cross Abstract: Knowledge Distillation (KD) enables training smaller student models under the guidance of larger teacher model

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

WattLayer: Get Layers Right to Estimate Inference Energy of Neural Networks

arXiv:2606.27841v1 Announce Type: cross Abstract: The widespread adoption of Artificial Intelligence (AI) has led to increasing concerns about energy consumptio

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

A Study of Temporal Fusion Strategies for Named Entity Recognition in Historical Texts

arXiv:2606.27881v1 Announce Type: cross Abstract: Temporal variation poses a unique challenge for named entity recognition (NER) in historical texts, where enti

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Triadic Werewolf: A Jester Role for Multi-Hop Theory of Mind in LLMs

arXiv:2606.27909v1 Announce Type: cross Abstract: Theory-of-mind evaluations of large language models typically use dyadic social-deduction games, where every o

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

Reflect-R1: Evidence-Driven Reflection for Self-Correction in Long Video Understanding

arXiv:2606.27922v1 Announce Type: cross Abstract: Current multimodal reflection mechanisms for long video understanding predominantly rely on closed-loop self-r

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

VASAE: Naming SAE Dictionary Directions with Vocabulary-Aligned Anchoring

arXiv:2606.27941v1 Announce Type: cross Abstract: Sparse autoencoders (SAEs) provide useful decompositions of Transformer residual streams, but their learned fe

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

From Black-Box to Clinical Insight: A Multi-Stage Explainable Framework for Speech-Based Cognitive Impairment Detection

arXiv:2606.27973v1 Announce Type: cross Abstract: Speech-based cognitive impairment detection offers a noninvasive, accessible alternative to costly biomarker a

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1w ago

ProMSA: Progressive Multimodal Search Agents for Knowledge-Based Visual Question Answering

arXiv:2606.27974v1 Announce Type: cross Abstract: Knowledge-based Visual Question Answering (KB-VQA) requires models to combine image understanding with externa