Core AI

Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

53,036 lessons

Skills in this topic

5 skills

LLM Foundations

Explain how transformers generate text

Write zero-shot and few-shot prompts

LLM Engineering

Call LLM APIs with function/tool use

Fine-tuning LLMs

Prepare fine-tuning datasets

Multimodal LLMs

Use GPT-4V / Claude Vision for image understanding

Videos 21,427 Reads 31,609

All Reads (31,609) · Articles (13,468) · Blog Posts (5,960) · Tutorials (2,575) · Research Papers (8,684) · News (922)

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Adaptive Margin RLHF via Preference over Preferences

arXiv:2509.22851v4 Announce Type: replace-cross Abstract: Margin-based optimization is fundamental to improving generalization and robustness in classification

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

OctoPipe: Reducing Pipeline Bubbles for Heterogeneous Models via Co-Optimizing Partitioning, Placement, and Scheduling

arXiv:2509.23722v2 Announce Type: replace-cross Abstract: Pipeline parallelism is widely used to train large language models (LLMs). However, increasing heterog

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

VISOR++: Universal Visual Inputs based Steering for Large Vision Language Models

arXiv:2509.25533v2 Announce Type: replace-cross Abstract: As Vision Language Models (VLMs) are deployed across safety-critical applications, understanding and c

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

TaoSR-AGRL: Adaptive Guided Reinforcement Learning Framework for E-commerce Search Relevance

arXiv:2510.08048v4 Announce Type: replace-cross Abstract: Query-product relevance prediction is fundamental to e-commerce search and has become even more critic

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

SoK: Systematizing LLM Prompt Security: Taxonomies, Datasets, and Unified Evaluation of Attacks and Defenses

arXiv:2510.15476v3 Announce Type: replace-cross Abstract: Large Language Models (LLMs) are increasingly used as interfaces to information, code, and real-world

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

PuzzleMoE: Efficient Compression of Large Mixture-of-Experts Models via Sparse Expert Merging and Bit-packed inference

arXiv:2511.04805v2 Announce Type: replace-cross Abstract: Mixture-of-Experts (MoE) models have shown strong potential in scaling language models efficiently by

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Large Language Models Develop Novel Social Biases Through Adaptive Exploration

arXiv:2511.06148v4 Announce Type: replace-cross Abstract: As large language models (LLMs) are adopted into frameworks that grant them the capacity to make real

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

SpatialThinker: Reinforcing Scene Graph-Grounded Spatial Reasoning via Dense Rewards

arXiv:2511.07403v2 Announce Type: replace-cross Abstract: Multimodal large language models (MLLMs) have achieved remarkable progress in vision-language tasks, b

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

InverseCrafter: Efficient Video ReCapture as a Latent Domain Inverse Problem

arXiv:2512.05672v2 Announce Type: replace-cross Abstract: Recent approaches in controllable novel view video generation often rely on fine-tuning pre-trained Vi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Parameter Efficient Multimodal Instruction Tuning for Romanian Vision Language Models

arXiv:2512.14926v2 Announce Type: replace-cross Abstract: Focusing on low-resource languages is an essential step toward democratizing generative AI. In this wo

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Generative Semantic Multi-Object Tracking: A Large-Scale Benchmark and an MLLM-Driven Reasoning Framework

arXiv:2601.06550v3 Announce Type: replace-cross Abstract: Semantic Multi-Object Tracking (SMOT) is evolving from purely geometric localization toward comprehens

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

ARCQuant: Boosting NVFP4 Quantization with Augmented Residual Channels for LLMs

arXiv:2601.07475v2 Announce Type: replace-cross Abstract: The emergence of fine-grained numerical formats like NVFP4 presents new opportunities for efficient La

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Predicting Biased Human Decision-Making with Large Language Models in Conversational Settings

arXiv:2601.11049v2 Announce Type: replace-cross Abstract: We examine whether large language models (LLMs) can predict biased decision-making in conversational s

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

R$^2$PO: Decoupling Rollout and Inference Policies for LLM Reasoning

arXiv:2601.11960v3 Announce Type: replace-cross Abstract: Existing reinforcement learning methods for LLM reasoning implicitly assume that the policy generating

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

No Reliable Evidence of Self-Reported Sentience in Small Large Language Models

arXiv:2601.15334v2 Announce Type: replace-cross Abstract: Whether language models possess sentience has no empirical answer. But whether they believe themselves

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

BoRP: Bootstrapped Regression Probing for Scalable and Human-Aligned LLM Evaluation

arXiv:2601.18253v2 Announce Type: replace-cross Abstract: Accurate evaluation of user satisfaction is critical for iterative development of conversational AI. H

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

FlashBlock: Attention Caching for Efficient Long-Context Block Diffusion

arXiv:2602.05305v3 Announce Type: replace-cross Abstract: Generating long-form content, such as minute-long videos and extended texts, is increasingly important

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Deriving Neural Scaling Laws from the statistics of natural language

arXiv:2602.07488v3 Announce Type: replace-cross Abstract: Despite the fact that experimental neural scaling laws have substantially guided empirical progress in

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

pFedNavi: Structure-Aware Personalized Federated Vision-Language Navigation for Embodied AI

arXiv:2602.14401v2 Announce Type: replace-cross Abstract: Vision-Language Navigation (VLN) requires large-scale trajectory instruction data from private indoor en

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Data Driven Optimization of GPU efficiency for Distributed LLM-Adapter Serving

arXiv:2602.24044v2 Announce Type: replace-cross Abstract: Large Language Model (LLM) adapters enable low-cost model specialization, but introduce complex cachin

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

OSF: On Pre-training and Scaling of Sleep Foundation Models

arXiv:2603.00190v2 Announce Type: replace-cross Abstract: Polysomnography (PSG) provides the gold standard for sleep assessment but suffers from substantial het

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Efficient Flow Matching for Sparse-View CT Reconstruction

arXiv:2603.00205v2 Announce Type: replace-cross Abstract: Generative models, particularly Diffusion Models (DM), have shown strong potential for Computed Tomogr

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

BEVLM: Distilling Semantic Knowledge from LLMs into Bird's-Eye View Representations

arXiv:2603.06576v2 Announce Type: replace-cross Abstract: The integration of Large Language Models (LLMs) into autonomous driving has attracted growing interest

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

SpreadsheetArena: Decomposing Preference in LLM Generation of Spreadsheet Workbooks

arXiv:2603.10002v2 Announce Type: replace-cross Abstract: We consider the task of end-to-end spreadsheet generation, where language models produce spreadsheet a

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3d ago

Naïve PAINE: Lightweight Text-to-Image Generation Improvement with Prompt Evaluation

arXiv:2603.12506v2 Announce Type: replace-cross Abstract: Text-to-Image (T2I) generation is primarily driven by Diffusion Models (DM) which rely on random Gauss

Medium · Machine Learning 🧠 Large Language Models ⚡ AI Lesson 3d ago

Designing an Attention Mechanism That Keeps Untrusted Tokens Out of the Decision Path

Continue reading on Medium »

Medium · RAG 🧠 Large Language Models ⚡ AI Lesson 3d ago

What is a Knowledge Graph?

A knowledge graph is a database system that stores data as nodes (entities) and relationships (connections between entities) instead of as… Continue reading on

Medium · Data Science 🧠 Large Language Models ⚡ AI Lesson 3d ago

I Built My First RAG Application on the Free Tier. It Changed How I Think About the Cost of AI.

When I first started building with LLMs, I thought I had AI figured out. Continue reading on Medium »

Dev.to · Sho Naka 🧠 Large Language Models ⚡ AI Lesson 3d ago

Stop Fixing Your AI Writing Prompt. Make These 5 Decisions First

A practical AI writing workflow: improve the AI writing prompt by making five editorial decisions before the model drafts.

Medium · Cybersecurity 🧠 Large Language Models ⚡ AI Lesson 3d ago

Prompt Engineering for Threat Researchers: A Practical Field Guide

How to get sharper, more reliable output from LLMs when your job is finding adversaries, not hallucinating them. Continue reading on Medium »

Medium · ChatGPT 🧠 Large Language Models ⚡ AI Lesson 3d ago

ChatGPT vs Claude vs Gemini: I Used All Three for 60 Days — Here’s the Honest Verdict

Everyone has an opinion. Here’s one based on actual daily use. Continue reading on Medium »

Medium · LLM 🧠 Large Language Models ⚡ AI Lesson 3d ago

Everything You Need to Know About Claude Fable 5

The Most Powerful AI for Developers in 2026 (And Why It Changes Everything) Continue reading on Medium »

Reddit r/deeplearning 🧠 Large Language Models ⚡ AI Lesson 3d ago

Betting LLMs learn math better from semantic IR than raw source tokens: 70% extraction on Mathlib so far

Hypothesis: LLMs are still mediocre at formal theorem proving partly because we're tokenizing the wrong thing. Lean source is full of notation, implicit argumen

Dev.to · Yao Xiao 🧠 Large Language Models ⚡ AI Lesson 3d ago

The 10-Line Prompt That Turns ChatGPT Into a Fully Autonomous AI Agent

Most people treat Large Language Models like glorified search engines: ask a question, skim the...

Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3d ago

Closed Source LLMs Bankroll the AI Companies and Bankrupt the World - And 5 Guidelines to Reverse It

OpenAI and Anthropic are bankrupting the world with the most important resource: intelligence. Here are 5 steps to reverse that balance and give the power to ev

Dev.to · eagerspark 🧠 Large Language Models ⚡ AI Lesson 3d ago

My $47 Deep Dive Into China's AI Models: The Surprising Winner

I've been obsessed with finding the...

Dev.to · Hari Babu 🧠 Large Language Models ⚡ AI Lesson 3d ago

Candidate Compliance Agent: Building a Multilingual RAG System for Tamil Nadu Election Affidavits

The Problem Nobody Talks About: Every election cycle in India, candidates are not easily...

Medium · AI 🧠 Large Language Models ⚡ AI Lesson 3d ago

Best YouTube Channels to Learn Artificial Intelligence

Artificial Intelligence is one of the fastest-growing fields in the world today. New tools, frameworks, research papers, and models are… Continue reading on Med

Medium · Machine Learning 🧠 Large Language Models ⚡ AI Lesson 3d ago

Best YouTube Channels to Learn Artificial Intelligence

Artificial Intelligence is one of the fastest-growing fields in the world today. New tools, frameworks, research papers, and models are… Continue reading on Med

Dev.to · bebechien 🧠 Large Language Models ⚡ AI Lesson 3d ago

Master Local Fine-Tuning with "gemma-trainer"

Take control of your AI models with our newest skill, designed to make local fine-tuning efficient.

Medium · LLM 🧠 Large Language Models ⚡ AI Lesson 3d ago

Tencent Just Released Hy3 — A 295B Open-Source AI Model Taking on GPT-5.5,

For years, there was one simple rule in AI. Continue reading on CodeToDeploy »

Medium · Machine Learning 🧠 Large Language Models ⚡ AI Lesson 3d ago

Your LLM agent optimizes one knob. It should be optimizing all of them, together.

An accessible walkthrough of DICO — a two-layer architecture that lets a single-loop agent reconfigure its entire inference setup, per… Continue reading on Medi

Medium · Machine Learning 🧠 Large Language Models ⚡ AI Lesson 3d ago

The Machine Has No Ocean

Machine prose is fluent, correct, and unoccupied. The vacancy is not a defect awaiting a patch. It is the signature of an origin. Continue reading on Medium »

Medium · LLM 🧠 Large Language Models ⚡ AI Lesson 3d ago

Dynamic Future-Claim Certification: A Simple Guide to Replayable Future Claims and the Future Claim…

Modern software often makes claims about the future. Continue reading on Medium »

Medium · Machine Learning 🧠 Large Language Models ⚡ AI Lesson 3d ago

Building Hierarchos: What We Learned From Training a 232M Recurrent Memory-Augmented Assistant…

A practical field report from building a non-Transformer language model with hierarchy, memory, recurrent state, and more debugging than… Continue reading on Me

Medium · Deep Learning 🧠 Large Language Models ⚡ AI Lesson 3d ago

Building Hierarchos: What We Learned From Training a 232M Recurrent Memory-Augmented Assistant…

A practical field report from building a non-Transformer language model with hierarchy, memory, recurrent state, and more debugging than… Continue reading on Me

Dev.to · YuhaoLin2005 🧠 Large Language Models ⚡ AI Lesson 3d ago

How I Built a File-Timestamp-Based Feedback Loop to Enforce AI Output Quality

The problem: AI outputs are probabilistic, and prompts have a ceiling. LLMs produce...

Dev.to · FreezeOrange 🧠 Large Language Models ⚡ AI Lesson 3d ago

How to Actually Use Supabase RLS With Next.js App Router (Without Losing Your Mind)

A practical guide to integrating Supabase Row Level Security with Next.js 14. Real patterns from a production app with 28K+ users.