Core AI

Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

24,485
lessons
Skills in this topic
LLM Foundations
beginner
Explain how transformers generate text
Prompt Craft
beginner
Write zero-shot and few-shot prompts
LLM Engineering
intermediate
Call LLM APIs with function/tool use
Fine-tuning LLMs
advanced
Prepare fine-tuning datasets
Multimodal LLMs
advanced
Use GPT-4V / Claude Vision for image understanding

Showing 5,091 reads from curated sources

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Making Prompts First-Class Citizens for Adaptive LLM Pipelines
arXiv:2508.05012v2 Announce Type: replace-cross Abstract: Modern LLM pipelines increasingly resemble complex data-centric applications: they retrieve data, corr
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
ShadowNPU: System and Algorithm Co-design for NPU-Centric On-Device LLM Inference
arXiv:2508.16703v2 Announce Type: replace-cross Abstract: On-device running Large Language Models (LLMs) is nowadays a critical enabler towards preserving user
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Measuring Competency, Not Performance: Item-Aware Evaluation Across Medical Benchmarks
arXiv:2509.24186v2 Announce Type: replace-cross Abstract: Accuracy-based evaluation of Large Language Models (LLMs) measures benchmark-specific performance rath
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
ACT: Agentic Classification Tree
arXiv:2509.26433v4 Announce Type: replace-cross Abstract: When used in high-stakes settings, AI systems are expected to produce decisions that are transparent,
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Autonomy Reshapes How Personalization Affects Privacy Concerns and Trust in LLM Agents
arXiv:2510.04465v2 Announce Type: replace-cross Abstract: LLM agents require personal information for personalization in order to effectively act on users' beha
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
FURINA: A Fully Customizable Role-Playing Benchmark via Scalable Multi-Agent Collaboration Pipeline
arXiv:2510.06800v3 Announce Type: replace-cross Abstract: As large language models (LLMs) advance in role-playing (RP) tasks, existing benchmarks quickly become
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Fewer Weights, More Problems: A Practical Attack on LLM Pruning
arXiv:2510.07985v3 Announce Type: replace-cross Abstract: Model pruning, i.e., removing a subset of model weights, has become a prominent approach to reducing t
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
A Linguistics-Aware LLM Watermarking via Syntactic Predictability
arXiv:2510.13829v2 Announce Type: replace-cross Abstract: As large language models (LLMs) continue to advance rapidly, reliable governance tools have become cri
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
XModBench: Benchmarking Cross-Modal Capabilities and Consistency in Omni-Language Models
arXiv:2510.15148v2 Announce Type: replace-cross Abstract: Omni-modal large language models (OLLMs) aim to unify audio, vision, and text understanding within a s
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
LLMs Judge Themselves: A Game-Theoretic Framework for Human-Aligned Evaluation
arXiv:2510.15746v2 Announce Type: replace-cross Abstract: Ideal or real - that is the question. In this work, we explore whether principles from game theory can
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
A Model Can Help Itself: Reward-Free Self-Training for LLM Reasoning
arXiv:2510.18814v2 Announce Type: replace-cross Abstract: Can language models improve their reasoning performance without external rewards, using only their own
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
ATLAS: A Layered Constraint-Guided Framework for Structured Artifact Generation in LLM-Assisted MDE
arXiv:2510.25890v3 Announce Type: replace-cross Abstract: ATLAS is a constraint-guided generation framework for structured engineering artifacts whose outputs m
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
HatePrototypes: Interpretable and Transferable Representations for Implicit and Explicit Hate Speech Detection
arXiv:2511.06391v3 Announce Type: replace-cross Abstract: Optimization of offensive content moderation models for different types of hateful messages is typical
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
When AI Agents Collude Online: Financial Fraud Risks by Collaborative LLM Agents on Social Platforms
arXiv:2511.06448v2 Announce Type: replace-cross Abstract: In this work, we study the risks of collective financial fraud in large-scale multi-agent systems powe
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
FAST-CAD: A Fairness-Aware Framework for Non-Contact Stroke Diagnosis
arXiv:2511.08887v4 Announce Type: replace-cross Abstract: Stroke is an acute cerebrovascular disease, and timely diagnosis significantly improves patient surviv
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Exploration vs. Fixation: Scaffolding Divergent and Convergent Thinking for Human-AI Co-Creation with Generative Models
arXiv:2512.18388v2 Announce Type: replace-cross Abstract: Generative AI has democratized content creation, but popular chatbot-based interfaces often prioritize
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Parallel Universes, Parallel Languages: A Comprehensive Study on LLM-based Multilingual Counterfactual Example Generation
arXiv:2601.00263v2 Announce Type: replace-cross Abstract: Counterfactuals refer to minimally edited inputs that cause a model's prediction to change, serving as
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Path Integral Solution for Dissipative Generative Dynamics
arXiv:2601.00860v2 Announce Type: replace-cross Abstract: Can purely mechanical systems generate intelligent language? We prove that dissipative quantum dynamic
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Bridging the Semantic Gap for Categorical Data Clustering via Large Language Models
arXiv:2601.01162v2 Announce Type: replace-cross Abstract: Categorical data are prevalent in domains such as healthcare, marketing, and bioinformatics, where clu
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Projected Autoregression: Autoregressive Language Generation in Continuous State Space
arXiv:2601.04854v3 Announce Type: replace-cross Abstract: Standard autoregressive language models generate text by repeatedly selecting a discrete next token, c
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Vision-as-Inverse-Graphics Agent via Interleaved Multimodal Reasoning
arXiv:2601.11109v3 Announce Type: replace-cross Abstract: Vision-as-inverse-graphics, the concept of reconstructing images into editable programs, remains chall
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Self-Improving Pretraining: using post-trained models to pretrain better models
arXiv:2601.21343v3 Announce Type: replace-cross Abstract: Large language models are classically trained in stages: pretraining on raw text followed by post-trai
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Predicting Intermittent Job Failure Categories for Diagnosis Using Few-Shot Fine-Tuned Language Models
arXiv:2601.22264v2 Announce Type: replace-cross Abstract: In principle, Continuous Integration (CI) pipeline failures provide valuable feedback to developers on
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
InfoTok: Information-Theoretic Regularization for Capacity-Constrained Shared Visual Tokenization in Unified MLLMs
arXiv:2602.01554v2 Announce Type: replace-cross Abstract: Unified multimodal large language models (MLLMs) aim to unify image understanding and image generation
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
RASA: Routing-Aware Safety Alignment for Mixture-of-Experts Models
arXiv:2602.04448v2 Announce Type: replace-cross Abstract: Mixture-of-Experts (MoE) language models introduce unique challenges for safety alignment due to their
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
LLMs Encode Their Failures: Predicting Success from Pre-Generation Activations
arXiv:2602.09924v3 Announce Type: replace-cross Abstract: Running LLMs with extended reasoning on every problem is expensive, but determining which inputs actua
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
WIMLE: Uncertainty-Aware World Models with IMLE for Sample-Efficient Continuous Control
arXiv:2602.14351v2 Announce Type: replace-cross Abstract: Model-based reinforcement learning promises strong sample efficiency but often underperforms in practi
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Explainable Token-level Noise Filtering for LLM Fine-tuning Datasets
arXiv:2602.14536v3 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have seen remarkable advancements, achieving state-of-the-art results in
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Flow Map Language Models: One-step Language Modeling via Continuous Denoising
arXiv:2602.16813v2 Announce Type: replace-cross Abstract: Language models based on discrete diffusion have attracted widespread interest for their potential to
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Autorubric: Unifying Rubric-based LLM Evaluation
arXiv:2603.00077v2 Announce Type: replace-cross Abstract: Techniques for reliable rubric-based LLM evaluation -- ensemble judging, bias mitigation, few-shot cal
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Toward Epistemic Stability: Engineering Consistent Procedures for Industrial LLM Hallucination Reduction
arXiv:2603.10047v2 Announce Type: replace-cross Abstract: Hallucinations in large language models (LLMs) are outputs that are syntactically coherent but factual
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Truth as a Compression Artifact in Language Model Training
arXiv:2603.11749v3 Announce Type: replace-cross Abstract: Why do language models trained on contradictory data prefer correct answers? In controlled experiments
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Brittlebench: Quantifying LLM robustness via prompt sensitivity
arXiv:2603.13285v2 Announce Type: replace-cross Abstract: Existing evaluation methods largely rely on clean, static benchmarks, which can overestimate true mode
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Generate Then Correct: Single Shot Global Correction for Aspect Sentiment Quad Prediction
arXiv:2603.13777v2 Announce Type: replace-cross Abstract: Aspect-based sentiment analysis (ABSA) extracts aspect-level sentiment signals from user-generated tex
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 2w ago
Adaptive Stopping for Multi-Turn LLM Reasoning
arXiv:2604.01413v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) increasingly rely on multi-turn reasoning and interaction, such as adapti
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 2w ago
Is Claude Cowork an Agent Yet? I Tested Dispatch, Computer Use, and 50 Connectors
I tested Claude's new agent features for a day. Cowork, Dispatch, computer use, Claude Code in the desktop app. All of it. My honest take: Anthropic is getting
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 2w ago
SEO Is Dead. Long Live GEO: How Brands Can Track Their Visibility in AI Answers
Introduction If you ranked #1 on Google five years ago, you were unstoppable. Today, your potential customers are asking ChatGPT, Gemini, and Perplexity for rec
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 2w ago
Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.
The AI landscape is experiencing unprecedented growth and transformation. This post delves into the key developments shaping the future of artificial intelligen
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 2w ago
Clai TALOS: The Self-Hosted AI Agent I Wish Existed
Why I Built a Better Alternative to OpenClaw (And You Can Too) I spent six years building AI agents. Most of that time, I was fighting my tools instead of using
Forbes Innovation 🧠 Large Language Models ⚡ AI Lesson 2w ago
Why The Quantum Computing Industry Needs Logical Qubit Standards
Quantum vendors and national agencies are aligning to establish common standards for logical qubits, which should enable better collaboration and interoperabili
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 2w ago
How CRAFT Gives Claude Cowork Persistent Memory Across Sessions
Last week I released CRAFT for Cowork as a free public beta. This week: the capability that started it all — session handoffs. The Problem Every Claude Cowork s
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 2w ago
What Is GEO and Why It Matters More Than SEO
SEO Is Dead. Long Live GEO. If you have spent the last decade mastering Google SEO, I have some uncomfortable news: the search landscape has fundamentally shift
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 2w ago
OpenAI’s $1M API Credits, Holos’ Agentic Web, and Xpertbench’s Expert Tasks
AI is accelerating: OpenAI expands funding, Holos reimagines multi-agent systems, an
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 2w ago
QIS Outcome Routing with Redis Pub/Sub — When Your Transport Layer Thinks in Topics
QIS (Quadratic Intelligence Swarm) is a decentralized architecture discovered by Christopher Thomas Trevethan on June 16, 2025. Intelligence scales as Θ(N²) acr
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 2w ago
The AI Stack: A Practical Guide to Building Your Own Intelligent Applications
Beyond the Hype: What Does "Building with AI" Actually Mean? Another week, another wave of AI headlines. From speculative leaks to existential debates, the conv
TechCrunch AI 🧠 Large Language Models ⚡ AI Lesson 2w ago
Google quietly launched an AI dictation app that works offline
Google's new offline-first dictation app uses Gemma AI models to take on apps like Wispr Flow.
Forbes Innovation 🧠 Large Language Models ⚡ AI Lesson 2w ago
Surviving SuperIntelligence: 6 Things OpenAI Says We Need To Do Now
AI giveth, but AI also taketh away. How will we survive superintelligence? ChatGPT maker OpenAI has some ideas ...