Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

42,291 lessons
Skills in this topic
LLM Foundations (beginner): Explain how transformers generate text
Prompt Craft (beginner): Write zero-shot and few-shot prompts
LLM Engineering (intermediate): Call LLM APIs with function/tool use
Fine-tuning LLMs (advanced): Prepare fine-tuning datasets
Multimodal LLMs (advanced): Use GPT-4V / Claude Vision for image understanding
All Reads (20,828) | Articles (9,979) | Blog Posts (3,670) | Tutorials (2,144) | Research Papers (4,755) | News (280)
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Ask the World Before Acting: Budgeted Environment Probing for World-Model Calibration
arXiv:2606.31422v1 Announce Type: new Abstract: Long-horizon language agents do not only choose actions; they carry a private model of the world from one decisi
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
CDR-Bench: Evaluating Faithful Execution of Compositional, Order-Sensitive Data Refinement Recipes
arXiv:2606.31435v1 Announce Type: new Abstract: Data refinement involves executing multi-step recipes over evolving text states, where both composition and exec
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
CSTrader: A Testbed for Language-Grounded Trading in a Community-Driven Virtual Asset Market
arXiv:2606.31461v1 Announce Type: new Abstract: Niche asset markets, such as Counter-Strike 2 (CS2) weapon skins, are small, volatile, and heavily driven by com
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Surprise as a Signal for Plasticity and Metacognition
arXiv:2606.31495v1 Announce Type: new Abstract: We study a single idea across two settings: that a prediction-error signal, computed by a small predictor over t
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Modality-Driven Search with Holistic Trace Judging for ARC-AGI-2
arXiv:2606.31543v1 Announce Type: new Abstract: Large language models can produce fluent, internally coherent reasoning traces for abstract reasoning tasks whil
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
ACE: Pluggable Adaptive Context Elasticizer across Agents
arXiv:2606.31564v1 Announce Type: new Abstract: The increasing complexity of agentic tasks has led to rapidly growing trajectory lengths, which poses significan
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Which Tokens Matter? Adaptive Token Selection for RLVR with the Relative Surprisal Index
arXiv:2606.31575v1 Announce Type: new Abstract: Reinforcement learning (RL) has become a powerful tool for propelling Large Language Models (LLMs) beyond imitat
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Scientific Explanations in Health Sciences: Causality, Trust, and Epistemic Adequacy
arXiv:2606.31616v1 Announce Type: new Abstract: Medical Artificial Intelligence (AI) is widely expected to transform clinical practice, yet the decision-making
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Think in English, Answer in Korean: Efficient Adaptation of Multilingual Tool-Using Agents
arXiv:2606.31648v1 Announce Type: new Abstract: We present LuckyStar 111B, a 111B-parameter hybrid reasoning model developed through a collaboration between Coh
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
FARS: A Fully Automated Research System Deployed at Scale
arXiv:2606.31651v1 Announce Type: new Abstract: Recent automated research systems show that language-model agents can generate hypotheses, run experiments, and
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Arena-T2I Hard: Benchmarking and Improving Faithfulness with Dependency-Aware Checklist
arXiv:2606.31711v1 Announce Type: new Abstract: Faithfulness -- how precisely a generated image aligns with its prompt -- is increasingly central to the real-wo
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Evo-PI: Aligning Medical Reasoning via Evolving Principle-Guided Supervision
arXiv:2606.31800v1 Announce Type: new Abstract: Despite recent progress, the reasoning capabilities of large multimodal language models (MLLMs) remain fundament
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
RAISE: LLM-based Automated Heuristic Design with Robust Adversary Instance Search
arXiv:2606.31801v1 Announce Type: new Abstract: Automated Heuristic Design (AHD) with Large Language Models (LLMs) has shown remarkable progress in discovering
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Large Databases Need Small, Open-Weight Language Models
arXiv:2606.31808v1 Announce Type: new Abstract: Language model systems built around proprietary APIs often operate on a token-based cost model. This becomes pro
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Harnessing Textual Refusal Directions for Multimodal Safety
arXiv:2606.31876v1 Announce Type: new Abstract: To improve safety in Large Language Models (LLMs) we can either perform post-training alignment or exploit refus
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Self-Study Reconsidered: The Hidden Fragility of Learning from Self-Generated QA
arXiv:2606.32002v1 Announce Type: new Abstract: Language models are increasingly taught from synthetic question-answer (QA) supervision: a model generates ques
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
The Consistency Dilemma in LLMs: Generator-Evaluator Agreement and Vulnerability to Mistakes
arXiv:2606.30653v1 Announce Type: cross Abstract: Large language models are increasingly deployed in agentic pipelines that depend on the model evaluating its o
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
ELEVATE: Designing Human-Centered GenAI Virtual Tutors for Scalable and Inclusive Education
arXiv:2606.30662v1 Announce Type: cross Abstract: The advent of Generative Artificial Intelligence (GenAI), and in particular Large Language Models (LLMs), is r
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Emergent Culture in Minimal LLM Systems
arXiv:2606.30668v1 Announce Type: cross Abstract: What happens when LLM agents operate with no context outside a turn, minimal prompting, and simple tools? Insp
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Listening Between the Lines: Joint Learning of ASR Embeddings and LLM-Augmented Linguistics for Dementia Detection
arXiv:2606.30675v1 Announce Type: cross Abstract: Early detection of dementia through speech analysis offers a non-invasive screening alternative, but capturing
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
ALM2Vec: Learning Audio Embeddings for Universal Audio Retrieval with Large Audio-Language Models
arXiv:2606.30682v1 Announce Type: cross Abstract: Recent advances in language-audio retrieval have been largely driven by contrastive dual-encoder architecture
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
A Coherence Law for Trainability in Noisy Equivariant Quantum Neural Networks
arXiv:2606.30688v1 Announce Type: cross Abstract: Symmetry provides a quantum neural network structure, but on its own it does not keep the network trainable on
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Citation Discipline in Spec-Driven Development: A Cross-Model Empirical Study of Output Determinism and Automated Hallucination Detection in LLM-Generated Code
arXiv:2606.30689v1 Announce Type: cross Abstract: Spec-Driven Development (SDD) frameworks guide Large Language Model (LLM)-powered code generation through form
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
BEST-RQ-2: Contextualize-Then-Predict, a Two-Step Approach for Self-Supervised Audio Representations
arXiv:2606.30700v1 Announce Type: cross Abstract: Self-supervised learning enables audio representations that transfer across domains and tasks. We present BEST
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Why Do Few-Step Text Latents Fail When Image Latents Work? Non-Commitment at Sharp Categorical Readouts
arXiv:2606.30705v1 Announce Type: cross Abstract: Deterministic few-step generation succeeds on continuous image latents but collapses to incoherent text on con
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Hierarchical Global Attention (HGA)
arXiv:2606.30709v1 Announce Type: cross Abstract: Hierarchical Global Attention (HGA) is a drop-in replacement for dense causal attention in pretrained long-con
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
A Single Rewrite Suffices: Empirical Lessons from Production Skill Description Optimization
arXiv:2606.30775v1 Announce Type: cross Abstract: Enterprise AI agents route user queries to specialized skills by matching queries against natural language ski
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Detecting Audio Deepfakes on the Edge: Lightweight SSL-Based Detection in a Browser Plugin
arXiv:2606.30780v1 Announce Type: cross Abstract: Audio deepfakes are a growing challenge for the general public, as well as for journalists and fact-checkers.
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Security-Fidelity Tradeoffs: The Hidden Cost of Prompt Injection Defense
arXiv:2606.30783v1 Announce Type: cross Abstract: We identify a security-fidelity tradeoff in defending LLMs against indirect prompt injection: defenses resist
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Indi-RomCoM: Code-Mixed Benchmark for Evaluating LLMs on Romanized Indic-English Instructions
arXiv:2606.30790v1 Announce Type: cross Abstract: Romanized Code Mixing (RCM), where bilingual speakers fluidly blend local languages with English in Roman scri
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
When transformers learn "impossible" languages, what do they learn?
arXiv:2606.30815v1 Announce Type: cross Abstract: Recent work suggests that transformer language models show a bias towards human languages over unnatural ("imp
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
AI-Generated PowerShell Malware: An Experimental Framework and Dataset
arXiv:2606.30819v1 Announce Type: cross Abstract: Generative AI has emerged as a significant cybersecurity threat, with several recent attack campaigns leveragi
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Test-Time Verification for Text-to-SQL via Outcome Reward Models
arXiv:2606.30851v1 Announce Type: cross Abstract: Improving the reliability of large language models (LLMs) at inference time is a central challenge in structur
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
The Label Imitation Game: Turing Test Network for Zero-Shot Pseudo-Label Pruning
arXiv:2606.30875v1 Announce Type: cross Abstract: Foundation model pseudo-labeling - labeling data strictly via zero-shot inference - enables massive scale, but
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Training Therapeutic Judges and Multi-Agent Systems for Human-Aligned Mental Health Support
arXiv:2606.30887v1 Announce Type: cross Abstract: Large language models show promise for mental health support, yet therapeutic quality improves only when evalu
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Curvature-Guided Module Localization for Low-Rank Detoxification of Backdoored Large Language Models
arXiv:2606.30899v1 Announce Type: cross Abstract: Backdoor attacks pose a serious threat to large language models (LLMs) by causing otherwise benign systems to
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
How Human Feedback Shapes AI-generated Community Notes
arXiv:2606.30905v1 Announce Type: cross Abstract: Community Notes, a bridging-based crowd-sourced fact-checking system, has emerged as a new mechanism for moder
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Behavior Cloning is Not All You Need: The Optimality of On-Policy Distillation for Noisy Expert Feedback
arXiv:2606.30923v1 Announce Type: cross Abstract: Imitation Learning is a natural framework for learning in sequential decision-making systems and has emerged a
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Learning Where to Look: A Reinforcement Learning Framework for Robust Micro-Ultrasound Prostate Cancer Detection
arXiv:2606.30951v1 Announce Type: cross Abstract: Micro-ultrasound ($\mu$US) is a new, emerging, and promising imaging modality for prostate cancer (PCa) detect
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Loc2Repair: A Framework for Evaluating the Impact of File-Level Issue Localization in Repo-Level LLM Repair
arXiv:2606.30963v1 Announce Type: cross Abstract: Repository-grounded automated repair is often reported as a single end-to-end capability, which hides distinct
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Wait, am I Being Fair? Characterizing Deductive Stereotyping and Mitigating It with Fair-GCG
arXiv:2606.30989v1 Announce Type: cross Abstract: Warning: This paper contains several toxic and offensive statements. While reasoning generally improves fairne
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
OTCache: Optimal Transport for Geometry-Aware Caching in Diffusion Models
arXiv:2606.31026v1 Announce Type: cross Abstract: We propose OTCache, a training-free framework for accelerating diffusion sampling via caching schedule predict
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
LLM-Driven Personalities for Decision Making in Emergency Simulations
arXiv:2606.31038v1 Announce Type: cross Abstract: For virtual humans to appear believable, they must exhibit agency and spatial awareness while interacting with
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
Knowledge Distillation from Large Reasoning Models to Compact Student Models: A Case Study on the John O Bryan Mathematics Competition
arXiv:2606.31048v1 Announce Type: cross Abstract: This paper investigates knowledge distillation from a large reasoning model (DeepSeek-R1) to a compact student
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
ADAPT: Attention Dynamics Alignment with Preference Tuning for Faithful MLLMs
arXiv:2606.31054v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) are critically hampered by hallucination, generating content inconsis
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
When Reranking Hurts: Uncertainty-Based Gating for Few-Shot Reranking
arXiv:2606.31087v1 Announce Type: cross Abstract: Few-shot selection typically assumes that reranking retrieved examples always improves performance. We challen
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
LLM-Powered Interactive Robotic Action Synthesis from Multimodal Speech, Gestures, and Music
arXiv:2606.31158v1 Announce Type: cross Abstract: The quest for intuitive and natural human-robot interaction (HRI) remains a significant challenge in robotics.
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 17h ago
ComplianceGate: Classifier-Gated Multi-Tier LLM Routing for Inference in Regulated Industries
arXiv:2606.31163v1 Announce Type: cross Abstract: Large language models deployed in regulated industries operate under two constraints: compliance enforcement a