Core AI

Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

24,754 lessons

Skills in this topic

5 skills

LLM Foundations

Explain how transformers generate text

Write zero-shot and few-shot prompts

LLM Engineering

Call LLM APIs with function/tool use

Fine-tuning LLMs

Prepare fine-tuning datasets

Multimodal LLMs

Use GPT-4V / Claude Vision for image understanding

Videos 19,450 Reads 5,304

Showing 5,304 reads from curated sources

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Critic-Free Deep Reinforcement Learning for Maritime Coverage Path Planning on Irregular Hexagonal Grids

arXiv:2603.28385v1 Announce Type: cross Abstract: Maritime surveillance missions, such as search and rescue and environmental monitoring, rely on the efficient

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

EdgeDiT: Hardware-Aware Diffusion Transformers for Efficient On-Device Image Generation

arXiv:2603.28405v1 Announce Type: cross Abstract: Diffusion Transformers (DiT) have established a new state-of-the-art in high-fidelity image synthesis; however

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Evolutionary Discovery of Reinforcement Learning Algorithms via Large Language Models

arXiv:2603.28416v1 Announce Type: cross Abstract: Reinforcement learning algorithms are defined by their learning update rules, which are typically hand-designe

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Spectral Higher-Order Neural Networks

arXiv:2603.28420v1 Announce Type: cross Abstract: Neural networks are fundamental tools of modern machine learning. The standard paradigm assumes binary interac

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

FeDMRA: Federated Incremental Learning with Dynamic Memory Replay Allocation

arXiv:2603.28455v1 Announce Type: cross Abstract: In federated healthcare systems, Federated Class-Incremental Learning (FCIL) has emerged as a key paradigm, en

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention

arXiv:2603.28458v1 Announce Type: cross Abstract: Token-level sparse attention mechanisms, exemplified by DeepSeek Sparse Attention (DSA), achieve fine-grained

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Courtroom-Style Multi-Agent Debate with Progressive RAG and Role-Switching for Controversial Claim Verification

arXiv:2603.28488v1 Announce Type: cross Abstract: Large language models (LLMs) remain unreliable for high-stakes claim verification due to hallucinations and sh

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Next-Token Prediction and Regret Minimization

arXiv:2603.28499v1 Announce Type: cross Abstract: We consider the question of how to employ next-token prediction algorithms in adversarial online decision-maki

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

The Unreasonable Effectiveness of Scaling Laws in AI

arXiv:2603.28507v1 Announce Type: cross Abstract: Classical AI scaling laws, especially for pre-training, describe how training loss decreases with compute in a

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Hydra: Unifying Document Retrieval and Generation in a Single Vision-Language Model

arXiv:2603.28554v1 Announce Type: cross Abstract: Visual document understanding typically requires separate retrieval and generation models, doubling memory and

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Domain-Invariant Prompt Learning for Vision-Language Models

arXiv:2603.28555v1 Announce Type: cross Abstract: Large pre-trained vision-language models like CLIP have transformed computer vision by aligning images and tex

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Fine-Tuning Large Language Models for Cooperative Tactical Deconfliction of Small Unmanned Aerial Systems

arXiv:2603.28561v1 Announce Type: cross Abstract: The growing deployment of small Unmanned Aerial Systems (sUASs) in low-altitude airspaces has increased the ne

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

CirrusBench: Evaluating LLM-based Agents Beyond Correctness in Real-World Cloud Service Environments

arXiv:2603.28569v1 Announce Type: cross Abstract: The increasing agentic capabilities of Large Language Models (LLMs) have enabled their deployment in real-worl

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Learning Partial Action Replacement in Offline MARL

arXiv:2603.28573v1 Announce Type: cross Abstract: Offline multi-agent reinforcement learning (MARL) faces a critical challenge: the joint action space grows exp

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

ChemCLIP: Bridging Organic and Inorganic Anticancer Compounds Through Contrastive Learning

arXiv:2603.28575v1 Announce Type: cross Abstract: The discovery of anticancer therapeutics has traditionally treated organic small molecules and metal-based coo

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Moving Beyond Review: Applying Language Models to Planning and Translation in Reflection

arXiv:2603.28596v1 Announce Type: cross Abstract: Reflective writing is known to support the development of students' metacognitive skills, yet learners often s

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

ResAdapt: Adaptive Resolution for Efficient Multimodal Reasoning

arXiv:2603.28610v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) achieve stronger visual understanding by scaling input fidelity, yet

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Trust-Aware Routing for Distributed Generative AI Inference at the Edge

arXiv:2603.28622v1 Announce Type: cross Abstract: Emerging deployments of Generative AI increasingly execute inference across decentralized and heterogeneous ed

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

AMIGO: Agentic Multi-Image Grounding Oracle Benchmark

arXiv:2603.28662v1 Announce Type: cross Abstract: Agentic vision-language models increasingly act through extended interactions, but most evaluations still focu

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

AdaptToken: Entropy-based Adaptive Token Selection for MLLM Long Video Understanding

arXiv:2603.28696v1 Announce Type: cross Abstract: Long video understanding remains challenging for Multi-modal Large Language Models (MLLMs) due to high memory

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Stepwise Credit Assignment for GRPO on Flow-Matching Models

arXiv:2603.28718v1 Announce Type: cross Abstract: Flow-GRPO successfully applies reinforcement learning to flow models, but uses uniform credit assignment acros

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

ParaSpeechCLAP: A Dual-Encoder Speech-Text Model for Rich Stylistic Language-Audio Pretraining

arXiv:2603.28737v1 Announce Type: cross Abstract: We introduce ParaSpeechCLAP, a dual-encoder contrastive model that maps speech and text style captions into a

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

On-the-fly Repulsion in the Contextual Space for Rich Diversity in Diffusion Transformers

arXiv:2603.28762v1 Announce Type: cross Abstract: Modern Text-to-Image (T2I) diffusion models have achieved remarkable semantic alignment, yet they often suffer

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Retrieving Classes of Causal Orders with Inconsistent Knowledge Bases

arXiv:2412.14019v4 Announce Type: replace Abstract: Traditional causal discovery methods often depend on strong, untestable assumptions, making them unreliable

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Synergizing Large Language Models and Task-specific Models for Time Series Anomaly Detection

arXiv:2501.05675v5 Announce Type: replace Abstract: In anomaly detection, methods based on large language models (LLMs) can incorporate expert knowledge by read

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Inspire or Predict? Exploring New Paradigms in Assisting Classical Planners with Large Language Models

arXiv:2508.11524v2 Announce Type: replace Abstract: Addressing large-scale planning problems has become one of the central challenges in the planning community,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Your Models Have Thought Enough: Training Large Reasoning Models to Stop Overthinking

arXiv:2509.23392v3 Announce Type: replace Abstract: Large Reasoning Models (LRMs) have achieved impressive performance on challenging tasks, yet their deep reas

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Searching Meta Reasoning Skeleton to Guide LLM Reasoning

arXiv:2510.04116v3 Announce Type: replace Abstract: Meta reasoning behaviors work as a skeleton to guide large language model (LLM) reasoning, thus help to impr

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

ShortcutBreaker: Low-Rank Noisy Bottleneck and Frequency Filtering Block for Multi-Class Unsupervised Anomaly Detection

arXiv:2510.18342v2 Announce Type: replace Abstract: Multi-class unsupervised anomaly detection (MUAD) has garnered growing research interest, as it seeks to dev

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

From Questions to Queries: An AI-powered Multi-Agent Framework for Spatial Text-to-SQL

arXiv:2510.21045v3 Announce Type: replace Abstract: The complexity of SQL and the spatial semantics of PostGIS create barriers for non-experts working with spat

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

FlipVQA: Scaling Multi-modal Instruction Tuning via Textbook-to-Knowledge Synthesis

arXiv:2511.16216v2 Announce Type: replace Abstract: Textbooks are among the richest repositories of human-verified reasoning knowledge, yet their complex layout

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

An Attention Mechanism for Robust Multimodal Integration in a Global Workspace Architecture

arXiv:2602.08597v2 Announce Type: replace Abstract: Robust multimodal systems must remain effective when some modalities are noisy, degraded, or unreliable. Exi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

AgentLeak: A Full-Stack Benchmark for Privacy Leakage in Multi-Agent LLM Systems

arXiv:2602.11510v2 Announce Type: replace Abstract: Multi-agent Large Language Model (LLM) systems create privacy risks that current benchmarks cannot measure.

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Evaluating and Understanding Scheming Propensity in LLM Agents

arXiv:2603.01608v2 Announce Type: replace Abstract: As frontier language models are increasingly deployed as autonomous agents pursuing complex, long-term objec

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Seed1.8 Model Card: Towards Generalized Real-World Agency

arXiv:2603.20633v2 Announce Type: replace Abstract: We present Seed1.8, a foundation model aimed at generalized real-world agency: going beyond single-turn pred

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Silicon Bureaucracy and AI Test-Oriented Education: Contamination Sensitivity and Score Confidence in LLM Benchmarks

arXiv:2603.21636v2 Announce Type: replace Abstract: Public benchmarks increasingly govern how large language models (LLMs) are ranked, selected, and deployed. W

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Continual Graph Learning: A Survey

arXiv:2301.12230v2 Announce Type: replace-cross Abstract: Continual Graph Learning (CGL) enables models to incrementally learn from streaming graph-structured d

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Deep Neural Networks: A Formulation Via Non-Archimedean Analysis

arXiv:2402.00094v2 Announce Type: replace-cross Abstract: We introduce a new class of deep neural networks (DNNs) with multilayered tree-like architectures. The

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Learning the Model While Learning Q: Finite-Time Sample Complexity of Online SyncMBQ

arXiv:2402.11877v2 Announce Type: replace-cross Abstract: Reinforcement learning has witnessed significant advancements, particularly with the emergence of mode

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Bidirectional Multimodal Prompt Learning with Scale-Aware Training for Few-Shot Multi-Class Anomaly Detection

arXiv:2408.13516v2 Announce Type: replace-cross Abstract: Few-shot multi-class anomaly detection is crucial in real industrial settings, where only a few normal

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Explainable AI needs formalization

arXiv:2409.14590v5 Announce Type: replace-cross Abstract: The field of "explainable artificial intelligence" (XAI) seemingly addresses the desire that decisions

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Recent Advances of Multimodal Continual Learning: A Comprehensive Survey

arXiv:2410.05352v3 Announce Type: replace-cross Abstract: Continual learning (CL) aims to empower machine learning models to learn continually from new data, wh

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Gradient Compression Beyond Low-Rank: Wavelet Subspaces Compact Optimizer States

arXiv:2501.07237v4 Announce Type: replace-cross Abstract: Large language models (LLMs) have shown impressive performance across a range of natural language proc

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

A Survey of Zero-Knowledge Proof Based Verifiable Machine Learning

arXiv:2502.18535v2 Announce Type: replace-cross Abstract: Machine learning is increasingly deployed through outsourced and cloud-based pipelines, which improve

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Shifting Perspectives: Steering Vectors for Robust Bias Mitigation in LLMs

arXiv:2503.05371v3 Announce Type: replace-cross Abstract: We present a novel approach to bias mitigation in large language models (LLMs) by applying steering ve

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

BRIDGE: Benchmarking Large Language Models for Understanding Real-world Clinical Practice Text

arXiv:2504.19467v4 Announce Type: replace-cross Abstract: Large language models (LLMs) hold great promise for medical applications and are evolving rapidly, wit

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Beyond Recognition: Evaluating Visual Perspective Taking in Vision Language Models

arXiv:2505.03821v2 Announce Type: replace-cross Abstract: We investigate the ability of Vision Language Models (VLMs) to perform visual perspective taking using

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago

Self-Bootstrapping Automated Program Repair: Using LLMs to Generate and Evaluate Synthetic Training Data for Bug Repair

arXiv:2505.07372v2 Announce Type: replace-cross Abstract: This paper presents a novel methodology for enhancing Automated Program Repair (APR) through synthetic