Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

24,935 lessons

Skills in this topic

5 skills

LLM Foundations

Explain how transformers generate text

Write zero-shot and few-shot prompts

LLM Engineering

Call LLM APIs with function/tool use

Fine-tuning LLMs

Prepare fine-tuning datasets

Multimodal LLMs

Use GPT-4V / Claude Vision for image understanding

Videos 19,459 Reads 5,476

Showing 5,476 reads from curated sources

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Schrödinger's Navigator: Imagining an Ensemble of Futures for Zero-Shot Object Navigation

arXiv:2512.21201v2 Announce Type: replace-cross Abstract: Zero-shot object navigation (ZSON) requires robots to locate target objects in unseen environments wit

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

AI-Generated Code Is Not Reproducible (Yet): An Empirical Study of Dependency Gaps in LLM-Based Coding Agents

arXiv:2512.22387v3 Announce Type: replace-cross Abstract: The rise of Large Language Models (LLMs) as coding agents promises to accelerate software development,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

VLM-CAD: VLM-Optimized Collaborative Agent Design Workflow for Analog Circuit Sizing

arXiv:2601.07315v4 Announce Type: replace-cross Abstract: Vision Language Models (VLMs) have demonstrated remarkable potential in multimodal reasoning, yet they

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Hierarchical Long Video Understanding with Audiovisual Entity Cohesion and Agentic Search

arXiv:2601.13719v2 Announce Type: replace-cross Abstract: Long video understanding presents significant challenges for vision-language models due to extremely l

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Representational Homomorphism Predicts and Improves Compositional Generalization In Transformer Language Model

arXiv:2601.18858v2 Announce Type: replace-cross Abstract: Compositional generalization, the ability to interpret novel combinations of familiar components, remain

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models

arXiv:2601.22060v3 Announce Type: replace-cross Abstract: Multimodal large language models (MLLMs) have achieved remarkable success across a broad range of visi

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Residual Decoding: Mitigating Hallucinations in Large Vision-Language Models via History-Aware Residual Guidance

arXiv:2602.01047v3 Announce Type: replace-cross Abstract: Large Vision-Language Models (LVLMs) can reason from image-text inputs and perform well in various mul

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

FlyPrompt: Brain-Inspired Random-Expanded Routing with Temporal-Ensemble Experts for General Continual Learning

arXiv:2602.01976v3 Announce Type: replace-cross Abstract: General continual learning (GCL) challenges intelligent systems to learn from single-pass, non-station

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Behavioral Consistency Validation for LLM Agents: An Analysis of Trading-Style Switching through Stock-Market Simulation

arXiv:2602.07023v2 Announce Type: replace-cross Abstract: Recent works have increasingly applied Large Language Models (LLMs) as agents in financial stock marke

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Energy-Aware Reinforcement Learning for Robotic Manipulation of Articulated Components in Infrastructure Operation and Maintenance

arXiv:2602.12288v3 Announce Type: replace-cross Abstract: With the growth of intelligent civil infrastructure and smart cities, operation and maintenance (O&M)

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

KDFlow: A User-Friendly and Efficient Knowledge Distillation Framework for Large Language Models

arXiv:2603.01875v2 Announce Type: replace-cross Abstract: Knowledge distillation (KD) is an essential technique to compress large language models (LLMs) into sm

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG

arXiv:2603.03292v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) exhibit high reasoning capacity in medical question-answering, but their

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

When Sensors Fail: Temporal Sequence Models for Robust PPO under Sensor Drift

arXiv:2603.04648v2 Announce Type: replace-cross Abstract: Real-world reinforcement learning systems must operate under distributional drift in their observation

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Human Presence Detection via Wi-Fi Range-Filtered Doppler Spectrum on Commodity Laptops

arXiv:2603.10845v2 Announce Type: replace-cross Abstract: Human Presence Detection (HPD) is key to enable intelligent power management and security features in

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

NCCL EP: Towards a Unified Expert Parallel Communication API for NCCL

arXiv:2603.13606v2 Announce Type: replace-cross Abstract: Mixture-of-Experts (MoE) architectures have become essential for scaling large language models, drivin

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

SARE: Sample-wise Adaptive Reasoning for Training-free Fine-grained Visual Recognition

arXiv:2603.17729v2 Announce Type: replace-cross Abstract: Recent advances in Large Vision-Language Models (LVLMs) have enabled training-free Fine-Grained Visual

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

EVA: Aligning Video World Models with Executable Robot Actions via Inverse Dynamics Rewards

arXiv:2603.17808v2 Announce Type: replace-cross Abstract: Video generative models are increasingly used as world models for robotics, where a model generates a

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Elastic Weight Consolidation Done Right for Continual Learning

arXiv:2603.18596v2 Announce Type: replace-cross Abstract: Weight regularization methods in continual learning (CL) alleviate catastrophic forgetting by assessin

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Mi:dm K 2.5 Pro

arXiv:2603.18788v2 Announce Type: replace-cross Abstract: The evolving LLM landscape requires capabilities beyond simple text generation, prioritizing multi-ste

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Children's Intelligence Tests Pose Challenges for MLLMs? KidGym: A 2D Grid-Based Reasoning Benchmark for MLLMs

arXiv:2603.20209v2 Announce Type: replace-cross Abstract: Multimodal Large Language Models (MLLMs) combine the linguistic strengths of LLMs with the ability to

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

CRoCoDiL: Continuous and Robust Conditioned Diffusion for Language

arXiv:2603.20210v2 Announce Type: replace-cross Abstract: Masked Diffusion Models (MDMs) provide an efficient non-causal alternative to autoregressive generatio

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

An Industrial-Scale Retrieval-Augmented Generation Framework for Requirements Engineering: Empirical Evaluation with Automotive Manufacturing Data

arXiv:2603.20534v2 Announce Type: replace-cross Abstract: Requirements engineering in Industry 4.0 faces critical challenges with heterogeneous, unstructured do

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

MKA: Memory-Keyed Attention for Efficient Long-Context Reasoning

arXiv:2603.20586v2 Announce Type: replace-cross Abstract: As long-context language modeling becomes increasingly important, the cost of maintaining and attendin

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

When Models Judge Themselves: Unsupervised Self-Evolution for Multimodal Reasoning

arXiv:2603.21289v2 Announce Type: replace-cross Abstract: Recent progress in multimodal large language models has led to strong performance on reasoning tasks,

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

DeepXplain: XAI-Guided Autonomous Defense Against Multi-Stage APT Campaigns

arXiv:2603.21296v2 Announce Type: replace-cross Abstract: Advanced Persistent Threats (APTs) are stealthy, multi-stage attacks that require adaptive and timely

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

LLM-Powered Workflow Optimization for Multidisciplinary Software Development: An Automotive Industry Case Study

arXiv:2603.21439v2 Announce Type: replace-cross Abstract: Multidisciplinary Software Development (MSD) requires domain experts and developers to collaborate acr

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

mSFT: Addressing Dataset Mixtures Overfitting Heterogeneously in Multi-task SFT

arXiv:2603.21606v2 Announce Type: replace-cross Abstract: Current language model training commonly applies multi-task Supervised Fine-Tuning (SFT) using a homog

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 1mo ago

Uncertainty-guided Compositional Alignment with Part-to-Whole Semantic Representativeness in Hyperbolic Vision-Language Models

arXiv:2603.22042v2 Announce Type: replace-cross Abstract: While Vision-Language Models (VLMs) have achieved remarkable performance, their Euclidean embeddings r

Forbes Innovation 🧠 Large Language Models ⚡ AI Lesson 1mo ago

OpenAI Discontinues AI Video Gen App Sora

OpenAI has quietly shut down Sora, its short-form AI video app that promised to let anyone create viral videos from text prompts, after just six months.

OpenAI News 🧠 Large Language Models ⚡ AI Lesson 1mo ago

Introducing the OpenAI Safety Bug Bounty program

OpenAI launches a Safety Bug Bounty program to identify AI abuse and safety risks, including agentic vulnerabilities, prompt injection, and data exfiltration.

Forbes Innovation 🧠 Large Language Models ⚡ AI Lesson 1mo ago

AI-Native Subdomains Make AI-Ready Websites Without Technical Overhaul

AI agents struggle with modern, content-heavy websites, which are slow and expensive to crawl. The Markdown standard makes your business discoverable to AI without r

Wired AI 🧠 Large Language Models ⚡ AI Lesson 1mo ago

Pentagon’s ‘Attempt to Cripple’ Anthropic Is Troubling, Judge Says

During a hearing Tuesday, a district court judge questioned the Department of Defense’s motivations for labeling the Claude AI developer a supply-chain risk.

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 1mo ago

Stop Guessing Your API Costs: Track LLM Tokens in Real Time

If you're building with LLMs in 2026, you already know the pain: API costs can spiral fast, and most of the time you have no idea how many tokens you're actuall

TechCrunch AI 🧠 Large Language Models ⚡ AI Lesson 1mo ago

Anthropic hands Claude Code more control, but keeps it on a leash

Anthropic’s new auto mode for Claude Code lets AI execute tasks with fewer approvals, reflecting a broader shift toward more autonomous tools that balance speed

The Next Web AI 🧠 Large Language Models ⚡ AI Lesson 1mo ago

OpenAI open-sources teen safety policies for developers amid mounting lawsuits over ChatGPT deaths

OpenAI has spent the past year fielding lawsuits from the families of young people who died after extended interactions with ChatGPT. Now it is trying to give t

TechCrunch AI 🧠 Large Language Models ⚡ AI Lesson 1mo ago

OpenAI’s plans to make ChatGPT more like Amazon aren’t going so well

OpenAI says it's moving away from Instant Checkout, which allowed users to buy items directly through the ChatGPT interface.

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 1mo ago

Agent Flow: The VS Code Extension That Shows You Exactly What Claude Code Is Doing

Agent Flow is a new VS Code extension that visualizes Claude Code's internal agent behavior, tool calls, and token usage in real-time, turning a black box into

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 1mo ago

Microsoft and NVIDIA Partner to Apply AI Across Nuclear Energy Lifecycle: Permitting, Design, and Operations

Microsoft and NVIDIA are collaborating to apply AI tools—including generative AI for regulatory paperwork and digital twins for simulation—to streamline nuclear

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 1mo ago

I built a Spotify playlist generator with Claude

Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 1mo ago

OpenAI MCP Servers — AI Agents for GPT-4o, o3, DALL-E, and the OpenAI API Platform

At a glance: lastmile-ai/openai-agents-mcp (197 stars) + pierrebrunelle/mcp-server-openai (79 stars). OpenAI has 900+ million weekly ChatGPT users and a $730B v

TechCrunch AI 🧠 Large Language Models ⚡ AI Lesson 1mo ago

OpenAI adds open source tools to help developers build for teen safety

Rather than working from scratch to figure out how to make AI safer for teens, developers can use these policies to fortify what they build.

TechCrunch AI 🧠 Large Language Models ⚡ AI Lesson 1mo ago

Talat’s AI meeting notes stay on your machine, not in the cloud

The subscription-free AI meeting notes app is a local-first twist on notetaking tools like Granola.

AWS Machine Learning 🧠 Large Language Models ⚡ AI Lesson 1mo ago

Accelerating custom entity recognition with Claude tool use in Amazon Bedrock

This post introduces Claude tool use in Amazon Bedrock, which uses the power of large language models (LLMs) to perform dynamic, adaptable entity recognition wit

TechCrunch AI 🧠 Large Language Models ⚡ AI Lesson 1mo ago

Doss raises $55M for AI inventory management that plugs into ERP

Doss's AI-powered inventory management system integrates with existing ERP systems. The Series B round was co-led by Madrona and Premji Invest.

LangChain Blog 🧠 Large Language Models ⚡ AI Lesson 1mo ago

How Moda Builds Production-Grade AI Design Agents with Deep Agents

Moda uses a multi-agent system built on Deep Agents and traced through LangSmith to let non-designers create and iterate on professional-grade visuals.

KDnuggets 🧠 Large Language Models ⚡ AI Lesson 1mo ago

ChatLLM Review: Tired of Multiple AI Tools? Here’s a Smarter All-in-One Alternative

Explore ChatLLM by Abacus AI, an all-in-one AI platform that brings together tools like ChatGPT, Claude, and Midjourney into a single workflow. Learn about its

Wired AI 🧠 Large Language Models ⚡ AI Lesson 1mo ago

Arm Is Now Making Its Own Chips

The chip design firm says Meta, OpenAI, Cerebras, and Cloudflare are among the first customers of its new artificial intelligence hardware.

The Verge 🧠 Large Language Models ⚡ AI Lesson 1mo ago

Arm’s first CPU ever will plug into Meta’s AI data centers later this year

After decades of only licensing its chip designs for others to use, UK-based Arm revealed the first chip it's producing on its own, and the first customer. Dubb