Core AI

Large Language Models

Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI

24,759 lessons
Skills in this topic
LLM Foundations (beginner): Explain how transformers generate text
Prompt Craft (beginner): Write zero-shot and few-shot prompts
LLM Engineering (intermediate): Call LLM APIs with function/tool use
Fine-tuning LLMs (advanced): Prepare fine-tuning datasets
Multimodal LLMs (advanced): Use GPT-4V / Claude Vision for image understanding

Showing 5,309 reads from curated sources

ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
A Theoretical Analysis of Test-Driven LLM Code Generation
arXiv:2602.06098v2 Announce Type: replace-cross Abstract: Coding assistants are increasingly utilized in test-driven software development, yet the theoretical m
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
CLEAR: A Knowledge-Centric Vessel Trajectory Analysis Platform
arXiv:2602.08482v2 Announce Type: replace-cross Abstract: Vessel trajectory data from the Automatic Identification System (AIS) is used widely in maritime analy
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
CoPE-VideoLM: Leveraging Codec Primitives For Efficient Video Language Modeling
arXiv:2602.13191v2 Announce Type: replace-cross Abstract: Video Language Models (VideoLMs) enable AI systems to understand temporal dynamics in videos. To fit w
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
MALLVI: A Multi-Agent Framework for Integrated Generalized Robotics Manipulation
arXiv:2602.16898v4 Announce Type: replace-cross Abstract: Task planning for robotic manipulation with large language models (LLMs) is an emerging area. Prior ap
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
CCCaption: Dual-Reward Reinforcement Learning for Complete and Correct Image Captioning
arXiv:2602.21655v2 Announce Type: replace-cross Abstract: Image captioning remains a fundamental task for vision language understanding, yet ground-truth superv
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Efficient Encoder-Free Fourier-based 3D Large Multimodal Model
arXiv:2602.23153v2 Announce Type: replace-cross Abstract: Large Multimodal Models (LMMs) that process 3D data typically rely on heavy, pre-trained visual encode
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
AG-VAS: Anchor-Guided Zero-Shot Visual Anomaly Segmentation with Large Multimodal Models
arXiv:2603.01305v2 Announce Type: replace-cross Abstract: Large multimodal models (LMMs) exhibit strong task generalization capabilities, offering new opportuni
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
MetaState: Persistent Working Memory Enhances Reasoning in Discrete Diffusion Language Models
arXiv:2603.01331v2 Announce Type: replace-cross Abstract: Discrete diffusion language models (dLLMs) generate text by iteratively denoising a masked sequence. H
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Towards Privacy-Preserving LLM Inference via Covariant Obfuscation (Technical Report)
arXiv:2603.01499v2 Announce Type: replace-cross Abstract: The rapid development of large language models (LLMs) has driven the widespread adoption of cloud-base
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Thin Keys, Full Values: Reducing KV Cache via Low-Dimensional Attention Selection
arXiv:2603.04427v4 Announce Type: replace-cross Abstract: Standard Transformer attention uses identical dimensionality for queries, keys, and values, yet these
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Nwāchā Munā: A Devanagari Speech Corpus and Proximal Transfer Benchmark for Nepal Bhasha ASR
arXiv:2603.07554v2 Announce Type: replace-cross Abstract: Nepal Bhasha (Newari), an endangered language of the Kathmandu Valley, remains digitally marginalized
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Distributional Regression with Tabular Foundation Models: Evaluating Probabilistic Predictions via Proper Scoring Rules
arXiv:2603.08206v4 Announce Type: replace-cross Abstract: Tabular foundation models such as TabPFN and TabICL already produce full predictive distributions, yet
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Understanding the Use of a Large Language Model-Powered Guide to Make Virtual Reality Accessible for Blind and Low Vision People
arXiv:2603.09964v2 Announce Type: replace-cross Abstract: As social virtual reality (VR) grows more popular, addressing accessibility for blind and low vision (
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
GhanaNLP Parallel Corpora: Comprehensive Multilingual Resources for Low-Resource Ghanaian Languages
arXiv:2603.13793v2 Announce Type: replace-cross Abstract: Low resource languages present unique challenges for natural language processing due to the limited av
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Deconfounded Lifelong Learning for Autonomous Driving via Dynamic Knowledge Spaces
arXiv:2603.14354v2 Announce Type: replace-cross Abstract: End-to-End autonomous driving (E2E-AD) systems face challenges in lifelong learning, including catastr
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
EngGPT2: Sovereign, Efficient and Open Intelligence
arXiv:2603.16430v3 Announce Type: replace-cross Abstract: EngGPT2-16B-A3B is the latest iteration of Engineering Group's Italian LLM and it's built to be a Sove
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
SpecMoE: Spectral Mixture-of-Experts Foundation Model for Cross-Species EEG Decoding
arXiv:2603.16739v2 Announce Type: replace-cross Abstract: Decoding the orchestration of neural activity in electroencephalography (EEG) signals is a central cha
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
Scaling Sim-to-Real Reinforcement Learning for Robot VLAs with Generative 3D Worlds
arXiv:2603.18532v2 Announce Type: replace-cross Abstract: The strong performance of large vision-language models (VLMs) trained with reinforcement learning (RL)
ArXiv cs.AI 🧠 Large Language Models 📄 Paper ⚡ AI Lesson 3w ago
SmaAT-QMix-UNet: A Parameter-Efficient Vector-Quantized UNet for Precipitation Nowcasting
arXiv:2603.21879v2 Announce Type: replace-cross Abstract: Weather forecasting supports critical socioeconomic activities and complements environmental protectio
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
I finally stopped wasting tokens with Universal Claude.md
Key Takeaways Universal Claude.md can cut token use by up to 63%, which means you actually spend way less money using LLMs. Developers are fed up with prompt ha
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
Dev quietly rebels against Claude’s polite padding in AI outputs
Key Takeaways Devs have been quietly frustrated with Claude’s overly polite, wordy answers for a while. Trimming Claude’s output isn’t just about saving tokens,
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
Universal Claude.md lets devs hack verbosity but risks breaking Claude
Key Takeaways Devs are using Universal Claude.md to cut down Claude's wordiness and save on tokens, which means lower API bills. Cutting Claude’s longer answers
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
Open source Claude.md tool just slashed my token costs
Key Takeaways An open-source tool called Claude.md just helped someone cut their AI token costs by 63%, which is wild. Most LLMs like Claude spit out a ton of u
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.
The AI landscape is experiencing unprecedented growth and transformation. This post delves into the key developments shaping the future of artificial intelligen
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
My AI remembered the wrong thing and broke my build. So I built memory governance.
Six weeks ago I gave my AI assistant a memory. It worked. No more re-explaining the project every session. Bugs got fixed once and stayed fixed. Then it follow
ZDNet 🧠 Large Language Models ⚡ AI Lesson 3w ago
This privacy-first chatbot is taking off - here's why and how to try it
Users are flocking to Duck.ai. Is it a reaction to increasing concerns about AI companies and privacy? Here's what you should know.
Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago
The Crow-9b-heretic Model by Crownelius: Here's What You Need to Know
Crow-9B-HERETIC is a 9-billion-parameter language model built on the Qwen 3.5 architecture and distilled from Claude Opus 4.6. The model excels at reasoning tas
Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago
What Is LMEB? Long-Horizon Memory Embedding Benchmark Explained
The benchmark itself isn't the solution. It's the beginning of a new research direction, one forced by reality rather than chosen by preference. Models that loo
Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago
AI Doesn’t Lie - It Reflects: How Fragmented Signals Distort What LLMs Think Your Company Is
AI systems don’t “understand” your company—they reconstruct it from public signals. When those signals are fragmented, outdated, or inconsistent, AI outputs bec
TechCrunch AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
15% of Americans say they’d be willing to work for an AI boss, according to new poll
According to a Quinnipiac University poll, 15% of Americans say they'd be willing to have a job where their direct supervisor was an AI program that assigned ta
TechCrunch AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
Popular AI gateway startup LiteLLM ditches controversial startup Delve
LiteLLM had obtained two security compliance certifications via Delve and fell victim to some horrific credential-stealing malware last week.
Machine Learning Mastery 🧠 Large Language Models ⚡ AI Lesson 3w ago
From Prompt to Prediction: Understanding Prefill, Decode, and the KV Cache in LLMs
This article is divided into three parts; they are: • How Attention Works During Prefill • The Decode Phase of LLM Inference • KV Cache: How to Make Decode More
Forbes Innovation 🧠 Large Language Models ⚡ AI Lesson 3w ago
Apple Just Released iOS 26.5 For Developers, But 1 Major iPhone Feature Is Missing
Another iPhone update has just reached its first developer beta. There was a chance it would include the first glimpse of the brand-new Siri, but so far there’s
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
Five Hundred Copies of the Same Message in Your Agent's Brain
You send your AI agent a message. The upstream model returns a 429 — rate limited, try again later. Your agent framework dutifully retries. And retries. And ret
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
How to Get Cited within AI Searches
4 core pillars to get cited within AI searches You must shift your strategy from traditional SEO to Generative Engine Optimization (GEO). AI engines do not read
Dev.to AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
How We Built an AI Layer That Understands an Entire Agency Workspace (Not Just One Module)
We shipped the AI layer for Kobin today — an agency operating system that replaces Slack, Notion, HubSpot, Linear, and Buffer. This is the technical story of ho
The Next Web AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
How AI’s capital explosion signals opportunity but also reveals a critical need for measurable ROI and meaningful impact
The current wave of investment in artificial intelligence reflects one of the largest capital shifts in modern technology, yet questions around financial return
Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago
I Gave 5 Frontier Models the Same Email Thread. Here's What They Missed.
Five frontier models were given a 31-message email thread. They were asked to tell us what was decided, who owns what, and what changed. None of them got all of
Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago
Lightview Earns a 49 Proof of Usefulness Score by Building an AI-Safe UI Toolkit for LLM and Human Collaboration
Lightview is an open-source UI toolkit designed to enable safe collaboration between large language models and developers. By introducing a sandboxed computatio
Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago
From Pipelines to AI Platforms: How Agentic AI Is Redefining the Role of Data Engineers
This article explains how agentic AI is transforming data engineering by shifting systems from batch-based analytics to real-time, context-driven architectures.
Interconnects 🧠 Large Language Models ⚡ AI Lesson 3w ago
Latest open artifacts (#20): New orgs! New types of models! With Nemotron Super, Sarvam, Cohere Transcribe, & others
New orgs! New types of models! With Nemotron Super, Sarvam, Cohere Transcribe, & others
TechCrunch AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
AI chip startup Rebellions raises $400 million at $2.3B valuation in pre-IPO round
The startup, which is planning to go public later this year, designs chips specifically for AI inference, another challenger to Nvidia's dominance.
Forbes Innovation 🧠 Large Language Models ⚡ AI Lesson 3w ago
Macy's 4.75X Shopping Jump Proves AI Can Move The Top Line
OpenAI abandoned Instant Checkout the same week with conversions at 1/3 retailer site rates. Same AI generation, opposite results: the gap is not about the mode
Import AI 🧠 Large Language Models ⚡ AI Lesson 3w ago
Import AI 451: Political superintelligence; Google's society of minds, and a robot drummer
Are there any genies that can be put back in the bottle?
Towards Data Science 🧠 Large Language Models ⚡ AI Lesson 3w ago
Why Data Scientists Should Care About Quantum Computing
Sara A. Metwalli on the rise of a promising new technology, the effects of LLMs on her work, and more.
Search Engine Journal 🧠 Large Language Models ⚡ AI Lesson 3w ago
Why New Google-Agent May Be A Pivot Related To OpenClaw Trend via @sejournal, @martinibuster
Why Google's new AI user agent may be tied to a shift of resources from Project Mariner to Gemini Agent.
Hackernoon 🧠 Large Language Models ⚡ AI Lesson 3w ago
Textbooks, Not the Internet, Trained This Powerful AI
phi-1.5 is a 1.3B-parameter Transformer trained mainly on synthetic, textbook-quality data. Despite its small size, it matches or beats much larger models on co