[1hr Talk] Intro to Large Language Models

Andrej Karpathy · Beginner · 🧠 Large Language Models · 2y ago
This is a 1-hour general-audience introduction to Large Language Models: the core technical component behind systems like ChatGPT, Claude, and Bard. What they are, where they are headed, comparisons and analogies to present-day operating systems, and some of the security-related challenges of this new computing paradigm. As of November 2023 (this field moves fast!).

Context: This video is based on the slides of a talk I gave recently at the AI Security Summit. The talk was not recorded, but a lot of people came to me afterward and told me they liked it. Seeing as I had already put in one long weekend of work to make the slides, I decided to just tune them a bit, record this round 2 of the talk, and upload it here on YouTube. Pardon the random background; that's my hotel room during the Thanksgiving break.

- Slides as PDF: https://drive.google.com/file/d/1pxx_ZI7O-Nwl7ZLNk5hI3WzAsTLwvNU7/view?usp=share_link (42MB)
- Slides as Keynote: https://drive.google.com/file/d/1FPUpFMiCkMRKPFjhi9MAhby68MHVqe8u/view?usp=share_link (140MB)

A few things I wish I said (I'll add items here as they come up):

- The dreams and hallucinations do not get fixed with finetuning. Finetuning just "directs" the dreams into "helpful assistant dreams". Always be careful with what LLMs tell you, especially if they are telling you something from memory alone. That said, similar to a human, if the LLM used browsing or retrieval and the answer made its way into the "working memory" of its context window, you can trust the LLM a bit more to process that information into the final answer. But TLDR: right now, do not trust what LLMs say or do. For example, in the tools section, I'd always recommend double-checking the math/code the LLM did.
- How does the LLM use a tool like the browser? It emits special words, e.g. |BROWSER|. When the code "above" that is inferencing the LLM detects these words, it captures the output that follows, sends it off to a tool, comes back with the result, and continues the generation.
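The tool-use mechanism described above can be sketched in a few lines. This is a toy illustration, not any real model's API: the `|BROWSER|` / `|/BROWSER|` markers, the `fake_llm` stand-in, and the stubbed tool result are all assumptions made up for the sketch. The point is only that the code *around* the model watches the token stream for special words, runs the tool, and feeds the result back into the context window as ordinary text.

```python
def fake_llm(context):
    # Stand-in for model inference (an assumption, not a real API):
    # returns the next chunk of tokens given the running context.
    if "RESULT:" not in context:
        # The model "decides" to use a tool by emitting special words.
        return ["Spain's", "population:",
                "|BROWSER|", "population of Spain", "|/BROWSER|"]
    # Once the tool result is in the context window, it can finish.
    return ["about", "47", "million"]

def run_tool(query):
    # Stub browser tool; a real system would do an actual web search here.
    return "RESULT: 47 million"

def generate(prompt):
    context, output = prompt, []
    while True:
        tokens = fake_llm(context)
        i, called_tool = 0, False
        while i < len(tokens):
            if tokens[i] == "|BROWSER|":
                # Capture the text between the markers as the tool input.
                query = tokens[i + 1]
                assert tokens[i + 2] == "|/BROWSER|"
                # The tool result enters the "working memory" (the context),
                # then inference resumes.
                context += " " + run_tool(query)
                called_tool = True
                i += 3
            else:
                output.append(tokens[i])
                context += " " + tokens[i]
                i += 1
        if not called_tool:
            return " ".join(output)

print(generate("What is the population of Spain?"))
```

Real systems differ in the details (special *tokens* rather than literal words, structured tool-call formats, sandboxed execution), but the loop shape is the same: detect, dispatch, inject, continue.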
