How to Build an LLM from Scratch | An Overview

Shaw Talebi · Advanced · Large Language Models · 2y ago
This is the 6th video in a series on using large language models (LLMs) in practice. Here, I review key aspects of developing a foundation LLM based on the development of models such as GPT-3, Llama, Falcon, and beyond.

More Resources:
Series Playlist: https://www.youtube.com/playlist?list=PLz-ep5RbHosU2hnz5ejezwaYpdMutMVB0
Read more: https://medium.com/towards-data-science/how-to-build-an-llm-from-scratch-8c477768f1f9?sk=18c351c5cae9ac89df682dd14736a9f3

[1] BloombergGPT: https://arxiv.org/pdf/2303.17564.pdf
[2] Llama 2: https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/
[3] LLM Energy Costs: https://www.statista.com/statistics/1384401/energy-use-when-training-llm-models/
[4] arXiv:2005.14165 [cs.CL]
[5] Falcon 180b Blog: https://huggingface.co/blog/falcon-180b
[6] arXiv:2101.00027 [cs.CL]
[7] Alpaca Repo: https://github.com/gururise/AlpacaDataCleaned
[8] arXiv:2303.18223 [cs.CL]
[9] arXiv:2112.11446 [cs.CL]
[10] arXiv:1508.07909 [cs.CL]
[11] SentencePiece: https://github.com/google/sentencepiece/tree/master
[12] Tokenizers Doc: https://huggingface.co/docs/tokenizers/quicktour
[13] arXiv:1706.03762 [cs.CL]
[14] Andrej Karpathy Lecture: https://www.youtube.com/watch?v=kCc8FmEb1nY&t=5307s
[15] Hugging Face NLP Course: https://huggingface.co/learn/nlp-course/chapter1/7?fw=pt
[16] arXiv:1810.04805 [cs.CL]
[17] arXiv:1910.13461 [cs.CL]
[18] arXiv:1603.05027 [cs.CV]
[19] arXiv:1607.06450 [stat.ML]
[20] arXiv:1803.02155 [cs.CL]
[21] arXiv:2203.15556 [cs.CL]
[22] Trained with Mixed Precision Nvidia: https://docs.nvidia.com/deeplearning/performance/mixed-precision-training/index.html
[23] DeepSpeed Doc: https://www.deepspeed.ai/training/
[24] https://paperswithcode.com/method/weight-decay
[25] https://towardsdatascience.com/what-is-gradient-clipping-b8e815cdfb48
[26] arXiv:2001.083