Round 2 - I use CodeLlama 70B vs Mixtral MoE to write code to finetune a model on 16 GPUs 🤯🤯

William Falcon · Beginner · 🧠 Large Language Models · 2y ago
I try to see how well three different LLMs work for writing a Python script to finetune a model on 16 GPUs (multi-node). This video is not edited in any way; it shows a realistic coding workflow without gimmicks or hype. I ask CodeLlama 70B, Mixtral 8x7B (MoE), and Mistral 7B to write a Python program to finetune a computer vision model on the CIFAR10 dataset. You can validate all this for yourself by running the 3 studios for free.

This is an unedited video, so here are some corrections:
- To clarify what "based on Llama 2" means: Mistral 7B tweaks the way Llama 2 does attention but is then pretrained…
Watch on YouTube ↗

Chapters (26)

Introduction
0:40 Run CodeLlama 70B
1:13 Run Mixtral 8x7B (MoE)
1:34 Run Mistral 7B
1:47 How to get a GPU
2:08 What is a Lightning Studio
3:47 Basic CodeLlama 70B test
4:20 Basics of model monitoring
4:39 Connect a local VSCode
6:20 Basic Mixtral MoE coding test
8:46 Create the prompt to generate the ML code
9:04 Connect an S3 bucket
10:10 Full prompt for ML code
13:16 Prompt Mistral 7B
13:50 Debug the finetuning script
14:16 About the Lightning Trainer
14:56 Sanity check the finetuning script
15:30 Monitor with Tensorboard
16:20 About model RAM and model size
16:44 A quick TL;DR about profiling a model
17:40 Scale to multi-node (16 GPUs)
19:10 CodeLlama 70B results
20:00 About finetuning
22:10 Monitoring the 16 GPUs
22:54 CodeLlama 70B code results
25:35 Look at multi-node logs, weights