Running "CODE LLAMA" on Free Colab [Full Code Inside]!!!

1littlecoder · Advanced ·🧠 Large Language Models ·2y ago

Skills: LLM Foundations90%LLM Engineering85%Prompt Craft80%Fine-tuning LLMs70%

Key Takeaways

The video demonstrates how to run the CODE LLAMA model with 7 billion parameters on Google Colab, utilizing the Transformers and Accelerate libraries for text generation and fine-tuning, with a focus on pipeline construction and inference.

Full Transcript

nope I didn't write this code I don't know rejects much but I can use code Lama and do this rejects in this video I'm going to show you how you can run core llama 7 billion parameter model the instruct fine-tuned model on your free Google collaboration let's get started with the video all you need is use this Google Cloud notebook that I'm going to give in the YouTube description click and get started I'm going to explain you line by line the first thing that we need to make sure is we need to make sure whether we have got a GPU so you can make sure it quite easily by just typing Nvidia SMI and they're running the code click runtime Click Change runtime and then you have got the T4 GPU I mean if you are using this Google app notebook most likely it should be already running on GPU but still it is a good practice to make sure that you have GPU the next thing is we need to install the latest Transformers accelerate these are the two libraries that we need to install Transformers and accelerate once we have installed transformation Accelerate from Transformers Import Auto tokenizer input Transformers input torch then the next thing is you need to specify the model this code is especially written for the instruct fine tune model but if you want to use the base model you can still use it I've just commented it and I'll tell you what are the changes that you have to do once you specify the model then you are going to use the tokenizer from pre-trained model and also you're going to build the pipeline if you're not familiar with pipeline pipeline is one of the easiest way the highly abstracted way using Transformers to build a text generation use case so pipeline is also available for a lot of other things like X classification image classification but in this particular case we are going to use pipeline for the particular task which is text generation and we're going to use the same model and we're going to use the model with float 16. here there is no quantization happening just for Clarity on the device map Auto will help the accelerate library to manage memory between CPU and GPU memory this process will take couple of minutes in my case it took two minutes which will download all the required models you can also see that it is a quite a bit of like 12 to 13 gigs of model so just make sure that you have that much of memory for you to run it you're running to some Google collab this should ideally work fine without any issue then the next thing is we are going to give a system prompt and then we are going to give a user question and we're going to create that in the template that Lama model usually accepts that is in this particular format we have got the system prompt and also the user message once you have the system prompt under user message specified in this particular fan part then we're going to use the pipeline to create this particular final step which will take the prompt and also take higher extra other hyper parameters and finally create the sequences which we are finally going to print it out and in this particular case you can play with temperature also you can play with maximum length just to make sure what kind of output on how long do you want it to be once you have specified this all you have to do is now you have to print the sequences and you will have the code ready so let me go and show the question that I've asked I'd also show the output and then we are going to try this ourselves the first question that I've asked is provide answers in Python that is a system prompt and the user messages write a function that detects a pattern that matches the style of 23 0 1 2023 from the given text so what I'm expecting the model to do is I'm expecting the model to create a rejects not necessarily hard code the state but something that looks like this and the code is quite fine I mean it generates a lot more than that but the code is working quite fine so I can just copy this entire thing paste it a Google grab notebook and then you can see that it works completely fine even if I detect the if I change the pattern something like this and run this it works pretty much fine because it manages to write the rejects properly so let's go ahead and ask a different question in this case write a function that detects a pattern I'm going to just give this as an option now so it should be like this uh Jan and I'm going to just give this I'm going to run this code first we have specified the system prompt user prompt then run this code this will take more than few seconds and then finally I'm going to rank this once I run this this should I really given me the code that I can use this in Google collab notebook further sections to practically run and then see if Lama is hallucinating or if it does the job properly meanwhile another thing that I wanted to highlight here is that if you're using not using the InStep model if you are using the bills model then the prompt template that you are going to use here should be slightly different this prompt template is specifically given as a an instruct prompt template if you're using the base model you will use a different method like a prefix and suffix method if you're interested in that I can make a separate tutorial all about how to use that particular model as well but I'm also trying to put together like a web application that will be easier for you to use anyways now this has been done now I'm going to run the final one the that place where we're going to print the result so I write a function that detects a pattern that matches the style of this from the given text now we have got an answer I can copy that come back down and I can paste this after I paste this I can use the function basically and I can paste this and I can run this just to make sure whether it works it didn't work because you know we are still sticking with the world what but let's say in this case I'm going to say Feb the same date Feb and then do this either it should work fine because as you can see the rejects it looks for two digit three letters and then also four digits I think this works completely fine and that is the power of code llama seven billion instant model I don't want to stop with only python even if I don't understand JavaScript necessarily I'm going to go and then say typescript and ask it to do the same thing so I'm going to give the same thing but instead of asking it to do with JavaScript instead of asking you to do it with python I'm asking you to do its typescript and then again this entire thing runs so the system prompt user prompt grows into the prompt template that prompt template goes into the pipeline and the pipeline basically runs the model for inference once that is done we're going to finally print the result unfortunately I do not have enough skill set to evaluate the output but let's see if the rejects matches so meanwhile all you have to do is you have to just go click the link in the YouTube description which I'll give the collab notebook and get started with this little new llama world the model has been run successfully and then print the sequences and now all we change this we change the system prompt typescript and you can see that in typescript it is looking for the constant pattern I think in the typescript probably it didn't do proper good job because it is hard coded it here so maybe we need more prompt Engineering in that particular case it was really good to test so overall I hope this video was helpful to you in learning how to run code Lama 7 billion parameter model on Google collab it's a very simple set of instructions install the latest Transformers from GitHub install accelerate which will help us in memory management and then from Transformers Loadout organizer and improved Transformers which we're going to use for pipeline then input torch which is something we're going to use for the data type and then specify the model and then downward the tokenizer specify the pipeline which is a highly abstracted class for us to do certain tasks using hugging phase Transformers library and in this particular case we are doing text generation this is not a quantized model it's with fluid 16 precision and then provide the system prompt the user prompt and create the prompt template and send the prompt template to Pipeline with the certain hyper parameters and finally get the result and get it printed and you are going to experience a new AI writing computer program for you I hope this was helpful to you if you have any question let me know in the comment section otherwise make sure you check out the description for this Google collab notebook and try it out yourself and let me know the comments what do you feel about this latest llama model

Original Description

Code Llama 7B Instruct Google Colab https://colab.research.google.com/drive/1lyEj1SRw0B9I2UUI2HOrtiJ_fjvbXtA2?usp=sharing ❤️ If you want to support the channel ❤️ Support here: Patreon - https://www.patreon.com/1littlecoder/ Ko-Fi - https://ko-fi.com/1littlecoder

Watch on YouTube ↗ (saves to browser)

Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from 1littlecoder · 1littlecoder · 0 of 60

← Previous Next →

How to create your Free Data Science Blog on Github with Fastpages from Fastai

How to create your Free Data Science Blog on Github with Fastpages from Fastai

Making Interactive Matplotlib Plots for Data Science Visualizations on Jupyter (Python)

Making Interactive Matplotlib Plots for Data Science Visualizations on Jupyter (Python)

Create your first Data Science Web App using R Shiny

Create your first Data Science Web App using R Shiny

How to create a Reproducible Example in R using reprex

How to create a Reproducible Example in R using reprex

No Code Visualization using esquisse with Tableau-like Drag and Drop GUI in R

No Code Visualization using esquisse with Tableau-like Drag and Drop GUI in R

Scrape HTML Table using rvest and Process them for insights using tidyverse in R

Scrape HTML Table using rvest and Process them for insights using tidyverse in R

Google Teachable Machine Learning Build No Code AI solution

Google Teachable Machine Learning Build No Code AI solution

Create meaningful fake tidy datasets in R using fakir [#rstats Package]

Create meaningful fake tidy datasets in R using fakir [#rstats Package]

How to enable using R Programming with Visual Studio VS Code

How to enable using R Programming with Visual Studio VS Code

Python, Community, Books - with Abhiram R - Bangpypers Co-organizers | 1littlecoder podcast

Python, Community, Books - with Abhiram R - Bangpypers Co-organizers | 1littlecoder podcast

Growing a Tech Community across India - Anubha Maneshwar, Founder Girlscript | 1littlecoder Podcast

Growing a Tech Community across India - Anubha Maneshwar, Founder Girlscript | 1littlecoder Podcast

Intro to Google Colab - How to use Colab

Intro to Google Colab - How to use Colab

Intro to Plotly Express - Complex Interactive Charts with One-Line of Python Code

Intro to Plotly Express - Complex Interactive Charts with One-Line of Python Code

Indic NLP Python Toolkit Open Source Development - iNLTK Creator Gaurav Arora | 1littlecoder Podcast

Indic NLP Python Toolkit Open Source Development - iNLTK Creator Gaurav Arora | 1littlecoder Podcast

Do you want a career in Data Science - Tamil Webinar

Do you want a career in Data Science - Tamil Webinar

Android Smartphone Analysis in R [Live Coding Screencast]

Android Smartphone Analysis in R [Live Coding Screencast]

Programmatically create Images, Memes, Watermarks using Python with imgmaker

Programmatically create Images, Memes, Watermarks using Python with imgmaker

Kaggle Walkthrough to get you started with Data Science - Webinar

Kaggle Walkthrough to get you started with Data Science - Webinar

Community, Corporate Job, Coding - Gnana Lakshmi T C aka Gyan, WomenWhoCode Leadership Fellow

Community, Corporate Job, Coding - Gnana Lakshmi T C aka Gyan, WomenWhoCode Leadership Fellow

Easy ggplot2 Theme Customization with {ggeasy} | Data Visualization in R

Easy ggplot2 Theme Customization with {ggeasy} | Data Visualization in R

Excel to R - Pivot + Bar Chart in Excel & R using tidyverse [Live Coding]

Excel to R - Pivot + Bar Chart in Excel & R using tidyverse [Live Coding]

Excel to R #2 - VLOOKUP in Excel to LEFT_JOIN, MERGE in R

Excel to R #2 - VLOOKUP in Excel to LEFT_JOIN, MERGE in R

5 websites to get Free Real-World Datasets for Data Science/ML Projects

5 websites to get Free Real-World Datasets for Data Science/ML Projects

Excel to R #3 - APPROXIMATE VLOOKUP in Excel to FUZZY LEFT_JOIN in R

Excel to R #3 - APPROXIMATE VLOOKUP in Excel to FUZZY LEFT_JOIN in R

Correlation-alternative PPS (Predictive Power Score) Python Package Demo

Correlation-alternative PPS (Predictive Power Score) Python Package Demo

Automated Website Screenshots in R using {webshot}

Automated Website Screenshots in R using {webshot}

Installing Custom RStudio Theme (Synthwave85)

Installing Custom RStudio Theme (Synthwave85)

Analyse Google Trends Search Data in R using {gtrendsR}

Analyse Google Trends Search Data in R using {gtrendsR}

3 Tips to ask question on Stack Overflow the right way to get answers

3 Tips to ask question on Stack Overflow the right way to get answers

Learn Data Science with R - Mini Projects - Web Scraping Zomato

Learn Data Science with R - Mini Projects - Web Scraping Zomato

Easily make Dumbbell Chart using {ggcharts} | Data Visualization in R

Easily make Dumbbell Chart using {ggcharts} | Data Visualization in R

GET Hackernews Front Page Results using REST API in R

GET Hackernews Front Page Results using REST API in R

Quickly deploy ML WebApps from Google Colab using ngrok

Quickly deploy ML WebApps from Google Colab using ngrok

Use Jupyter Notebooks within VSCode (Visual Studio Code) in 2020

Use Jupyter Notebooks within VSCode (Visual Studio Code) in 2020

Plotly Interactive Plots as Pandas Plotting Backend df.plot()

Plotly Interactive Plots as Pandas Plotting Backend df.plot()

Stack Overflow Developer Survey 2020 Highlights for New Programmers

Stack Overflow Developer Survey 2020 Highlights for New Programmers

Matplotlib Animation Charts in Python using Celluloid

Matplotlib Animation Charts in Python using Celluloid

Coding, Postwoman, Passion Project Book - Liyas Thomas Open Source Developer - 1littlecoder podcast

Coding, Postwoman, Passion Project Book - Liyas Thomas Open Source Developer - 1littlecoder podcast

Aspiring Data Scientist, Tips on How to learn Business Domain Knowledge

Aspiring Data Scientist, Tips on How to learn Business Domain Knowledge

Bokeh Interactive Charts as Pandas Plotting Backend df.plot_bokeh()

Bokeh Interactive Charts as Pandas Plotting Backend df.plot_bokeh()

Easy Fast Python Pandas Summary with Sidetable | Pandas Tips & Tricks

Easy Fast Python Pandas Summary with Sidetable | Pandas Tips & Tricks

Inception, Content Ideas, Consistency - Srivatsan Srinivasan AIEngineering YouTube Content Creator

Inception, Content Ideas, Consistency - Srivatsan Srinivasan AIEngineering YouTube Content Creator

ggplot2 Text Customization with ggtext | Data Visualization in R

ggplot2 Text Customization with ggtext | Data Visualization in R

Penguins Dataset Overview - iris alternative | EDA Data Visualization in R

Penguins Dataset Overview - iris alternative | EDA Data Visualization in R

YouTube Growth Tips, Content Creation - Bhavesh Bhatt, YouTuber (Data Science & Machine Learning) #7

YouTube Growth Tips, Content Creation - Bhavesh Bhatt, YouTuber (Data Science & Machine Learning) #7

Matplotlib Animated Bar Chart Race in Python | Data Visualization

Matplotlib Animated Bar Chart Race in Python | Data Visualization

Simple Python GUI Development using {guietta}

Simple Python GUI Development using {guietta}

#8 Niche, Growth, Monetization - David Langer - YouTuber Dave on Data

#8 Niche, Growth, Monetization - David Langer - YouTuber Dave on Data

Simple Fast 3-step Python OCR using Deep Learning 40+ Languages

Simple Fast 3-step Python OCR using Deep Learning 40+ Languages

Github New Feature Profile Summary/Mini-Resume - Profile Views

Otto ML Assistant, GPT-3 on Philosophers, Nvidia-ARM - 3 ML Tech News

Otto ML Assistant, GPT-3 on Philosophers, Nvidia-ARM - 3 ML Tech News

What is OpenAI GPT-3 - Hype, Examples, Worries

What is OpenAI GPT-3 - Hype, Examples, Worries

Julia 1.5, Datamuse API, Live HDR+ Pixel 4a - Machine Learning Tech News

Julia 1.5, Datamuse API, Live HDR+ Pixel 4a - Machine Learning Tech News

Self-driving Car Engineer sentenced, arXiv Dataset, AI/ML Startup Idea - Machine Learning Tech News

Self-driving Car Engineer sentenced, arXiv Dataset, AI/ML Startup Idea - Machine Learning Tech News

GPT-3 Explorer, Ciphey (Automated Decryption), Py-Sudoku - ML Tech News

GPT-3 Explorer, Ciphey (Automated Decryption), Py-Sudoku - ML Tech News

How to use Advanced Google Search to extract Email Ids from Linkedin

How to use Advanced Google Search to extract Email Ids from Linkedin

Cartoonizer Toon-IT (AI Web App), GPT-3 Advice, Android Earthquake Detection - ML Tech News

Cartoonizer Toon-IT (AI Web App), GPT-3 Advice, Android Earthquake Detection - ML Tech News

Flow - R Package to visualize code logic, functions as a Flow Diagram

Flow - R Package to visualize code logic, functions as a Flow Diagram

Build GPT-3-like Language Model on Google Colab with minGPT [PyTorch]

Build GPT-3-like Language Model on Google Colab with minGPT [PyTorch]

Create a Pencil Sketch Portrait with Python OpenCV

Create a Pencil Sketch Portrait with Python OpenCV

This video teaches how to run the CODE LLAMA model on Google Colab, covering the installation of necessary libraries, construction of pipelines, and inference techniques. It provides a comprehensive guide to utilizing the LLaMA model for text generation and fine-tuning.

Key Takeaways

Install Transformers and Accelerate libraries
Specify the model (instruct fine-tuned or base model)
Use tokenizer from pre-trained model
Build pipeline for text generation
Download and load model into memory
Run CODE LLAMA on Google Colab
Specify system prompt and user prompt
Run code and wait for inference
Print result

💡 The video highlights the importance of pipeline construction and memory management when working with large language models like LLaMA, and demonstrates how to utilize the Transformers and Accelerate libraries to optimize model performance.

🔒 Pro feature: Ask AI to explain this lesson →

More on: LLM Foundations

View skill →

Getting Started with Vertex AI Gemini 1.5 Flash

I TRAINED AN AI TO SOLVE 2+2 (w/ Live Coding)

I TRAINED AN AI TO SOLVE 2+2 (w/ Live Coding)

How to use the ChatGPT API with Python!!

How to use the ChatGPT API with Python!!

Nicholas Renotte

Gemini 2.5: Create an interactive plot of economic data

Gemini 2.5: Create an interactive plot of economic data

Google DeepMind

LangChain Chatbots: Building a Personalized AI Assistant

LangChain Chatbots: Building a Personalized AI Assistant

Analytics Vidhya

Auto-generating meeting notes with Python

Auto-generating meeting notes with Python

Related Reads

Claude Sonnet 5 Just Launched. Is It Actually Better Or Just Newer?

Learn how Claude Sonnet 5 compares to other models like Opus 4.8 and GPT 5.6 in terms of pricing, performance, and benchmarking, and understand what these differences mean for your projects

Claude Sonnet 5 Just Launched. Is It Actually Better Or Just Newer?

Learn how Claude Sonnet 5 compares to Frontier models in pricing, performance, and benchmarking, and what this means for your ML projects

Medium · Machine Learning

Claude Sonnet 5 Just Launched. Is It Actually Better Or Just Newer?

Learn how Claude Sonnet 5 compares to Frontier models in terms of pricing, performance, and benchmarking, and understand what these differences mean for your projects

Claude Sonnet 5 Didn’t Just Get Smarter. It Changed the Economics of AI.

Claude Sonnet 5's advancements have transformed the economics of AI, making it more viable for production

5 Levels of AI Agents - From Simple LLM Calls to Multi-Agent Systems

Dave Ebbelaar (LLM Eng)