Code Llama powered Gradio App for Coding: Runs on CPU

AI Anytime · Intermediate · 🧠 Large Language Models · 2y ago

Skills: LLM Foundations 90% · LLM Engineering 80% · Fine-tuning LLMs 80% · Prompt Craft 80%

Key Takeaways

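The video shows how to run Code Llama, Meta AI's code-focused large language model, inside a Gradio chat app on a CPU. It loads a 4-bit quantized GGML build of the 7B Instruct model through CTransformers and LangChain, uses a simple role-based prompt template, and tries the app on a handful of coding prompts with mixed results. Fine-tuning comes up only as background on how the Code Llama variants were trained.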

Full Transcript

Hello everyone, welcome to the AI Anytime channel. In this video we are going to talk about Code Llama, a new large language model by Meta AI. It was released in the last couple of days, and the community is talking about it as the top coding-specific LLM right now. We are going to take this LLM and see how we can use it in a Python application: we will build a simple Gradio application that leverages the model, test it on a few coding problems, and see what kind of responses we get out of Code Llama. Currently on my screen it says "Introducing Llama 2: the next generation of our open source large language model". All credit goes to Meta AI for strengthening the open source community. They started with Llama a few months ago, which was not licensed for commercial use, but then they came up with Llama 2, and the industry has followed. After the initial release of the first version of Llama there were a lot of models like MPT, Falcon, Vicuna, Alpaca, etc., and now we see a new model released every week. But coding problems have a different grammar, different vocabularies, different syntaxes, etc., so it is difficult for a general-purpose large language model to solve coding problems or respond to your coding questions or queries. If we come down on this Llama 2 page, there is a new option called Code Llama, and they have a blog post on it. It says "Introducing Code Llama, a state-of-the-art large language model for coding". Coding itself is very complex when it comes to training models on a code corpus, for multiple reasons, as I said previously. So they have
trained on 500 billion tokens, and that's very impressive. The license is something you have to get in touch with your lawyer to understand; jokes apart, the licenses are a little weird to understand. It says Code Llama is free for research and commercial use, but there are some clauses you have to be really careful about, depending on what kind of enterprise you represent. It says Code Llama is built on top of Llama 2 and is available in three different models: the foundational code model, a Python-specialized Code Llama, and an instruction-tuned one. The instruction-tuned model is what we are going to look at in this video, because we want more human-like conversation. The foundational Code Llama models also have fill-in-the-middle (FIM) capabilities from the way they were trained on those 500 billion tokens, but we are going to focus on the instruction-tuned variant. The evaluation has again been done on the same benchmarks, HumanEval and MBPP, and it has performed really well, but we have to test it ourselves: it performs well on some sets of problems, but when you test it out it kind of struggles. Here is a high-level, single-pane view that shows how the three variants of Code Llama have been trained from Llama 2: the Python code training, which is specialized for Python, then the instruction fine-tuning, and the foundational Code Llama. All of these models are available in three sizes: 7B, 13B, and 34B. In this video I'm going to stick with 7B, because I want all of you to try it out and not everybody will have the machine to run the bigger ones locally, but I will show how you can run it on
a CPU machine, or even on a GPU, with the 7B model. You can see the performance on HumanEval, MBPP, and Multilingual HumanEval, which have been the benchmarking criteria for these LLMs; the detailed and comprehensive view of this benchmarking is available on the open LLM leaderboard as well, and you can see it has performed well. StarCoder again leads, StarCoder prompted. Just to give you some information, guys: if you don't know which model to select for coding, I will recommend three coding-specific LLMs, and you can pick one of them. StarCoder by BigCode, who have also recently released OctoPack, and which has an instruction-tuned StarChat variant; WizardCoder, which is lightweight and very optimized compared with StarCoder; and now Code Llama. These are specific to coding problems, so pick any one of them to build solutions, a chatbot, or whatever you are building. That was a lot of talking, guys, so what do we have to do now? We are going to rely on Tom Jobbins again, TheBloke on Hugging Face; you can see he has more than 800 models in his Hugging Face repository. We are going to rely on one of his models because I have applied for access, and it takes a few hours to get access from Meta AI. That's why in this video I'm going to show you how you can take a GGML, GPTQ, or GGUF model. GGUF has been supported by llama.cpp and KoboldCpp as of last week; it is a new model format that you can run on commodity hardware. I'm going to rely on Code Llama 7B Instruct GGML; that's the model I'm going to use in this video. If you see here, this is the model that I'm going to take; you can see this Code
Llama 7B Instruct GGML, because I'm going to use CTransformers by marella in this video, which is available on GitHub; I'll give you the link in the description. CTransformers is a Python binding for C/C++, and I'm going to use that library to load this model and run it on a CPU machine. I'll also show how you can easily switch it to CUDA kernels on a GPU. So this is the model I'm going to take. To download it, go to Files and versions inside this repository, where you will see a lot of checkpoints, basically the model weights, as .bin files. I'm going to use a 4-bit quantized model, but you can also use an 8-bit quantized one, which is a little heavier, a little bigger in size. The 4-bit file is already downloaded: in VS Code you can see that in my directory called Code Llama project I have this file, which is around 3.7 GB. You also have to keep a requirements.txt. Let's go back to VS Code: these are the requirements you need. You don't need Streamlit, so let me remove that. You need ctransformers, gradio, langchain, fastapi, and uvicorn; maybe I'll also remove fastapi and uvicorn for now. We'll use them in the next video, where we will build an embedding-based application with a microservice or API back end. So in this video you need ctransformers, gradio, and langchain; that's all. I already have torch, transformers, and everything else installed. I'm going to use WSL2 for this, so let me clear the terminal; you can see
I'm already in this directory, and I'm just going to use that. So let's create an app.py, guys. The first thing, as I said: from langchain.llms import CTransformers. If you don't want to use LangChain to load the model, you don't have to (of course you can still use it for the chain, and I recommend that as well). You can also load it with AutoModelForCausalLM and from_pretrained; let me show what I'm talking about. If you come to ctransformers by marella on GitHub, you can see that you can load the model with AutoModelForCausalLM (causal language modeling is the task, like text generation) using from_pretrained. But LangChain already has this class as a module, so we can use CTransformers directly from LangChain, and that's what I'm going to do. Then I need a few more things, guys. From langchain.chains I'm going to use LLMChain, because we are not going to create our own knowledge base here; we are just going to see how we can use Code Llama on a local machine through an app. So: from langchain.
chains import LLMChain. From langchain, the next thing is import PromptTemplate, since I'm going to write a simple prompt template: from langchain import PromptTemplate. We need Gradio too. Let me bring in a few things: import os and import io, in case we need them, then import gradio (you can also shorten it to gr, which we're going to use later), and then let's import time. Cool, that's it for the imports. What we need now is a custom prompt. Let's call it custom_prompt_template, which is self-explanatory: we're defining a simple prompt template as a docstring-style string. Within it I'm going to write "You are an AI coding assistant", so now I'm assigning a role to this language model. You will have seen different types of prompting techniques; this is a very simple one, not a comprehensive one: you assign a role, you assign a task, and you set some limitations for the LLM through your prompt template. So: "You are an AI coding assistant and your task is to solve coding problems and return code snippets based on given user's query." Then let's say "Below is the user's query", and here we define our query as {query}. This is dynamic: the query will come from the Gradio app. Now you can write "just return the helpful answer", or let's make it more about code. So I'm just going to write
"just return the helpful code" there. Let's make it "just return the helpful code and related details", and below that we can write "Helpful code and related details". That's it; this is the simple prompt template we have defined. Next I'm going to write a function, let's call it set_custom_prompt, that uses this prompt template. The first thing in it is the prompt, and for that we use the PromptTemplate module from LangChain. You have to pass a couple of things. The first is template, which is nothing but the variable we have on top, custom_prompt_template. It also accepts the variables you are passing; we are passing query. With LLMChain you probably need only a single variable; when you have embeddings, you'll have your own context, your knowledge base, and you can pass that context as well. So in input_variables, which takes a list, you pass query, and then just return the prompt. So we have one function done. Fantastic. What we do next is load the model. You can also find on the GitHub repository how to do this without LangChain. I'm going to write def load_model, and in it I'm going to assign a variable llm equal to CTransformers. You can pass many parameters here; the inference parameters you can pass to CTransformers are listed on GitHub. The first thing you have to pass is (oh, I don't know why I have caps lock on) model. So if you are
loading it through the LangChain LLM you have to write model, and if you are using ctransformers directly you have to write the model path or repo ID. Please take care of that, because you might get an error if you load it directly from ctransformers. If you're not using LangChain, perhaps because it makes things slower, you can also do that. So model, and here I'm going to pass my model file: Code Llama, the 7B model, and since we are using an instruction-tuned model, Instruct. We are just passing the file name here, and I hope you have downloaded that file: then GGML version 3, Q4, which is 4-bit quantized, underscore zero, and then .bin. So this is our file name; you can see it on the left-hand side. Next we also have to say which model type it is, which family it belongs to. CTransformers supports the GPT family (GPT-2, GPT-NeoX, GPT-J, and so on), MPT models (MPT-7B, 30B), Falcon, and Llama, so it supports different model families, and for that you have to set model_type. Here the model type is nothing but llama, so simply write model_type llama. Next is max_new_tokens; of course you can use all the inference parameters after referring to the GitHub repository. Let's have 1096; you can keep it a little lower. You can also pass the context size, guys; it is supported for Llama models right now through CTransformers. What else do we need? Let's pass a temperature value. I'll use a low temperature here because these are coding problems: I don't want it to be really creative and return something which does not
make sense. So temperature is 0.2 (it's a floating-point value, if I'm not wrong). Let's also have a repetition penalty: I'm going to write repetition_penalty and pass 1.13, which is a fairly standard value for large language models; the repetition penalty looks at repetitions of the tokens. Now, a question people ask me is whether we can also use CTransformers with CUDA kernels, with the VRAM we have. In that case you can pass gpu_layers. I have two GPUs in my machine, but I'm not going to use a GPU here, because it's fast enough: it takes around 20 seconds to generate a response, and you can bring that down to 4 to 5 seconds if you have a good GPU. With gpu_layers you can offload some of the layers to CUDA; for example you can pass 10 or 6, whatever number of GPU layers you want. Of course you should check which model you are using and how many layers you can offload. It also has something called stream, but I think that is available through AutoModelForCausalLM: with stream equal to true you get a streaming response. I'm not going to do that here, but let me show you what I'm talking about. If you come over here, they list all the configuration parameters. stream is a Boolean, true or false, false by default: whether to stream the generated text. You can try it in the terminal first and see it generating tokens in a streaming fashion, like you see in ChatGPT. There is also gpu_layers, the number of layers to run on the GPU; it's an integer, zero by default. And you can see that for this you have to install ctransformers[cuda], because it installs cuBLAS and so on for CUDA. So if you want to use CUDA with
CTransformers, it will use cuBLAS and so on, which is a little heavy, around 500 MB, and you have to install that. And repetition penalty, that's it. I'm not going to do a lot of things here; just return llm. So we are okay with it now, guys: we have def set_custom_prompt, and def load_model with CTransformers, where we have model, model_type llama, max_new_tokens, temperature equal to 0.2, repetition_penalty 1.13, and return llm. You may see typos; if you get an error, that's fine. Now I'm going to write the chain pipeline, so let me write chain_pipeline, and in it I'm going to use the functions I have written on top. The first thing is llm, which is load_model, and the second is the custom prompt, so prompt equals set_custom_prompt. Then I'm going to create my chain; let's call the variable qa_chain, and to it I'm going to assign an LLMChain. I'm using the LLMChain module from LangChain because I'm not using my own knowledge base, so I'm not going to use a retrieval or conversational retrieval chain. Of course we can do that, and I'm going to do it in the next video, where I'll build on some programming language reference guide or documentation and test how it responds. Now, in qa_chain equals LLMChain, I'm going to pass my prompt. Let's not call the variable prompt; you could write prompt equals prompt, but just to avoid a naming conflict I'm going to call it qa_prompt. So prompt equals qa_prompt, and next llm equals llm. Now let's just
return the qa_chain. This qa_chain will help us pass our query and get a response out of the Code Llama large language model. Maybe we can also print it, but let's first use it. I'm going to create a variable called llm_chain, just to test it out: llm_chain equals chain_pipeline. Now I'm going to write the Gradio application, a few lines of code. Define a function called bot that takes query as an input parameter, and here I'm going to use the function we have written on top, chain_pipeline. So llm_response equals, and here let's use that variable called llm_chain, dot run (you could also call chain_pipeline directly in the function), so llm_response.run. It's a function, dot run, whether chain.run or llm_response.run, which is a variable in this case. To it we pass a dictionary where we define our query as a key-value pair: query, colon, query; a colon, not equals. Then I'm just going to return this llm_response. That's it, fantastic. Now let's build the simple Gradio application. What I'm going to do here is with gr.
Blocks, and it's a function. Inside Blocks I'm going to set my title (of course you can use Markdown or HTML to make the title better): title "Code Llama Demo", and then as demo. Right now it will show the app name on the left-hand side, but you can configure this through HTML or Markdown. Now let's have a Markdown element: gr.Markdown, and here I'm going to write "Code Llama Demo". You can also pass a custom CSS file, guys. Now I'm going to create a variable called chatbot with gr.Chatbot. You can also give it an element ID, elem_id, in case you want to use it for sessions and so on later; it is better to give one, so let's use chatbot as the element ID. Then you can pass the height of the window where we're going to interact; maybe 600, but let's keep height equal to 700 for now. So we have defined our interface: a chatbot that starts with an empty list, an element ID, and height equal to 700. Now let's have a message variable, msg, for a text box: gr.
Textbox, gradio dot Textbox, that's it. You can also have a clear button that clears all the output if needed: gr.ClearButton (I think it's capitalized), with msg and chatbot, and I think they have to go inside a list. Let's assign it to clear. I think we are good now. Next we need a respond function: def respond, and I'm going to pass message, and you can also have the chat history, so message and chat_history. Inside it I'm going to have a variable bot_message, which is what Code Llama returns: I'm going to use the function we have written on line 44, the bot function, and pass the message to it. So bot_message equals bot of message. Then I'm going to have chat_history.
append: you can append to the chat history a pair of message and bot_message. You can also sleep for a bit, so let's keep two for now with time.sleep, and then return the chat history. That's it; we are okay with it. Now let's wire up submit, guys: outside of this function, msg.submit, and in it I'm going to pass respond, then a list with msg and chatbot, and then msg and chatbot again. Fantastic. Now let's launch this Gradio application, after the with gr.Blocks block: demo.launch. Let me quickly show you from the top what we have done here, guys. We have imported all the libraries: CTransformers to load the model, LLMChain as the chain to interact with it, PromptTemplate to define the custom prompt, and Gradio to build a simple interface. Then we have a few functions: set_custom_prompt, which builds the prompt; load_model, to load the model; chain_pipeline, chaining it all with LLMChain; bot; and then gr.Blocks with the title "Code Llama Demo" as demo, the Markdown, and so on. Now let's run this, guys. I'm using WSL2; you can use any terminal, Anaconda, a virtual environment, whatever you use. Let's run it and see what we get. Okay, line 45, llm_response: ah, it's not llm_response, it's llm_chain, so let's fix that. We did that with llm_chain; let's ask that question again.
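The pieces narrated so far can be pulled together into one file. What follows is a minimal sketch reconstructed from the narration, not the author's published app.py. Assumptions: the model path is a placeholder, the prompt wording is paraphrased from what is read out, the inference settings go through LangChain's `config` dict rather than the keyword arguments typed on screen, `make_respond`, `build_demo` and `main` are hypothetical helper names, and the imports target 2023-era LangChain and Gradio 3.x/4.x. Heavy imports are deferred into the functions so the prompt and chat logic can be tried without the model installed.

```python
import time

# Role + task + constraint prompt, paraphrased from the video.
CUSTOM_PROMPT_TEMPLATE = """You are an AI coding assistant and your task is to solve coding problems
and return code snippets based on the given user's query. Below is the user's query.
Query: {query}

You just return the helpful code and related details.
Helpful code and related details:
"""

# ASSUMPTION: placeholder path; point it at the 4-bit GGML file you downloaded.
MODEL_PATH = "codellama-7b-instruct.ggmlv3.Q4_0.bin"


def set_custom_prompt():
    """Wrap the template in a LangChain PromptTemplate with one variable."""
    from langchain.prompts import PromptTemplate  # 2023-era import path

    return PromptTemplate(template=CUSTOM_PROMPT_TEMPLATE, input_variables=["query"])


def load_model():
    """Load the quantized model on CPU through LangChain's CTransformers wrapper."""
    from langchain.llms import CTransformers

    return CTransformers(
        model=MODEL_PATH,
        model_type="llama",
        # ASSUMPTION: settings passed via `config`; the values are those from the video.
        config={"max_new_tokens": 1096, "temperature": 0.2, "repetition_penalty": 1.13},
    )


def chain_pipeline():
    """Combine prompt and model into an LLMChain (no retrieval, no knowledge base)."""
    from langchain.chains import LLMChain

    return LLMChain(prompt=set_custom_prompt(), llm=load_model())


def make_respond(bot, delay_s=2.0):
    """Build the Gradio submit handler around any `bot(query) -> str` callable."""

    def respond(message, chat_history):
        bot_message = bot(message)
        chat_history.append((message, bot_message))
        time.sleep(delay_s)
        return "", chat_history  # clear the textbox, update the chat window

    return respond


def build_demo(bot):
    """Assemble the Blocks UI: title, chat window, textbox, clear button."""
    import gradio as gr

    with gr.Blocks(title="Code Llama Demo") as demo:
        gr.Markdown("# Code Llama Demo")
        chatbot = gr.Chatbot([], elem_id="chatbot", height=700)
        msg = gr.Textbox()
        gr.ClearButton([msg, chatbot])
        msg.submit(make_respond(bot), [msg, chatbot], [msg, chatbot])
    return demo


def main():
    llm_chain = chain_pipeline()
    # Call .run on the chain itself; calling it on the not-yet-assigned result
    # variable is the "referenced before assignment" bug fixed in the video.
    build_demo(lambda query: llm_chain.run({"query": query})).launch()
```

To start the app, call `main()` at the bottom of the file and run it with `python3 app.py`. Keeping `respond` independent of the model makes it easy to swap in a stub bot while adjusting the interface.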
"Write a function to design a login component"... error: local variable llm_response referenced. Okay, let's close this and run it again: python3 app.py, and we'll open it again. Let's come here, refresh, and paste the question: "Write a function to design a login component in Angular". This is the question we are asking here, guys. Suppose you want to design a login page in Angular: how can we use Code Llama for that task, generate some code, and then take that code and test whether it makes sense? We'll ask a couple of questions, this being the first, and let's see how it responds. You can see it runs on port 7873; we are using a Gradio application here. Let's see what we get. And we have got, to be honest, guys, what looks to me like Java code: you can see public class LoginComponent. It does give a login component, but it has probably hallucinated, because the question was to write a function to design a login component in Angular. So fail, first fail. It makes sense because we're using a 7B model, so don't expect it to be really good; maybe you have to use the 34B or 13B model. But even so, I was expecting a much better response than this. I will ask this question again and see if I get the same answer or something else; because we are not using a seed, I assume it will give me a different answer, maybe the one I'm expecting. Right now I think it
has generated Java code; maybe it's because people use Java and Angular together, I don't know. I'm not worried about whether the code is complete or not; for that you may have to increase max_new_tokens, and for Llama you can also pass the context size, the context window size. Now let's ask this question again and see if it generates the same response or gives you better code. Meanwhile, you can see here it says login form, email; okay, this is great. It's not what I was expecting, but let's ask a simpler question. I first tried Angular, but now I'm trying Python: "Write a Python code to connect with a SQL database and retrieve all the table names". It's probably a far easier question than the previous one. Let's see how it responds. We were expecting much more than that, even from the 7B model, but that's okay; maybe we have to try the 13B model. When I see it for the first time and compare it with StarCoder or StarChat Beta, to be honest, it didn't perform that well. Let's check this out, and I think this makes sense. It uses pyodbc, it connects to that SQL Server, and this looks good. So this is fine. We'll try some other questions; let's see what Meta has asked on their blog: "In Bash, how do I list all text files in the current directory that have been modified...", another one about define, create, evaluate ops, and "I have a CSV file with these headers..." and correlation. Let me ask a
question. Convert the below... or let me go back to Java... or simply let's ask this: "Write a Python code that uses pandas to load a CSV file and then use scikit-learn to train a linear regression model. Also write the code to evaluate the model on some metrics." Let's press Enter and see if it's able to do it. The last one, the Python code to connect with a SQL database, was at least decent when you compare it with the previous two responses. We'll see how the model responds as the complexity increases. Okay, I think it has done well. It imports pandas as pd and the LinearRegression class from linear_model, then it loads a file and trains the linear regression on X and y with fit. There is no train/test split, by the way; it assumes you would do that. Then it evaluates the linear regression with some metrics like MSE and MAE. Looks good, by the way: it gives you a template, some boilerplate to work with. Next let me take some R code for linear regression; I want to test the code conversion use case, to see if it also performs well for code conversion. I'll search for R code for linear regression and open any result. Let me copy this; I hope this is for linear regression. Let's use it and see if it's able to convert the code: "Convert the R code to Python", and I paste the code here and press Enter. You can see it works well for Python, at least, as a language. Maybe you can also try some different programming languages; with Angular it didn't work that well, but maybe I didn't ask the right question. You can try it out. This is the last kind of question I'm going to try; let's see if
it's able to perform a good R to Python conversion. I can also look for Java code for a database connection; let me copy this piece of code here. After that, we can also try automated test case generation. And you can see the conversion says import pandas as pd. Maybe what you can also do, guys, is change the prompt template we defined, "only return the helpful code": you can remove the code part and write "helpful answer". Maybe you have to write a better prompt to get the best out of it, but it works fine: you can see import pandas as pd, from sklearn.linear_model import LinearRegression. I think this is fine: it took your R code and understood what you were trying to do, maybe from the comments. Now let's try this: "Write automated test cases for the code snippet". I've taken some Java code, and say you want to generate JUnit test cases or some kind of test cases using Code Llama; how can we do that, guys? That's what I'm going to try, and this is going to be my last one. To be honest, it satisfies me as a 7B model; I liked it, and I'm quite sure it will be better when you use 13B and 34B. Maybe we can try 34B after deploying it on AWS or somewhere and test it out; I think that will be really good. But this works well for me. Let's wait and see what kind of response we get. It takes a while right now because I'm running on CPU. I'll show you my Task Manager as well (you can run this on CUDA as well if you want): if you go to Performance you can see GPU 1 and GPU 0, and I'm not using either of the GPUs for this; GPU 0 is being utilized, by the way, and you can see the
memory it's sting up but that's fine and uh right automated test cases for the C Sate import didn't give you didn't give you anything here that what I was expecting and probably to the promt that we have but that's all guys this is what I wanted to you know SE in this uh in this video okay and uh wa I I need your comment and thought and feedback in the comment box let me know if you are doing something fantastic or some extraordinary thing with code Lama I'll be more than happy to collaborate you know if needed on that that that part but I liked it as a 7B model and I shown you in this video how you can take this model from Tom jobin the blo hugging face one of the model try it out you can also look at ggf model which has been supported by Lama CPP and Cobalt as well and you can use this as this as well this is what I wanted to do in this video I hope you like this video guys if you have any comment please let me know in the comment box and the code will be available on the GitHub repository and that's all okay if you like the content please hit the like icon if you haven't subscribed the channel yet please subscribe the channel and please share the video and and the channel to your friends and peer thank you so much for watching see you in the next one
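The prompt tweak discussed in the transcript can be sketched roughly as follows. This is a minimal illustration that assumes plain Python string templates rather than the LangChain prompt objects used in the video. The template wording and the names CODE_ONLY_TEMPLATE, TEST_GENERATION_TEMPLATE, build_code_prompt, and build_test_prompt are hypothetical, not taken from the video's code.

```python
# Hypothetical prompt templates: the wording is illustrative only and is
# not the exact text used in the video.
CODE_ONLY_TEMPLATE = (
    "You are an AI coding assistant.\n"
    "Task: {query}\n"
    "Only return the code.\n"
    "Helpful answer:\n"
)

# A more specific template for test generation. Naming the test framework
# and clearly delimiting the snippet is one way to "write a better prompt".
TEST_GENERATION_TEMPLATE = (
    "You are an AI coding assistant that writes unit tests.\n"
    "Write {framework} test cases for the code snippet below.\n"
    "Code snippet:\n"
    "{snippet}\n"
    "Helpful answer:\n"
)


def build_code_prompt(query: str) -> str:
    """Fill the code-only template with the user's question."""
    return CODE_ONLY_TEMPLATE.format(query=query.strip())


def build_test_prompt(snippet: str, framework: str = "JUnit") -> str:
    """Fill the test-generation template with a code snippet.

    Braces inside the snippet are safe because the snippet is passed as a
    format value, not used as the format string itself.
    """
    return TEST_GENERATION_TEMPLATE.format(
        framework=framework, snippet=snippet.strip()
    )
```

Either function returns a plain string that can be passed to whatever model wrapper the app uses. Whether a more specific template actually fixes the empty test-generation response seen in the video is untested here.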

Original Description

In this video, I'll walk you through how I harnessed the power of Code Llama, a cutting-edge large language model designed specifically for coding tasks. Code Llama is at the forefront of publicly available language models, offering incredible potential to streamline workflows for experienced developers and democratize coding education for newcomers. I'll demonstrate how I took a quantized version of Code Llama and seamlessly integrated it into a Gradio app, unlocking its remarkable coding capabilities for everyone to use. With Langchain, I'll show you how to craft prompts and chains, harnessing the full potential of Code Llama for coding solutions, all running efficiently on a CPU. Don't miss out on this opportunity to supercharge your coding journey. Watch the video now and let's dive into the world of Code Llama!

LLM Playlist: https://www.youtube.com/playlist?list=PLrLEqwuz-mRIdlmvhddd7nGiNh8exqsBG
AI Anytime's GitHub: https://github.com/AIAnytime
The Bloke HF: https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GGML/tree/main
Llama 2 Meta AI: https://ai.meta.com/llama/
Code Llama: https://ai.meta.com/blog/code-llama-large-language-model-coding/
CTransformers: https://github.com/marella/ctransformers

#generativeai #python #ai

This video teaches how to build a Gradio app for coding tasks using Code Llama, a cutting-edge large language model. It covers LLM engineering, fine-tuning, and prompt crafting, and provides practical steps for building and deploying the app.

Key Takeaways

Download the quantized model weights from TheBloke's Hugging Face repository
Load the model using the CTransformers library
Run the model on a CPU machine
Optionally switch the model to run on a GPU
Build the app using Gradio
Define a custom prompt template
Assign a role to the language model in the prompt
Pass the query from the Gradio app into the prompt
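
The steps above can be sketched roughly as follows. This is a minimal sketch, not the video's code: it calls the ctransformers library directly instead of going through LangChain, and it uses the repository linked in the description. The prompt wording, the function names load_model, make_responder, and build_app, and the default of leaving model_file unset are assumptions made for illustration.

```python
from typing import Callable, Optional

# Repository linked in the video description.
MODEL_REPO = "TheBloke/CodeLlama-7B-Instruct-GGML"

# Hypothetical prompt: assigns a role to the model and leaves a slot for
# the query typed into the Gradio app.
PROMPT_TEMPLATE = (
    "You are an AI coding assistant.\n"
    "Query: {query}\n"
    "Only return the code.\n"
    "Helpful answer:\n"
)


def load_model(model_file: Optional[str] = None, gpu_layers: int = 0):
    """Load the quantized model with ctransformers.

    gpu_layers=0 keeps inference on the CPU; a positive value offloads
    that many layers to a GPU if ctransformers was installed with GPU
    support. model_file is left as a parameter because the exact file
    name in the repository is not given here.
    """
    from ctransformers import AutoModelForCausalLM  # imported lazily

    return AutoModelForCausalLM.from_pretrained(
        MODEL_REPO,
        model_file=model_file,
        model_type="llama",
        gpu_layers=gpu_layers,
    )


def make_responder(llm: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap any prompt-to-text callable in the prompt template."""

    def respond(query: str) -> str:
        if not query.strip():
            return "Please enter a coding question."
        return llm(PROMPT_TEMPLATE.format(query=query.strip()))

    return respond


def build_app(respond: Callable[[str], str]):
    """Create a simple Gradio interface around the responder."""
    import gradio as gr  # imported lazily

    return gr.Interface(
        fn=respond,
        inputs=gr.Textbox(lines=5, label="Coding question"),
        outputs=gr.Textbox(label="Model response"),
        title="Code Llama demo",
    )
```

To run it, something like build_app(make_responder(load_model())).launch() would download the model and start the app locally. Keeping the model behind a plain callable means the prompt logic can be exercised with a stub function before the multi-gigabyte weights are downloaded.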

💡 The video demonstrates how to harness the power of Code Llama to build a Gradio app for coding tasks, and provides practical steps for fine-tuning and prompt crafting.
