StableVicuna: The New King of Open ChatGPTs?
Key Takeaways
The video explores StableVicuna, an open-source, Vicuna-style model built on LLaMA with RLHF training, and walks through using the model in a Colab notebook.
Full Transcript
Okay, welcome back. In this video we're going to look at a new model that Stability AI has released in the past day or so. This is a model they're calling StableVicuna, and they're claiming that this is the world's first open-source RLHF LLM chatbot. That sounds like a lot of pre-qualifiers in there.

They've released a blog post, and it basically just talks about how this is a Vicuna model, just the original Vicuna model, but then fine-tuned on a number of different datasets. So they've basically taken a LLaMA model and trained it up. If we look down here at the datasets, we can see that they've trained it on the OpenAssistant conversations, on the GPT4All prompt generations, and on the Alpaca dataset. I presume this is the cleaned one, perhaps not.

A lot of these are still distilled datasets, meaning the data is taken from GPT-3 or ChatGPT and we can't use them for commercial use. With the datasets themselves, there's definitely an argument around whether people can use them or not, but certainly the model itself is non-commercial. This is using the original LLaMA weights from Meta, and they, for whatever reason, still haven't allowed people to use this model commercially. So my guess is that, as Stability AI and others are also training up LLaMA models, this is like a soft run for seeing what it's going to be like to train once they've got their own LLaMA model, something that could be released for commercial use.

It's interesting, the datasets they've gone for. They haven't put in, say, the Dolly 2 dataset, and there are a number of other datasets they've decided not to go for that, say, the Koala model had, which were quite interesting datasets as well.

The other thing is that they've gone for three main datasets for the RLHF: one of them comes from OpenAssistant, one of them is from Anthropic, and one of them is the Stanford Human Preferences dataset. These datasets are publicly available. If we come and have a look, we can see that this is the Anthropic dataset. There are strings where it's got a human and it's got the answer, and it has two lots of them, and then people choose which one was the better one.

Anyway, if we have a look at the results in here, we can see that they've basically benchmarked their model against a number of the other models like this from the past month or so. So we've got things like GPT4All and Koala, we've got Vicuna 1.1, and we've got the Alpaca model in there as well. We can see that on most things this model does seem to do very well. On certain stats, like TruthfulQA, it's perhaps not as good as Alpaca, and not as good as Vicuna 1.1 or even the Koala model, but on the whole it seems to do pretty well.

So let's jump in and look at the Colab. I've set up this Colab. You will need an A100 to be able to run it, unfortunately. This is a big model, it's 13B, and even loading it in 8-bit you'll still need a pretty decent GPU. Because it's a LLaMA model, we can just bring in the Llama tokenizer and LlamaForCausalLM. Lucky for us, the Hugging Face user TheBloke has already converted the weights over, so they're all there and we can just bring them in and use them straight away. Once you bring it in, you basically just set up a pipeline for doing text generation. I'm going to set the max length in here to 512, but you could certainly extend that. I'm going to set temperature to 0.7, and then I'm just going to set up a few little things to clean up the prompts as we go through.

One of the important things with all these models is that you must prompt the model in the way that it expects to be prompted. I saw some people pointing out that some of the other models don't do as well, and they certainly don't do as well when you don't prompt them in the right way. For this one, you basically have to have "### Human:", then whatever you want to put in there, and then on a new line "### Assistant:". If you don't put this in, you'll find that it will work some of the time, but it won't work all of the time, and you'll definitely get some really weird outputs at times, and sometimes even no output as well. So with any of these models, you want to go and check what format of prompt it expects.

All right, so if we ask it the standard question that we've been asking all of these, "What is the difference between llamas, alpacas and vicunas?", we can see that it's getting a response back probably on par with what Vicuna was delivering before. I think for this one it's a good response. It's not necessarily outstandingly better than the other ones; I would say quite a few of them are already getting good responses for this kind of prompt.

If we look at "Write a short note to Sam Altman giving reasons to open source GPT-4", here we've got a nice sort of email/note going through the various reasons that it comes up with. I don't think this is going to influence Sam Altman or OpenAI anytime soon to actually open source it, but it does show us that this can write an email.

One of the tricks that I always do is basically just to ask it a very simple, to-the-point question, in this case "What is the capital of England?" "The capital of England is London." It's nice and succinct in its answer. Its response time was actually quite quick for this as well.

Story writing: I think in some ways the Koala models do better at story writing, because they were also pre-trained on some datasets around story writing and poems and things like that, whereas I don't think this model has been pre-trained with those in there. That said, it's still able to come up with a story. It understands that pool here is pool the game, not a pool that you swim in, and overall it puts together a story that makes sense. We can look at that and understand it.
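The prompt format described earlier in the transcript can be sketched as a pair of small helpers. This is a minimal illustration, not the code from the Colab: the function names `build_prompt` and `extract_reply` are hypothetical, and the clean-up step only assumes that the generated text echoes the prompt back and may run on into an invented next "### Human:" turn.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in the '### Human: ... ### Assistant:' template."""
    return f"### Human: {user_message.strip()}\n### Assistant:"


def extract_reply(generated_text: str) -> str:
    """Return only the assistant's reply from text that echoes the prompt."""
    # Keep everything after the first assistant marker.
    reply = generated_text.split("### Assistant:", 1)[-1]
    # The model may keep going and invent another human turn; cut that off.
    return reply.split("### Human:", 1)[0].strip()
```

Passing the result of `build_prompt(...)` to the text-generation pipeline, and then running the output through `extract_reply(...)`, is one way to keep the prompt format consistent across all the test questions.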
One that I was pleased with was this: "As an AI, do you like The Simpsons and what do you know about Homer?" Again, this is one that we've asked a lot of the models, and often the answer we'll get back is that it cannot have a preference because it's an AI model. This one doesn't say that. We get back: "Yes, I am a fan of The Simpsons. It's one of my favorite TV shows and has been around for many years." It goes into a whole thing about Homer and gets facts about the TV show, and it's able to then sum it up: Homer is a lovable character with plenty of flaws that make him relatable to audiences. I think the answer for this one is very good compared to some of the ones that we've seen, where the model basically just doesn't want to give an answer.

The other thing that I thought I'd do is take it and try it out on some of the Flan paper examples. A while back, when I did a notebook for Flan 20B, we went through some of the examples in the paper, and some of those examples are really good. Here we've got: "Answer the following question by reasoning step by step. The cafeteria had 23 apples. If they used 20 for lunch and bought six more, how many apples do they have?" The answer should be nine, and it gets this very well. This is not the case with things like WizardLM, or with a lot of the other LLMs, where the math is really not good and they're not able to work out these kinds of things. I'm not sure; this may be part of the advantages that they're getting from the RLHF. I was a bit concerned that maybe this is just in the training set somewhere and that's where it picks it up.

So it's worth trying a few things like this. The next one: "Answer the following question by yes or no by reasoning step by step. Can you write a whole haiku in a single tweet?" This one it gets wrong. It gives us an interesting answer, and it certainly does give us the reasoning, but it comes up with a wrong answer for this.

The next one is another one from the Flan paper: "Can Geoffrey Hinton have a conversation with George Washington? Give the rationale before answering." And the answer is: "No, it's not possible for Geoffrey Hinton to have a conversation with George Washington, as they lived in different centuries and were born over 200 years apart. Additionally, communication between people from different time periods would require some form of time travel, which has yet to be discovered or developed." It's nice that it put in that last bit.

Anyway, that one, again, may have been in the training set, so I asked it about Marcus Aurelius and George Washington. Again it gave a very good, coherent answer explaining that these two people lived in different times, and it did a very good job with that, actually.

So then I started to ask it a few questions about Marcus Aurelius, just to test it for facts, and it does pretty well with this. If we ask it "Tell me three facts about Marcus Aurelius that most people don't know", it's able to come up with three lesser-known facts. But then at certain times it will just fail miserably. In this case we ask it "Who is Marcus Aurelius's son?", and here it just says that the name of his son is not known, as there are no historical records indicating that. That's not true at all: his son went on to become emperor. So this one fails.

But then the amazing thing is, if we just add to this a little bit and say "Who was Marcus Aurelius's son and what was he like?", now it suddenly says, oh yes, Marcus Aurelius had a son named Commodus (correct), who later became emperor of Rome (correct). However, Commodus is remembered for his tyrannical rule (correct) and eventual assassination (correct). So it has the facts in there, but at times it's not very good at getting some of them out. And then when I asked it about Commodus directly, it was able to put this together and give us some information about him that is accurate as well.

So overall I think StableVicuna is definitely a cool model. I will make a follow-up video talking about using this as an open-source model with LangChain for ReAct reasoning. I've put together a notebook for that, so I think maybe one of the next videos will walk through and look at whether this model can be used for that, and see the results.

Anyway, overall it's definitely worth checking out if you have the ability to run this model. I think there are versions already out there in 4-bit, so you could run this locally or with a smaller GPU. It's definitely worth checking out, having a play with, and seeing how good it is for your particular use case. I think that's the key thing with all these models now: for each person's use case, there tends to be one of these top models that will be the right one for you, and we're not seeing the sort of massive jumps that we were perhaps a month ago in some of these models.

Anyway, as always, if you've got any questions, please put them in the comments. If you like the video, please click like and subscribe, and I will see you in the next video. Bye for now.
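For readers who want to try the model, the Colab setup described in the transcript (Llama tokenizer, LlamaForCausalLM, 8-bit loading, a text-generation pipeline with max length 512 and temperature 0.7) might look roughly like the sketch below. It is not the notebook's actual code. The repository id in `MODEL_ID`, the `load_pipeline` name, the use of `BitsAndBytesConfig` for 8-bit loading, and `do_sample=True` are all assumptions made for illustration.

```python
# Assumption: the converted weights live under this Hugging Face repo id.
# The transcript only says that the user TheBloke converted them.
MODEL_ID = "TheBloke/stable-vicuna-13B-HF"

# Generation settings mentioned in the transcript.
GENERATION_KWARGS = {
    "max_length": 512,   # the video notes this could be extended
    "temperature": 0.7,
    "do_sample": True,   # assumption: sampling enabled so temperature has an effect
}


def load_pipeline(model_id: str = MODEL_ID):
    """Load the model in 8-bit and wrap it in a text-generation pipeline.

    Needs a large GPU plus the transformers, accelerate and bitsandbytes packages.
    """
    # Imported lazily so the settings above can be inspected without a GPU.
    from transformers import (
        BitsAndBytesConfig,
        LlamaForCausalLM,
        LlamaTokenizer,
        pipeline,
    )

    tokenizer = LlamaTokenizer.from_pretrained(model_id)
    model = LlamaForCausalLM.from_pretrained(
        model_id,
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        device_map="auto",  # let accelerate place layers on the available GPU(s)
    )
    return pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer,
        **GENERATION_KWARGS,
    )
```

On a machine with enough GPU memory, calling `load_pipeline()` and passing it a prompt in the "### Human: ... ### Assistant:" format from the transcript should return the generated text. The model is downloaded on first use, so the first call can take a while.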
Original Description
Colab StableVicuna 8bit: https://colab.research.google.com/drive/1Kvf3qF1TXE-jR-N5G9z1XxVf5z-ljFt2?usp=sharing
Blog post: https://stability.ai/blog/stablevicuna-open-source-rlhf-chatbot
In this video I look at the model StableVicuna from Stability AI, a Vicuna style model using the base LLaMa and RLHF training.
For more tutorials on using LLMs and building Agents, check out my Patreon:
Patreon: https://www.patreon.com/SamWitteveen
Twitter: https://twitter.com/Sam_Witteveen
My Links:
Linkedin: https://www.linkedin.com/in/samwitteveen/
Github:
https://github.com/samwit/langchain-tutorials
https://github.com/samwit/llm-tutorials
00:00 Intro
02:15 datasets
03:15 Colab notebook