Stable LM 3B - The new tiny kid on the block.
Key Takeaways
The video discusses the release of Stability AI's Stable LM suite, specifically the 3B model, which is a tuned version of the language model trained on 800 billion tokens, and its potential for fine-tuning and commercial use with tools like Hugging Face and PyTorch.
Full Transcript
okay so stability AI has released their first lot of language models in what they're calling the stable LM suite and different blog posts we've kind of known that these were coming for a while there's been talk on Twitter about the founder of stability AI had mentioned that their training a bunch of models but finally we've got a blog post with info about them and we've also got some of the models to play with and there's some really interesting things in here one of the key things that I'm going to show you in this video is that they've released this three billion model so most the models that we've been looking at have sort of started at seven had something at sort of 13 or 14 and then gone 30 65. hear what they're doing it looks like is they're starting at 3 billion they have a seven billion parameter model then they're up to around about 15 and then going up to 65 for this now I should stress that the models that they've released so far are not fully trained which is one of the amazing things about this so they're calling these the Alpha version and the plan is for these to train on I think at least 1.5 trillion tokens of content if not more and currently they've been trained on I think it's 800 billion tokens and just put that in context the pythia models are trained on around 300 to 400 billion tokens and most the other non-lama models remember llama models were trained on a trillion tokens for the smaller ones and 1.4 trillion for the bigger ones but mostly other models out there even things like gpt3 was trained on sort of 300 billion tokens so this is really cool that they're out there and it's just a taster of what's to come when you look at this so blog post talks a little bit about their sort of their thinking behind this that their goal is to open source it they're releasing base models and they're releasing some fine tunings of those base models currently the one I'm going to show you today is not for commercial purpose just because it uses some of the data sets for the fine tuning that were not commercially available this will I think this will be fixed in the next few days that you'll see versions of these where they're trained just on the dolly true data set or on just open data sets so that you can use these fully certainly the base models for all of these you're able to use them for commercial purposes on top of the blog post they've also put out you can come to their GitHub and you can read a little bit about it and you can see here is what's actually been released and we can see that here we've got checkpoints for a base model Alpha and then also a tuned Alpha and these have been trained on 800 billion parameters here they've also put up a hugging face demo if you want to try that I'm going to go through running the small one in collab so that we can check out the speed of it and because they've done this in conjunction with a Luther AI these models are very much geared to work with hugging phase should be adaptable to things out of the box so I think very quickly you'll see 4-bit versions you'll see you know run on your computer versions all this kind of stuff in the near future so let's jump in and have a look at the actual model itself so this is the three billion tuned one from memory it's tuned on alpaca on shared GPT on Dolly on a few different data sets now of course the share GPT in alpaca are not available for commercial use so those ones that's why this model can't be used for commercial use but if you were to fine-tune the base model and perhaps we'll do that in a video in the next day or two if people want to see that you would then be able to get results quite easily and I'm sure people will be fine-tuning them and sticking those up on hugging face Hub all over the place okay if we come in and have a look at this we can see that I'm just using a T4 I'm not using any impressive GPU and I'm using around about half the T4 so you will be able to run this so for all of the people who've been asking me questions about or what can I run on my such and such old GPU there's a good chance that this will run on that right we've set it up so they've got a system prompt in here interestingly the system prompt I'm thinking this is because it's the tune version has got actually reference to the Alpha version in there oh another big thing that I forgot to mention in here is that not only are these being trained on a lot more tokens the sequence length is a lot longer so the sequence length of these is 4090 six I think there's somewhere perhaps not in here but the sequence length for these is much longer than even llama so these definitely going to become the sort of main models of choice at least for the time being with the red pajama Group Training models also it would be interesting to compare those when they finally come out as well but this is really sort of thrown down the gauntlet for everyone to sort of pick it up and run with these so you can see here I've basically just set up a little function to use their system interestingly they're using the tokens are similar to a GPT where you've got a system input then you've got a user and you're an AI assistant going on here first off I've just put in a few of the things that we would normally ask it so write a note to some Altman saying that they should open source gpt4 now it's definitely not as good as like the koala 13 billion or something like that right we're definitely not getting things up there but this is a tiny model compared to that it wasn't that long ago that you would be kind of amazed if this kind of model could be influenced could actually stay on track like you want it yeah so anyway it writes an email it doesn't talk too much about open sourcing it doesn't seem to really have the concept of Open Source in it but it does say your Sam Altman wanted to take a moment to thank you for your work on the insert project name the expertise in language generation text to text technology has been valuable to us it's good text that it's generating out and don't forget this is only sort of Alpha version we would expect this to improve with the next release that they have of this trying to get it to write stories again it's definitely on track of understanding the concept of writing a stories for that so I asked it to basically write a story about a koala who could beat all the camera Lids at pool it's kind of called The Koala camel which I'm not sure that that's not ideal and then it really hasn't got the concept of playing pool it's more a swimming pool in this case the really short one that we always test is what is the capital of England yes gets that fine no problem at all and then sort of the traditional ones one of the difference between llamas alpacas again it's probably not the quality of it is not going to be as good as the bigger models but it's definitely coherent it's doing what we're asking for it to do those kind of things and then finally when I ask it okay are you a fan of The Simpsons tell me a bit about Homer if you remember that I asked this one to see is it going to say no I am an AI model I don't have preferences so nicely it says of course I'm a fan of the show I'm always eager to learn more about the Visionary writers who create the show and particularly if under the writing style and humor of Mr Burns the Beloved voice of Springfield anyway Homer Simpson is indeed a very special individual so it's got the concept of these two things where one I'm asking it about are you a friend of the show second one tell us a bit about homer so remember these models are not going to have as many facts in them and they're not going to be as accurate on the facts just because they're so small what will be interesting is using some of these models with things like Lang chain to call out to search or other things like that where it's getting the facts externally and just using this model for your language for your grammar for your phrasing those kind of things so I put this notebook another one that I've also put in here is the exact same notebook but loading it in eight bits and you can see when we load it in 8-bit we're using even less Ram here we're maybe not getting as huge a win as we do with some of the other ones but it's still going quite well we can also see just here I put the measuring the time of these things we can see that it's generating text very quickly 20 seconds a response for this whereas you know and this is on the small GPU right so I'm normally when you see me doing the timings I'm using an a100 which is much faster bigger GPU here you can see these are very quick what's the capital of London one it's answering that in under a second looks of it so that's really good to see the vacuums the comparison one again around 20 seconds and the Simpson one 15 seconds so this is definitely a model that you're going to be able to use locally and to run on your computer to run for a variety of different things really going to be interesting to see once they've finished training this to the 1.5 trillion or there was even talk of maybe a 3 trillion parameter version how much better is that going to get and then once it's sort of fine-tuned for things like Lang chain and using external data sources how much is that going to get because this is definitely now in the realm of what we kind of want for something to put it into production anyway have a play with it this is a quick one I will do an up another video coming up of the seven billion one we'll look at that more in depth and maybe look at the untuned version versus the tuned version as well in that as always if you've got any questions please put them in the comments below if you found this useful please click like And subscribe I will see you in the next video bye for now
Original Description
Colab 3B: https://colab.research.google.com/drive/1WEug0384n6KveJNWyrovMcapzCwKngMp?usp=sharing
Colab 3B 8bit: https://colab.research.google.com/drive/10mFpi-YZyO6uzeKswEfP3YZQ9MqGffxI?usp=sharing
Blog post: https://stability.ai/blog/stability-ai-launches-the-first-of-its-stablelm-suite-of-language-models
In this video, I look at the announcement by Stability AI for the new suite of StableLM models and we examine the smallest 3B one which has been fine-tuned for instruction responses.
For more tutorials on using LLMs and building Agents, check out my Patreon:
Patreon: https://www.patreon.com/SamWitteveen
Twitter: https://twitter.com/Sam_Witteveen
My Links:
Linkedin: https://www.linkedin.com/in/samwitteveen/
Github:
https://github.com/samwit/langchain-tutorials
https://github.com/samwit/llm-tutorials
Watch on YouTube ↗
(saves to browser)
Sign in to unlock AI tutor explanation · ⚡30
Playlist
Uploads from Sam Witteveen · Sam Witteveen · 41 of 60
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
▶
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
LangChain Basics Tutorial #1 - LLMs & PromptTemplates with Colab
Sam Witteveen
LangChain Basics Tutorial #2 Tools and Chains
Sam Witteveen
ChatGPT API Announcement & Code Walkthrough with LangChain
Sam Witteveen
Trying Out Flan 20B with UL2 - Working in Colab with 8Bit Inference
Sam Witteveen
LangChain - Conversations with Memory (explanation & code walkthrough)
Sam Witteveen
LangChain Chat with Flan20B
Sam Witteveen
LangChain - Using Hugging Face Models locally (code walkthrough)
Sam Witteveen
PAL : Program-aided Language Models with LangChain code
Sam Witteveen
Building a Summarization System with LangChain and GPT-3 - Part 1
Sam Witteveen
Building a Summarization System with LangChain and GPT-3 - Part 2
Sam Witteveen
Microsoft's Visual ChatGPT using LangChain
Sam Witteveen
Building a Summarization System with LangChain - Part 3 Using ChatGPT Turbo
Sam Witteveen
LangChain Agents - Joining Tools and Chains with Decisions
Sam Witteveen
Investigating Alpaca 7B - Finetuned LLaMa LLM
Sam Witteveen
Comparing LLMs with LangChain
Sam Witteveen
Running Alpaca7B in Colab
Sam Witteveen
How to finetune your own Alpaca 7B
Sam Witteveen
How to make a custom dataset like Alpaca7B
Sam Witteveen
Understanding Constitutional AI - the paper and key concepts
Sam Witteveen
Using Constitutional AI in LangChain
Sam Witteveen
Talking to Alpaca with LangChain - Creating an Alpaca Chatbot
Sam Witteveen
Text-to-video-synthesis with Diffusers and Colab
Sam Witteveen
Meet Dolly the new Alpaca model
Sam Witteveen
Checking out the Cerebras-GPT family of models
Sam Witteveen
A Step-by-Step Guide to Fine-Tuning Your Dolly Model (tutorial)
Sam Witteveen
Is GPT4All your new personal ChatGPT?
Sam Witteveen
Raven - RWKV-7B RNN's LLM Strikes Back
Sam Witteveen
Talk to your CSV & Excel with LangChain
Sam Witteveen
Vicuna - 90% of ChatGPT quality by using a new dataset?
Sam Witteveen
Koala Revealed: The ChatGPT Alternative You Need to Know! 🔍
Sam Witteveen
Running Koala for free in Colab. Your own personal ChatGPT? (tutorial)
Sam Witteveen
BabyAGI: Discover the Power of Task-Driven Autonomous Agents!
Sam Witteveen
Auto-GPT - How to Automate a Task Based AI with GPT-4
Sam Witteveen
Improve your BabyAGI with LangChain
Sam Witteveen
Generative Agents - Deep Dive and GPT-4 Recreation
Sam Witteveen
GPT4ALLv2: The Improvements and Drawbacks You Need to Know!
Sam Witteveen
Dolly 2.0 by Databricks: Open for Business but is it Ready to Impress!
Sam Witteveen
Red Pajama - Operation: Freeing LLaMA
Sam Witteveen
Investigating Open Assistant - Models, Datasets and Addons
Sam Witteveen
Investigating MiniGPT-4 - The Secret behind GPT-V?
Sam Witteveen
Stable LM 3B - The new tiny kid on the block.
Sam Witteveen
Bard can now code and put that code in Colab for you.
Sam Witteveen
Checking out Bark: a Text to Speech system by Suno AI
Sam Witteveen
Fine-tuning LLMs with PEFT and LoRA
Sam Witteveen
Master PDF Chat with LangChain - Your essential guide to queries on documents
Sam Witteveen
Using LangChain with DuckDuckGO Wikipedia & PythonREPL Tools
Sam Witteveen
Building Custom Tools and Agents with LangChain (gpt-3.5-turbo)
Sam Witteveen
StableVicuna: The New King of Open ChatGPTs?
Sam Witteveen
WizardLM: Evolving Instruction Datasets to Create a Better Model
Sam Witteveen
LaMini-LM - Mini Models Maxi Data!
Sam Witteveen
Finding the Best Free ChatGPT
Sam Witteveen
MPT-7B - The First Commercially Usable Fully Trained LLaMA Style Model
Sam Witteveen
LangChain Retrieval QA Over Multiple Files with ChromaDB
Sam Witteveen
LangChain Retrieval QA with Instructor Embeddings & ChromaDB for PDFs
Sam Witteveen
LangChain + Retrieval Local LLMs for Retrieval QA - No OpenAI!!!
Sam Witteveen
Transformers Agent - Is this Hugging Face's LangChain Competitor?
Sam Witteveen
StarCoder - The LLM to make you a coding star?
Sam Witteveen
Testing Starcoder for Reasoning with PAL
Sam Witteveen
The New Wizards - Unfiltered & Unaligned
Sam Witteveen
Camel + LangChain for Synthetic Data & Market Research
Sam Witteveen
More on: LLM Foundations
View skill →Related AI Lessons
⚡
⚡
⚡
⚡
You Are Not Behind. The World Is.
Medium · AI
Career choice with the advent of AI - pure Computer Science or learn software with a background of core engineering area
Dev.to AI
The AI Hype Cycle: Calm Before the Next Breakthrough?
Medium · Programming
AI won’t replace scientists. It will make the current model of science obsolete
Medium · Data Science
🎓
Tutor Explanation
DeepCamp AI