Stable LM 3B - The new tiny kid on the block.

Sam Witteveen · Beginner ·📰 AI News & Updates ·3y ago

Skills: LLM Foundations80%Fine-tuning LLMs70%

Key Takeaways

The video discusses the release of Stability AI's Stable LM suite, specifically the 3B model, which is a tuned version of the language model trained on 800 billion tokens, and its potential for fine-tuning and commercial use with tools like Hugging Face and PyTorch.

Full Transcript

okay so stability AI has released their first lot of language models in what they're calling the stable LM suite and different blog posts we've kind of known that these were coming for a while there's been talk on Twitter about the founder of stability AI had mentioned that their training a bunch of models but finally we've got a blog post with info about them and we've also got some of the models to play with and there's some really interesting things in here one of the key things that I'm going to show you in this video is that they've released this three billion model so most the models that we've been looking at have sort of started at seven had something at sort of 13 or 14 and then gone 30 65. hear what they're doing it looks like is they're starting at 3 billion they have a seven billion parameter model then they're up to around about 15 and then going up to 65 for this now I should stress that the models that they've released so far are not fully trained which is one of the amazing things about this so they're calling these the Alpha version and the plan is for these to train on I think at least 1.5 trillion tokens of content if not more and currently they've been trained on I think it's 800 billion tokens and just put that in context the pythia models are trained on around 300 to 400 billion tokens and most the other non-lama models remember llama models were trained on a trillion tokens for the smaller ones and 1.4 trillion for the bigger ones but mostly other models out there even things like gpt3 was trained on sort of 300 billion tokens so this is really cool that they're out there and it's just a taster of what's to come when you look at this so blog post talks a little bit about their sort of their thinking behind this that their goal is to open source it they're releasing base models and they're releasing some fine tunings of those base models currently the one I'm going to show you today is not for commercial purpose just because it uses some of the data sets for the fine tuning that were not commercially available this will I think this will be fixed in the next few days that you'll see versions of these where they're trained just on the dolly true data set or on just open data sets so that you can use these fully certainly the base models for all of these you're able to use them for commercial purposes on top of the blog post they've also put out you can come to their GitHub and you can read a little bit about it and you can see here is what's actually been released and we can see that here we've got checkpoints for a base model Alpha and then also a tuned Alpha and these have been trained on 800 billion parameters here they've also put up a hugging face demo if you want to try that I'm going to go through running the small one in collab so that we can check out the speed of it and because they've done this in conjunction with a Luther AI these models are very much geared to work with hugging phase should be adaptable to things out of the box so I think very quickly you'll see 4-bit versions you'll see you know run on your computer versions all this kind of stuff in the near future so let's jump in and have a look at the actual model itself so this is the three billion tuned one from memory it's tuned on alpaca on shared GPT on Dolly on a few different data sets now of course the share GPT in alpaca are not available for commercial use so those ones that's why this model can't be used for commercial use but if you were to fine-tune the base model and perhaps we'll do that in a video in the next day or two if people want to see that you would then be able to get results quite easily and I'm sure people will be fine-tuning them and sticking those up on hugging face Hub all over the place okay if we come in and have a look at this we can see that I'm just using a T4 I'm not using any impressive GPU and I'm using around about half the T4 so you will be able to run this so for all of the people who've been asking me questions about or what can I run on my such and such old GPU there's a good chance that this will run on that right we've set it up so they've got a system prompt in here interestingly the system prompt I'm thinking this is because it's the tune version has got actually reference to the Alpha version in there oh another big thing that I forgot to mention in here is that not only are these being trained on a lot more tokens the sequence length is a lot longer so the sequence length of these is 4090 six I think there's somewhere perhaps not in here but the sequence length for these is much longer than even llama so these definitely going to become the sort of main models of choice at least for the time being with the red pajama Group Training models also it would be interesting to compare those when they finally come out as well but this is really sort of thrown down the gauntlet for everyone to sort of pick it up and run with these so you can see here I've basically just set up a little function to use their system interestingly they're using the tokens are similar to a GPT where you've got a system input then you've got a user and you're an AI assistant going on here first off I've just put in a few of the things that we would normally ask it so write a note to some Altman saying that they should open source gpt4 now it's definitely not as good as like the koala 13 billion or something like that right we're definitely not getting things up there but this is a tiny model compared to that it wasn't that long ago that you would be kind of amazed if this kind of model could be influenced could actually stay on track like you want it yeah so anyway it writes an email it doesn't talk too much about open sourcing it doesn't seem to really have the concept of Open Source in it but it does say your Sam Altman wanted to take a moment to thank you for your work on the insert project name the expertise in language generation text to text technology has been valuable to us it's good text that it's generating out and don't forget this is only sort of Alpha version we would expect this to improve with the next release that they have of this trying to get it to write stories again it's definitely on track of understanding the concept of writing a stories for that so I asked it to basically write a story about a koala who could beat all the camera Lids at pool it's kind of called The Koala camel which I'm not sure that that's not ideal and then it really hasn't got the concept of playing pool it's more a swimming pool in this case the really short one that we always test is what is the capital of England yes gets that fine no problem at all and then sort of the traditional ones one of the difference between llamas alpacas again it's probably not the quality of it is not going to be as good as the bigger models but it's definitely coherent it's doing what we're asking for it to do those kind of things and then finally when I ask it okay are you a fan of The Simpsons tell me a bit about Homer if you remember that I asked this one to see is it going to say no I am an AI model I don't have preferences so nicely it says of course I'm a fan of the show I'm always eager to learn more about the Visionary writers who create the show and particularly if under the writing style and humor of Mr Burns the Beloved voice of Springfield anyway Homer Simpson is indeed a very special individual so it's got the concept of these two things where one I'm asking it about are you a friend of the show second one tell us a bit about homer so remember these models are not going to have as many facts in them and they're not going to be as accurate on the facts just because they're so small what will be interesting is using some of these models with things like Lang chain to call out to search or other things like that where it's getting the facts externally and just using this model for your language for your grammar for your phrasing those kind of things so I put this notebook another one that I've also put in here is the exact same notebook but loading it in eight bits and you can see when we load it in 8-bit we're using even less Ram here we're maybe not getting as huge a win as we do with some of the other ones but it's still going quite well we can also see just here I put the measuring the time of these things we can see that it's generating text very quickly 20 seconds a response for this whereas you know and this is on the small GPU right so I'm normally when you see me doing the timings I'm using an a100 which is much faster bigger GPU here you can see these are very quick what's the capital of London one it's answering that in under a second looks of it so that's really good to see the vacuums the comparison one again around 20 seconds and the Simpson one 15 seconds so this is definitely a model that you're going to be able to use locally and to run on your computer to run for a variety of different things really going to be interesting to see once they've finished training this to the 1.5 trillion or there was even talk of maybe a 3 trillion parameter version how much better is that going to get and then once it's sort of fine-tuned for things like Lang chain and using external data sources how much is that going to get because this is definitely now in the realm of what we kind of want for something to put it into production anyway have a play with it this is a quick one I will do an up another video coming up of the seven billion one we'll look at that more in depth and maybe look at the untuned version versus the tuned version as well in that as always if you've got any questions please put them in the comments below if you found this useful please click like And subscribe I will see you in the next video bye for now

Original Description

Colab 3B: https://colab.research.google.com/drive/1WEug0384n6KveJNWyrovMcapzCwKngMp?usp=sharing Colab 3B 8bit: https://colab.research.google.com/drive/10mFpi-YZyO6uzeKswEfP3YZQ9MqGffxI?usp=sharing Blog post: https://stability.ai/blog/stability-ai-launches-the-first-of-its-stablelm-suite-of-language-models In this video, I look at the announcement by Stability AI for the new suite of StableLM models and we examine the smallest 3B one which has been fine-tuned for instruction responses. For more tutorials on using LLMs and building Agents, check out my Patreon: Patreon: https://www.patreon.com/SamWitteveen Twitter: https://twitter.com/Sam_Witteveen My Links: Linkedin: https://www.linkedin.com/in/samwitteveen/ Github: https://github.com/samwit/langchain-tutorials https://github.com/samwit/llm-tutorials

Watch on YouTube ↗ (saves to browser)

Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from Sam Witteveen · Sam Witteveen · 41 of 60

← Previous Next →

LangChain Basics Tutorial #1 - LLMs & PromptTemplates with Colab

LangChain Basics Tutorial #1 - LLMs & PromptTemplates with Colab

LangChain Basics Tutorial #2 Tools and Chains

LangChain Basics Tutorial #2 Tools and Chains

ChatGPT API Announcement & Code Walkthrough with LangChain

ChatGPT API Announcement & Code Walkthrough with LangChain

Trying Out Flan 20B with UL2 - Working in Colab with 8Bit Inference

Trying Out Flan 20B with UL2 - Working in Colab with 8Bit Inference

LangChain - Conversations with Memory (explanation & code walkthrough)

LangChain - Conversations with Memory (explanation & code walkthrough)

LangChain Chat with Flan20B

LangChain Chat with Flan20B

LangChain - Using Hugging Face Models locally (code walkthrough)

LangChain - Using Hugging Face Models locally (code walkthrough)

PAL : Program-aided Language Models with LangChain code

PAL : Program-aided Language Models with LangChain code

Building a Summarization System with LangChain and GPT-3 - Part 1

Building a Summarization System with LangChain and GPT-3 - Part 1

Building a Summarization System with LangChain and GPT-3 - Part 2

Building a Summarization System with LangChain and GPT-3 - Part 2

Microsoft's Visual ChatGPT using LangChain

Microsoft's Visual ChatGPT using LangChain

Building a Summarization System with LangChain - Part 3 Using ChatGPT Turbo

Building a Summarization System with LangChain - Part 3 Using ChatGPT Turbo

LangChain Agents - Joining Tools and Chains with Decisions

LangChain Agents - Joining Tools and Chains with Decisions

Investigating Alpaca 7B - Finetuned LLaMa LLM

Investigating Alpaca 7B - Finetuned LLaMa LLM

Comparing LLMs with LangChain

Comparing LLMs with LangChain

Running Alpaca7B in Colab

Running Alpaca7B in Colab

How to finetune your own Alpaca 7B

How to finetune your own Alpaca 7B

How to make a custom dataset like Alpaca7B

How to make a custom dataset like Alpaca7B

Understanding Constitutional AI - the paper and key concepts

Understanding Constitutional AI - the paper and key concepts

Using Constitutional AI in LangChain

Using Constitutional AI in LangChain

Talking to Alpaca with LangChain - Creating an Alpaca Chatbot

Talking to Alpaca with LangChain - Creating an Alpaca Chatbot

Text-to-video-synthesis with Diffusers and Colab

Text-to-video-synthesis with Diffusers and Colab

Meet Dolly the new Alpaca model

Meet Dolly the new Alpaca model

Checking out the Cerebras-GPT family of models

Checking out the Cerebras-GPT family of models

A Step-by-Step Guide to Fine-Tuning Your Dolly Model (tutorial)

A Step-by-Step Guide to Fine-Tuning Your Dolly Model (tutorial)

Is GPT4All your new personal ChatGPT?

Is GPT4All your new personal ChatGPT?

Raven - RWKV-7B RNN's LLM Strikes Back

Raven - RWKV-7B RNN's LLM Strikes Back

Talk to your CSV & Excel with LangChain

Talk to your CSV & Excel with LangChain

Vicuna - 90% of ChatGPT quality by using a new dataset?

Vicuna - 90% of ChatGPT quality by using a new dataset?

Koala Revealed: The ChatGPT Alternative You Need to Know! 🔍

Koala Revealed: The ChatGPT Alternative You Need to Know! 🔍

Running Koala for free in Colab. Your own personal ChatGPT? (tutorial)

Running Koala for free in Colab. Your own personal ChatGPT? (tutorial)

BabyAGI: Discover the Power of Task-Driven Autonomous Agents!

BabyAGI: Discover the Power of Task-Driven Autonomous Agents!

Auto-GPT - How to Automate a Task Based AI with GPT-4

Auto-GPT - How to Automate a Task Based AI with GPT-4

Improve your BabyAGI with LangChain

Improve your BabyAGI with LangChain

Generative Agents - Deep Dive and GPT-4 Recreation

Generative Agents - Deep Dive and GPT-4 Recreation

GPT4ALLv2: The Improvements and Drawbacks You Need to Know!

GPT4ALLv2: The Improvements and Drawbacks You Need to Know!

Dolly 2.0 by Databricks: Open for Business but is it Ready to Impress!

Dolly 2.0 by Databricks: Open for Business but is it Ready to Impress!

Red Pajama - Operation: Freeing LLaMA

Red Pajama - Operation: Freeing LLaMA

Investigating Open Assistant - Models, Datasets and Addons

Investigating Open Assistant - Models, Datasets and Addons

Investigating MiniGPT-4 - The Secret behind GPT-V?

Investigating MiniGPT-4 - The Secret behind GPT-V?

Stable LM 3B - The new tiny kid on the block.

Stable LM 3B - The new tiny kid on the block.

Bard can now code and put that code in Colab for you.

Bard can now code and put that code in Colab for you.

Checking out Bark: a Text to Speech system by Suno AI

Checking out Bark: a Text to Speech system by Suno AI

Fine-tuning LLMs with PEFT and LoRA

Fine-tuning LLMs with PEFT and LoRA

Master PDF Chat with LangChain - Your essential guide to queries on documents

Master PDF Chat with LangChain - Your essential guide to queries on documents

Using LangChain with DuckDuckGO Wikipedia & PythonREPL Tools

Using LangChain with DuckDuckGO Wikipedia & PythonREPL Tools

Building Custom Tools and Agents with LangChain (gpt-3.5-turbo)

Building Custom Tools and Agents with LangChain (gpt-3.5-turbo)

StableVicuna: The New King of Open ChatGPTs?

StableVicuna: The New King of Open ChatGPTs?

WizardLM: Evolving Instruction Datasets to Create a Better Model

WizardLM: Evolving Instruction Datasets to Create a Better Model

LaMini-LM - Mini Models Maxi Data!

LaMini-LM - Mini Models Maxi Data!

Finding the Best Free ChatGPT

Finding the Best Free ChatGPT

MPT-7B - The First Commercially Usable Fully Trained LLaMA Style Model

MPT-7B - The First Commercially Usable Fully Trained LLaMA Style Model

LangChain Retrieval QA Over Multiple Files with ChromaDB

LangChain Retrieval QA Over Multiple Files with ChromaDB

LangChain Retrieval QA with Instructor Embeddings & ChromaDB for PDFs

LangChain Retrieval QA with Instructor Embeddings & ChromaDB for PDFs

LangChain + Retrieval Local LLMs for Retrieval QA - No OpenAI!!!

LangChain + Retrieval Local LLMs for Retrieval QA - No OpenAI!!!

Transformers Agent - Is this Hugging Face's LangChain Competitor?

Transformers Agent - Is this Hugging Face's LangChain Competitor?

StarCoder - The LLM to make you a coding star?

StarCoder - The LLM to make you a coding star?

Testing Starcoder for Reasoning with PAL

Testing Starcoder for Reasoning with PAL

The New Wizards - Unfiltered & Unaligned

The New Wizards - Unfiltered & Unaligned

Camel + LangChain for Synthetic Data & Market Research

Camel + LangChain for Synthetic Data & Market Research

The video introduces the Stable LM 3B model, its capabilities, and potential for fine-tuning, and discusses its release and planned future developments, highlighting the importance of staying updated on AI news and developments.

Key Takeaways

Access the Colab notebooks for Stable LM 3B and 3B 8bit
Explore the Hugging Face Hub for model repositories
Fine-tune the Stable LM 3B model using external data sources
Run the model locally on a computer
Monitor future developments of the 1.5 trillion and 3 trillion parameter versions

💡 The Stable LM 3B model's small size and ability to run on local computers make it an accessible option for developers and researchers, with potential for improvement through fine-tuning and external data sources.

🔒 Pro feature: Ask AI to explain this lesson →

More on: LLM Foundations

View skill →

Getting Started with Vertex AI Gemini 1.5 Flash

I TRAINED AN AI TO SOLVE 2+2 (w/ Live Coding)

I TRAINED AN AI TO SOLVE 2+2 (w/ Live Coding)

How to use the ChatGPT API with Python!!

How to use the ChatGPT API with Python!!

Nicholas Renotte

Gemini 2.5: Create an interactive plot of economic data

Gemini 2.5: Create an interactive plot of economic data

Google DeepMind

LangChain Chatbots: Building a Personalized AI Assistant

LangChain Chatbots: Building a Personalized AI Assistant

Analytics Vidhya

Auto-generating meeting notes with Python

Auto-generating meeting notes with Python

Related AI Lessons

Critical thinking in the AI Era

Develop critical thinking skills to navigate the AI era effectively and make informed decisions

Medium · Data Science

Anthropic Just Passed OpenAI Among Business Users. Here’s What That Means for Your Stack.

Anthropic surpasses OpenAI in business user adoption, impacting the AI stack for enterprises

Introducing beLithe: AI Courses Built for Real People, Not Engineers

Learn about beLithe, an AI course platform designed for non-technical individuals, and its mission to make AI accessible to everyone

AI: Energy Taker or Energy Maker

Learn how rising data center energy demands can catalyze a clean energy transition and why it matters for sustainable AI development

Channels Television