Ollama - Loading Custom Models

Sam Witteveen · Beginner ·🏭 MLOps & LLMOps ·2y ago

Key Takeaways

The video demonstrates how to load custom models into Ollama, specifically the Jackalope 7B model, a fine-tuning of the Mistral 7B model, using the GGUF format and the LLaMA CPP project.

Full Transcript

okay so in the previous AMA videos we've looked at how to set it up and stuff like that and we've also come in here and looked at how to install different models in here the challenge is what if you actually don't see the model that you want in here and for example if it's a fine tune of one of the models that are in here but you don't actually see the model that you want in here so don't despair it's actually quite easy to install custom models into a llama as well so in this video Let's look at how to do that all right so the custom model that I'm going to install here is a model called Jackal Loop so I've been playing with this in collab I might actually make a full video just about this model but for the sake of this video we're going to install this into olama so you can see that this is basically a 7B model it's a fine-tuning of the mistal 7B model in here and what we want to do is we actually want to get the quantized version of this so if we come down the one that we we want is this ggf here so if I click into this you can see that the bloke has converted the weights from this model into this GG UF format so ggf is used to be gml and this is quantized version of models so this is the project called llama CPP it allows us to basically run models it's what is using for models underneath the hood next up we're going to go into and find the files versions of this model and here I've got a bunch of different choices so basically these represent the quality of the model there different quantizations in here I'm going to go for a reasonably big one I'm going to go for this q6k in here but you might want to pick a different one and I'm just going to click download here now if you're using git and you're quite comfortable with Git of course you can just do you can just pull this down via git as well but here we can just come in here we can download it so that's going to take a little bit of time to download but what I'm going to do is download that and I'm going to put that in my models folder so that I can then start processing it okay now the model is actually downloaded we can actually see it here in my models folder here which I've made so you can see I've got some other models that I've downloaded last week and I made custom versions of last week so the next step is that we want to make a model file so if we come in and have a look at the Alama model file we can see you know what the parameters are and the general thing is going to be from and rather than us put in sort of llama 2 or something here we're actually going to point to the file that we downloaded where they're going to set up our template Etc and then we're going to use the model file to actually create our model so let's do that so I'm just going to make a text file for a model file in here okay so you can see here I've just sort of pre-filled the model file in here so we're basically saying from and we're pointing to the checkpoint that we just downloaded the model weights that we just downloaded so remember these are the quantized version of the Jackalope 7B model in here we need to put in a template for a system prompt so this is the template that they're using here and then the system prompts people can actually fill that out when they're actually running the model in here so okay I'm going to save this okay so I saved that out and we can see now I've got my model file there now you can see that I'm going to be making a model called Jackalope using the model file Jackalope that we've got here so if I run that you'll see it passes the model file it's looking for a model it's going to create the various layers it's going to create system layer Etc and then it's going to write the weights and then finally at the end we'll see this success here now if I come in and have a look at my models under model list you'll see that I've got the Hogwarts model which remember is just a different prompt for l to and I've now got this Jackalope model in here so if I want to use this model I can just come in and say okay Alama run Jackalope and it will start up the model you can see it's running and now I can use the model just like I would any other model in here so if I want to ask it a question I can do that I can converse with it just like normal if I want to see the commands of what I can do I can basically come down and see that and you'll see that it's just like normal for these particular things so this is basically how to set up a custom model how to use it there probably are some models that are not going to work but certainly all the fine tunes of like llama 2 the fine tunes of mistel even the Falcon models Etc will work for this so this is something that you can definitely use to basically start trying out different models that you see on the hugging face Hub there anyway as always if You' got any questions or any comments put them in the comment section below please click like And subscribe it will help people see the video I will talk to you in the next video bye for now

Original Description

Jackalope7B. - https://huggingface.co/openaccess-ai-collective/jackalope-7b GGUF versions - https://huggingface.co/TheBloke/jackalope-7B-GGUF/tree/main For more tutorials on using LLMs and building Agents, check out my Patreon: Patreon: https://www.patreon.com/SamWitteveen Twitter: https://twitter.com/Sam_Witteveen My Links: Linkedin: https://www.linkedin.com/in/samwitteveen/ Github: https://github.com/samwit/langchain-tutorials (updated) https://github.com/samwit/llm-tutorials 00:00 Intro 00:31 Jackalope 7B 01:10 Jackalope 7B GGUF 02:21 Make a model file
Watch on YouTube ↗ (saves to browser)
Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from Sam Witteveen · Sam Witteveen · 0 of 60

← Previous Next →
1 LangChain Basics Tutorial #1 - LLMs & PromptTemplates with Colab
LangChain Basics Tutorial #1 - LLMs & PromptTemplates with Colab
Sam Witteveen
2 LangChain Basics Tutorial #2 Tools and Chains
LangChain Basics Tutorial #2 Tools and Chains
Sam Witteveen
3 ChatGPT API Announcement & Code Walkthrough with LangChain
ChatGPT API Announcement & Code Walkthrough with LangChain
Sam Witteveen
4 Trying Out Flan 20B with UL2 - Working in Colab with 8Bit Inference
Trying Out Flan 20B with UL2 - Working in Colab with 8Bit Inference
Sam Witteveen
5 LangChain - Conversations with Memory (explanation & code walkthrough)
LangChain - Conversations with Memory (explanation & code walkthrough)
Sam Witteveen
6 LangChain Chat with Flan20B
LangChain Chat with Flan20B
Sam Witteveen
7 LangChain - Using Hugging Face Models locally (code walkthrough)
LangChain - Using Hugging Face Models locally (code walkthrough)
Sam Witteveen
8 PAL : Program-aided Language Models with LangChain code
PAL : Program-aided Language Models with LangChain code
Sam Witteveen
9 Building a Summarization System with LangChain and GPT-3 - Part 1
Building a Summarization System with LangChain and GPT-3 - Part 1
Sam Witteveen
10 Building a Summarization System with LangChain and GPT-3 - Part 2
Building a Summarization System with LangChain and GPT-3 - Part 2
Sam Witteveen
11 Microsoft's Visual ChatGPT using LangChain
Microsoft's Visual ChatGPT using LangChain
Sam Witteveen
12 Building a Summarization System with LangChain - Part 3 Using ChatGPT Turbo
Building a Summarization System with LangChain - Part 3 Using ChatGPT Turbo
Sam Witteveen
13 LangChain Agents - Joining Tools and Chains with Decisions
LangChain Agents - Joining Tools and Chains with Decisions
Sam Witteveen
14 Investigating Alpaca 7B - Finetuned LLaMa LLM
Investigating Alpaca 7B - Finetuned LLaMa LLM
Sam Witteveen
15 Comparing LLMs with LangChain
Comparing LLMs with LangChain
Sam Witteveen
16 Running Alpaca7B in Colab
Running Alpaca7B in Colab
Sam Witteveen
17 How to finetune your own Alpaca 7B
How to finetune your own Alpaca 7B
Sam Witteveen
18 How to make a custom dataset like Alpaca7B
How to make a custom dataset like Alpaca7B
Sam Witteveen
19 Understanding Constitutional AI - the paper and key concepts
Understanding Constitutional AI - the paper and key concepts
Sam Witteveen
20 Using Constitutional AI in LangChain
Using Constitutional AI in LangChain
Sam Witteveen
21 Talking to Alpaca with LangChain - Creating an Alpaca Chatbot
Talking to Alpaca with LangChain - Creating an Alpaca Chatbot
Sam Witteveen
22 Text-to-video-synthesis with Diffusers and Colab
Text-to-video-synthesis with Diffusers and Colab
Sam Witteveen
23 Meet Dolly the new Alpaca model
Meet Dolly the new Alpaca model
Sam Witteveen
24 Checking out the Cerebras-GPT family of models
Checking out the Cerebras-GPT family of models
Sam Witteveen
25 A Step-by-Step Guide to Fine-Tuning Your Dolly Model (tutorial)
A Step-by-Step Guide to Fine-Tuning Your Dolly Model (tutorial)
Sam Witteveen
26 Is GPT4All your new personal ChatGPT?
Is GPT4All your new personal ChatGPT?
Sam Witteveen
27 Raven - RWKV-7B RNN's LLM Strikes Back
Raven - RWKV-7B RNN's LLM Strikes Back
Sam Witteveen
28 Talk to your CSV & Excel with LangChain
Talk to your CSV & Excel with LangChain
Sam Witteveen
29 Vicuna - 90% of ChatGPT quality by using a new dataset?
Vicuna - 90% of ChatGPT quality by using a new dataset?
Sam Witteveen
30 Koala Revealed: The ChatGPT Alternative You Need to Know! 🔍
Koala Revealed: The ChatGPT Alternative You Need to Know! 🔍
Sam Witteveen
31 Running Koala for free in Colab. Your own personal ChatGPT? (tutorial)
Running Koala for free in Colab. Your own personal ChatGPT? (tutorial)
Sam Witteveen
32 BabyAGI: Discover the Power of Task-Driven Autonomous Agents!
BabyAGI: Discover the Power of Task-Driven Autonomous Agents!
Sam Witteveen
33 Auto-GPT - How to Automate a Task Based AI with GPT-4
Auto-GPT - How to Automate a Task Based AI with GPT-4
Sam Witteveen
34 Improve your BabyAGI with LangChain
Improve your BabyAGI with LangChain
Sam Witteveen
35 Generative Agents - Deep Dive and GPT-4 Recreation
Generative Agents - Deep Dive and GPT-4 Recreation
Sam Witteveen
36 GPT4ALLv2: The Improvements and Drawbacks You Need to Know!
GPT4ALLv2: The Improvements and Drawbacks You Need to Know!
Sam Witteveen
37 Dolly 2.0 by Databricks: Open for Business but is it  Ready to Impress!
Dolly 2.0 by Databricks: Open for Business but is it Ready to Impress!
Sam Witteveen
38 Red Pajama - Operation: Freeing LLaMA
Red Pajama - Operation: Freeing LLaMA
Sam Witteveen
39 Investigating Open Assistant - Models, Datasets and Addons
Investigating Open Assistant - Models, Datasets and Addons
Sam Witteveen
40 Investigating MiniGPT-4 - The Secret behind GPT-V?
Investigating MiniGPT-4 - The Secret behind GPT-V?
Sam Witteveen
41 Stable LM 3B - The new tiny kid on the block.
Stable LM 3B - The new tiny kid on the block.
Sam Witteveen
42 Bard can now code and put that code in Colab for you.
Bard can now code and put that code in Colab for you.
Sam Witteveen
43 Checking out Bark: a Text to Speech system by Suno AI
Checking out Bark: a Text to Speech system by Suno AI
Sam Witteveen
44 Fine-tuning LLMs with PEFT and LoRA
Fine-tuning LLMs with PEFT and LoRA
Sam Witteveen
45 Master PDF Chat with LangChain - Your essential guide to queries on documents
Master PDF Chat with LangChain - Your essential guide to queries on documents
Sam Witteveen
46 Using LangChain with DuckDuckGO Wikipedia & PythonREPL Tools
Using LangChain with DuckDuckGO Wikipedia & PythonREPL Tools
Sam Witteveen
47 Building Custom Tools and Agents with LangChain (gpt-3.5-turbo)
Building Custom Tools and Agents with LangChain (gpt-3.5-turbo)
Sam Witteveen
48 StableVicuna: The New King of Open ChatGPTs?
StableVicuna: The New King of Open ChatGPTs?
Sam Witteveen
49 WizardLM: Evolving Instruction Datasets to Create a Better Model
WizardLM: Evolving Instruction Datasets to Create a Better Model
Sam Witteveen
50 LaMini-LM - Mini Models Maxi Data!
LaMini-LM - Mini Models Maxi Data!
Sam Witteveen
51 Finding the Best Free ChatGPT
Finding the Best Free ChatGPT
Sam Witteveen
52 MPT-7B - The First Commercially Usable Fully Trained LLaMA Style Model
MPT-7B - The First Commercially Usable Fully Trained LLaMA Style Model
Sam Witteveen
53 LangChain Retrieval QA Over Multiple Files with ChromaDB
LangChain Retrieval QA Over Multiple Files with ChromaDB
Sam Witteveen
54 LangChain Retrieval QA with Instructor Embeddings & ChromaDB for PDFs
LangChain Retrieval QA with Instructor Embeddings & ChromaDB for PDFs
Sam Witteveen
55 LangChain + Retrieval Local LLMs for Retrieval QA - No OpenAI!!!
LangChain + Retrieval Local LLMs for Retrieval QA - No OpenAI!!!
Sam Witteveen
56 Transformers Agent - Is this Hugging Face's LangChain Competitor?
Transformers Agent - Is this Hugging Face's LangChain Competitor?
Sam Witteveen
57 StarCoder - The LLM to make you a coding star?
StarCoder - The LLM to make you a coding star?
Sam Witteveen
58 Testing Starcoder for Reasoning with PAL
Testing Starcoder for Reasoning with PAL
Sam Witteveen
59 The New Wizards - Unfiltered & Unaligned
The New Wizards - Unfiltered & Unaligned
Sam Witteveen
60 Camel + LangChain for Synthetic Data & Market Research
Camel + LangChain for Synthetic Data & Market Research
Sam Witteveen

This video teaches how to load custom models into Ollama, including the Jackalope 7B model, and how to use them for fine-tuning and quantization. It covers the process of downloading and installing custom models, creating model files, and running the models in Ollama.

Key Takeaways
  1. Download the custom model from Hugging Face
  2. Create a model file for the custom model
  3. Point to the downloaded model weights in the model file
  4. Save the model file and run the model in Ollama
  5. Test the custom model and use it for fine-tuning and quantization
💡 The GGUF format allows for quantized models to be used in Ollama, enabling more efficient and accurate fine-tuning and inference.

Related AI Lessons

DevOps Took 10 Years to Mature.
MLOps is distinct from DevOps and solves unique problems, requiring a different approach
Medium · DevOps
Praesto: A Kubernetes Operator for Node-Local ML Model Caching with CSI
Learn how Praesto, a Kubernetes Operator, optimizes ML model caching for Node-Local storage with CSI, reducing costs and improving performance
Medium · DevOps
Beyond `ollama run`: Production-Ready DeepSeek R1 Deployment with vLLM and Nginx
Learn to deploy DeepSeek R1 with vLLM and Nginx for production-ready environments, moving beyond local development
Dev.to · Shannon Dias
MCP Health Check: Building Production Monitoring for Your MCP Server — What I Learned After 84 Production Outages
Learn to build production monitoring for your MCP server to minimize outages and ensure smooth operation
Dev.to AI

Chapters (4)

Intro
0:31 Jackalope 7B
1:10 Jackalope 7B GGUF
2:21 Make a model file
Up next
Pole Pruner How A Rope Lever Shears High Branches
Innoforge Studio
Watch →