Ollama - Loading Custom Models

Sam Witteveen · Beginner ·🏭 MLOps & LLMOps ·2y ago

Skills: LLM Foundations80%

Key Takeaways

The video demonstrates how to load custom models into Ollama, specifically the Jackalope 7B model, a fine-tuning of the Mistral 7B model, using the GGUF format and the LLaMA CPP project.

Full Transcript

okay so in the previous AMA videos we've looked at how to set it up and stuff like that and we've also come in here and looked at how to install different models in here the challenge is what if you actually don't see the model that you want in here and for example if it's a fine tune of one of the models that are in here but you don't actually see the model that you want in here so don't despair it's actually quite easy to install custom models into a llama as well so in this video Let's look at how to do that all right so the custom model that I'm going to install here is a model called Jackal Loop so I've been playing with this in collab I might actually make a full video just about this model but for the sake of this video we're going to install this into olama so you can see that this is basically a 7B model it's a fine-tuning of the mistal 7B model in here and what we want to do is we actually want to get the quantized version of this so if we come down the one that we we want is this ggf here so if I click into this you can see that the bloke has converted the weights from this model into this GG UF format so ggf is used to be gml and this is quantized version of models so this is the project called llama CPP it allows us to basically run models it's what is using for models underneath the hood next up we're going to go into and find the files versions of this model and here I've got a bunch of different choices so basically these represent the quality of the model there different quantizations in here I'm going to go for a reasonably big one I'm going to go for this q6k in here but you might want to pick a different one and I'm just going to click download here now if you're using git and you're quite comfortable with Git of course you can just do you can just pull this down via git as well but here we can just come in here we can download it so that's going to take a little bit of time to download but what I'm going to do is download that and I'm going to put that in my models folder so that I can then start processing it okay now the model is actually downloaded we can actually see it here in my models folder here which I've made so you can see I've got some other models that I've downloaded last week and I made custom versions of last week so the next step is that we want to make a model file so if we come in and have a look at the Alama model file we can see you know what the parameters are and the general thing is going to be from and rather than us put in sort of llama 2 or something here we're actually going to point to the file that we downloaded where they're going to set up our template Etc and then we're going to use the model file to actually create our model so let's do that so I'm just going to make a text file for a model file in here okay so you can see here I've just sort of pre-filled the model file in here so we're basically saying from and we're pointing to the checkpoint that we just downloaded the model weights that we just downloaded so remember these are the quantized version of the Jackalope 7B model in here we need to put in a template for a system prompt so this is the template that they're using here and then the system prompts people can actually fill that out when they're actually running the model in here so okay I'm going to save this okay so I saved that out and we can see now I've got my model file there now you can see that I'm going to be making a model called Jackalope using the model file Jackalope that we've got here so if I run that you'll see it passes the model file it's looking for a model it's going to create the various layers it's going to create system layer Etc and then it's going to write the weights and then finally at the end we'll see this success here now if I come in and have a look at my models under model list you'll see that I've got the Hogwarts model which remember is just a different prompt for l to and I've now got this Jackalope model in here so if I want to use this model I can just come in and say okay Alama run Jackalope and it will start up the model you can see it's running and now I can use the model just like I would any other model in here so if I want to ask it a question I can do that I can converse with it just like normal if I want to see the commands of what I can do I can basically come down and see that and you'll see that it's just like normal for these particular things so this is basically how to set up a custom model how to use it there probably are some models that are not going to work but certainly all the fine tunes of like llama 2 the fine tunes of mistel even the Falcon models Etc will work for this so this is something that you can definitely use to basically start trying out different models that you see on the hugging face Hub there anyway as always if You' got any questions or any comments put them in the comment section below please click like And subscribe it will help people see the video I will talk to you in the next video bye for now

Original Description

Jackalope7B. - https://huggingface.co/openaccess-ai-collective/jackalope-7b GGUF versions - https://huggingface.co/TheBloke/jackalope-7B-GGUF/tree/main For more tutorials on using LLMs and building Agents, check out my Patreon: Patreon: https://www.patreon.com/SamWitteveen Twitter: https://twitter.com/Sam_Witteveen My Links: Linkedin: https://www.linkedin.com/in/samwitteveen/ Github: https://github.com/samwit/langchain-tutorials (updated) https://github.com/samwit/llm-tutorials 00:00 Intro 00:31 Jackalope 7B 01:10 Jackalope 7B GGUF 02:21 Make a model file

Watch on YouTube ↗ (saves to browser)

Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from Sam Witteveen · Sam Witteveen · 0 of 60

← Previous Next →

LangChain Basics Tutorial #1 - LLMs & PromptTemplates with Colab

LangChain Basics Tutorial #1 - LLMs & PromptTemplates with Colab

LangChain Basics Tutorial #2 Tools and Chains

LangChain Basics Tutorial #2 Tools and Chains

ChatGPT API Announcement & Code Walkthrough with LangChain

ChatGPT API Announcement & Code Walkthrough with LangChain

Trying Out Flan 20B with UL2 - Working in Colab with 8Bit Inference

Trying Out Flan 20B with UL2 - Working in Colab with 8Bit Inference

LangChain - Conversations with Memory (explanation & code walkthrough)

LangChain - Conversations with Memory (explanation & code walkthrough)

LangChain Chat with Flan20B

LangChain Chat with Flan20B

LangChain - Using Hugging Face Models locally (code walkthrough)

LangChain - Using Hugging Face Models locally (code walkthrough)

PAL : Program-aided Language Models with LangChain code

PAL : Program-aided Language Models with LangChain code

Building a Summarization System with LangChain and GPT-3 - Part 1

Building a Summarization System with LangChain and GPT-3 - Part 1

Building a Summarization System with LangChain and GPT-3 - Part 2

Building a Summarization System with LangChain and GPT-3 - Part 2

Microsoft's Visual ChatGPT using LangChain

Microsoft's Visual ChatGPT using LangChain

Building a Summarization System with LangChain - Part 3 Using ChatGPT Turbo

Building a Summarization System with LangChain - Part 3 Using ChatGPT Turbo

LangChain Agents - Joining Tools and Chains with Decisions

LangChain Agents - Joining Tools and Chains with Decisions

Investigating Alpaca 7B - Finetuned LLaMa LLM

Investigating Alpaca 7B - Finetuned LLaMa LLM

Comparing LLMs with LangChain

Comparing LLMs with LangChain

Running Alpaca7B in Colab

Running Alpaca7B in Colab

How to finetune your own Alpaca 7B

How to finetune your own Alpaca 7B

How to make a custom dataset like Alpaca7B

How to make a custom dataset like Alpaca7B

Understanding Constitutional AI - the paper and key concepts

Understanding Constitutional AI - the paper and key concepts

Using Constitutional AI in LangChain

Using Constitutional AI in LangChain

Talking to Alpaca with LangChain - Creating an Alpaca Chatbot

Talking to Alpaca with LangChain - Creating an Alpaca Chatbot

Text-to-video-synthesis with Diffusers and Colab

Text-to-video-synthesis with Diffusers and Colab

Meet Dolly the new Alpaca model

Meet Dolly the new Alpaca model

Checking out the Cerebras-GPT family of models

Checking out the Cerebras-GPT family of models

A Step-by-Step Guide to Fine-Tuning Your Dolly Model (tutorial)

A Step-by-Step Guide to Fine-Tuning Your Dolly Model (tutorial)

Is GPT4All your new personal ChatGPT?

Is GPT4All your new personal ChatGPT?

Raven - RWKV-7B RNN's LLM Strikes Back

Raven - RWKV-7B RNN's LLM Strikes Back

Talk to your CSV & Excel with LangChain

Talk to your CSV & Excel with LangChain

Vicuna - 90% of ChatGPT quality by using a new dataset?

Vicuna - 90% of ChatGPT quality by using a new dataset?

Koala Revealed: The ChatGPT Alternative You Need to Know! 🔍

Koala Revealed: The ChatGPT Alternative You Need to Know! 🔍

Running Koala for free in Colab. Your own personal ChatGPT? (tutorial)

Running Koala for free in Colab. Your own personal ChatGPT? (tutorial)

BabyAGI: Discover the Power of Task-Driven Autonomous Agents!

BabyAGI: Discover the Power of Task-Driven Autonomous Agents!

Auto-GPT - How to Automate a Task Based AI with GPT-4

Auto-GPT - How to Automate a Task Based AI with GPT-4

Improve your BabyAGI with LangChain

Improve your BabyAGI with LangChain

Generative Agents - Deep Dive and GPT-4 Recreation

Generative Agents - Deep Dive and GPT-4 Recreation

GPT4ALLv2: The Improvements and Drawbacks You Need to Know!

GPT4ALLv2: The Improvements and Drawbacks You Need to Know!

Dolly 2.0 by Databricks: Open for Business but is it Ready to Impress!

Dolly 2.0 by Databricks: Open for Business but is it Ready to Impress!

Red Pajama - Operation: Freeing LLaMA

Red Pajama - Operation: Freeing LLaMA

Investigating Open Assistant - Models, Datasets and Addons

Investigating Open Assistant - Models, Datasets and Addons

Investigating MiniGPT-4 - The Secret behind GPT-V?

Investigating MiniGPT-4 - The Secret behind GPT-V?

Stable LM 3B - The new tiny kid on the block.

Stable LM 3B - The new tiny kid on the block.

Bard can now code and put that code in Colab for you.

Bard can now code and put that code in Colab for you.

Checking out Bark: a Text to Speech system by Suno AI

Checking out Bark: a Text to Speech system by Suno AI

Fine-tuning LLMs with PEFT and LoRA

Fine-tuning LLMs with PEFT and LoRA

Master PDF Chat with LangChain - Your essential guide to queries on documents

Master PDF Chat with LangChain - Your essential guide to queries on documents

Using LangChain with DuckDuckGO Wikipedia & PythonREPL Tools

Using LangChain with DuckDuckGO Wikipedia & PythonREPL Tools

Building Custom Tools and Agents with LangChain (gpt-3.5-turbo)

Building Custom Tools and Agents with LangChain (gpt-3.5-turbo)

StableVicuna: The New King of Open ChatGPTs?

StableVicuna: The New King of Open ChatGPTs?

WizardLM: Evolving Instruction Datasets to Create a Better Model

WizardLM: Evolving Instruction Datasets to Create a Better Model

LaMini-LM - Mini Models Maxi Data!

LaMini-LM - Mini Models Maxi Data!

Finding the Best Free ChatGPT

Finding the Best Free ChatGPT

MPT-7B - The First Commercially Usable Fully Trained LLaMA Style Model

MPT-7B - The First Commercially Usable Fully Trained LLaMA Style Model

LangChain Retrieval QA Over Multiple Files with ChromaDB

LangChain Retrieval QA Over Multiple Files with ChromaDB

LangChain Retrieval QA with Instructor Embeddings & ChromaDB for PDFs

LangChain Retrieval QA with Instructor Embeddings & ChromaDB for PDFs

LangChain + Retrieval Local LLMs for Retrieval QA - No OpenAI!!!

LangChain + Retrieval Local LLMs for Retrieval QA - No OpenAI!!!

Transformers Agent - Is this Hugging Face's LangChain Competitor?

Transformers Agent - Is this Hugging Face's LangChain Competitor?

StarCoder - The LLM to make you a coding star?

StarCoder - The LLM to make you a coding star?

Testing Starcoder for Reasoning with PAL

Testing Starcoder for Reasoning with PAL

The New Wizards - Unfiltered & Unaligned

The New Wizards - Unfiltered & Unaligned

Camel + LangChain for Synthetic Data & Market Research

Camel + LangChain for Synthetic Data & Market Research

This video teaches how to load custom models into Ollama, including the Jackalope 7B model, and how to use them for fine-tuning and quantization. It covers the process of downloading and installing custom models, creating model files, and running the models in Ollama.

Key Takeaways

Download the custom model from Hugging Face
Create a model file for the custom model
Point to the downloaded model weights in the model file
Save the model file and run the model in Ollama
Test the custom model and use it for fine-tuning and quantization

💡 The GGUF format allows for quantized models to be used in Ollama, enabling more efficient and accurate fine-tuning and inference.

🔒 Pro feature: Ask AI to explain this lesson →

More on: LLM Foundations

View skill →

Getting Started with Vertex AI Gemini 1.5 Flash

I TRAINED AN AI TO SOLVE 2+2 (w/ Live Coding)

I TRAINED AN AI TO SOLVE 2+2 (w/ Live Coding)

How to use the ChatGPT API with Python!!

How to use the ChatGPT API with Python!!

Nicholas Renotte

Gemini 2.5: Create an interactive plot of economic data

Gemini 2.5: Create an interactive plot of economic data

Google DeepMind

LangChain Chatbots: Building a Personalized AI Assistant

LangChain Chatbots: Building a Personalized AI Assistant

Analytics Vidhya

Auto-generating meeting notes with Python

Auto-generating meeting notes with Python

Related AI Lessons

DevOps Took 10 Years to Mature.

MLOps is distinct from DevOps and solves unique problems, requiring a different approach

Medium · DevOps

Praesto: A Kubernetes Operator for Node-Local ML Model Caching with CSI

Learn how Praesto, a Kubernetes Operator, optimizes ML model caching for Node-Local storage with CSI, reducing costs and improving performance

Medium · DevOps

Beyond `ollama run`: Production-Ready DeepSeek R1 Deployment with vLLM and Nginx

Learn to deploy DeepSeek R1 with vLLM and Nginx for production-ready environments, moving beyond local development

Dev.to · Shannon Dias

MCP Health Check: Building Production Monitoring for Your MCP Server — What I Learned After 84 Production Outages

Learn to build production monitoring for your MCP server to minimize outages and ensure smooth operation

Chapters (4)

Intro

0:31 Jackalope 7B

1:10 Jackalope 7B GGUF

2:21 Make a model file

Pole Pruner How A Rope Lever Shears High Branches

Innoforge Studio