LLM Quantization with llama.cpp on Free Google Colab | Llama 3.1 | GGUF

TheAILearner · Advanced · 🧠 Large Language Models · 1y ago
In this video, I walk you through the process of quantizing an open-source LLM (Llama 3.1) using the powerful llama.cpp library, all on a free Google Colab environment. The goal of this type of quantization is to produce a model that can run on both CPU and GPU.

Notebook: https://colab.research.google.com/drive/1GmXoZ997XHsd1WTYcB_pPiOvYsfY8nl0?usp=sharing

#llama3.1 #llama3 #llamacpp #gguf #HuggingFace #LLM #ModelQuantization #GoogleColab #MachineLearning #AI #NLP #DeepLearning #ModelOptimization #PythonTutorial #HuggingFaceModels #ColabTutorial #AIModels #QuantizationTutorial #Free…
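For reference, the typical llama.cpp GGUF quantization workflow looks like the sketch below. The model path, output filenames, and the Q4_K_M quantization type are illustrative assumptions; the notebook linked above may differ in details.

```shell
# Sketch of the usual llama.cpp GGUF quantization workflow.
# Model path, filenames, and quant type (Q4_K_M) are illustrative.

# 1. Get llama.cpp, install conversion dependencies, and build the tools
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt
cmake -B build && cmake --build build --config Release

# 2. Convert the Hugging Face checkpoint to a full-precision GGUF file
python convert_hf_to_gguf.py /path/to/Llama-3.1-8B-Instruct \
  --outfile llama-3.1-8b-f16.gguf --outtype f16

# 3. Quantize the GGUF file (here to 4-bit Q4_K_M)
./build/bin/llama-quantize llama-3.1-8b-f16.gguf llama-3.1-8b-Q4_K_M.gguf Q4_K_M

# 4. Run the quantized model on CPU or GPU
./build/bin/llama-cli -m llama-3.1-8b-Q4_K_M.gguf -p "Hello"
```

The quantized Q4_K_M file is several times smaller than the f16 original, which is what makes inference feasible on free Colab hardware or a plain CPU.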