LLM Quantization with llama.cpp on Free Google Colab | Llama 3.1 | GGUF

TheAILearner · Advanced · 🧠 Large Language Models · 1y ago
In this video, I walk you through the process of quantizing an open-source LLM (Llama 3.1) using the powerful llama.cpp library, all on a free Google Colab environment. The goal of this type of quantization is to produce a model that can run on both CPU and GPU.

Notebook: https://colab.research.google.com/drive/1GmXoZ997XHsd1WTYcB_pPiOvYsfY8nl0?usp=sharing

#llama3.1 #llama3 #llamacpp #gguf #HuggingFace #LLM #ModelQuantization #GoogleColab #MachineLearning #AI #NLP #DeepLearning #ModelOptimization #PythonTutorial #HuggingFaceModels #ColabTutorial #AIModels #QuantizationTutorial #Free…
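For reference, the typical llama.cpp GGUF quantization workflow looks like the sketch below. The model path, output filenames, and the Q4_K_M quantization type are illustrative assumptions; the notebook linked above may differ in details.

```shell
# Sketch of the usual llama.cpp GGUF quantization workflow.
# Model path, filenames, and quant type (Q4_K_M) are illustrative.

# 1. Get llama.cpp, install conversion dependencies, and build the tools
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt
cmake -B build && cmake --build build --config Release

# 2. Convert the Hugging Face checkpoint to a full-precision GGUF file
python convert_hf_to_gguf.py /path/to/Llama-3.1-8B-Instruct \
  --outfile llama-3.1-8b-f16.gguf --outtype f16

# 3. Quantize the GGUF file (here to 4-bit Q4_K_M)
./build/bin/llama-quantize llama-3.1-8b-f16.gguf llama-3.1-8b-Q4_K_M.gguf Q4_K_M

# 4. Run the quantized model on CPU or GPU
./build/bin/llama-cli -m llama-3.1-8b-Q4_K_M.gguf -p "Hello"
```

The quantized Q4_K_M file is several times smaller than the f16 original, which is what makes inference feasible on free Colab hardware or a plain CPU.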