How to run Llama-7B on a laptop with 4GB GPU
In this tutorial, we load and make predictions with the Llama-7B model on a laptop with 6GB of free RAM and a 4GB GPU.
GitHub: https://github.com/thushv89/tutorials_deeplearninghero/blob/master/llms/llama_on_laptop.ipynb
llm.int8() paper: https://arxiv.org/pdf/2208.07339.pdf
Hugging Face's accelerate: https://huggingface.co/docs/accelerate/index
00:00 - Introduction
01:52 - Initial setup
02:43 - Main libraries
04:04 - Compute specifications
04:30 - Using the accelerate library
06:33 - Using GPU, CPU and Disk to load the model
07:52 - Loading the model
08:10 - llm.int8() quantization
09:24 - CPU offloading
10:25 - Running inference
10:55 - Running on colab
11:35 - Wrap up
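The llm.int8() chapter builds on absmax int8 quantization: each weight row is scaled so its largest magnitude maps to 127, stored as int8, and rescaled at use time. The NumPy sketch below is my own illustration of that round-trip, not code from the video, and it omits the paper's second ingredient (keeping outlier feature dimensions in fp16):

```python
import numpy as np

# Illustrative row-wise absmax int8 quantization, the building block behind
# llm.int8(): scale each row so its largest magnitude maps to 127, round to
# int8, and dequantize by dividing the scale back out.
def quantize_rowwise(x: np.ndarray):
    scale = 127.0 / np.max(np.abs(x), axis=1, keepdims=True)  # assumes no all-zero rows
    return np.round(x * scale).astype(np.int8), scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) / scale

x = np.array([[0.5, -1.0, 0.25]], dtype=np.float32)
q, scale = quantize_rowwise(x)
x_hat = dequantize(q, scale)  # close to x, up to small rounding error
```

Values at the row's absmax (here -1.0) survive exactly; the rest pick up rounding error of at most half a quantization step.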
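The loading, quantization, offloading, and inference chapters can be sketched in a few lines of Hugging Face code. This is a minimal sketch under stated assumptions, not the video's exact notebook: the "huggyllama/llama-7b" checkpoint name, the "offload" folder, and the 3GiB/5GiB memory caps are all my assumptions. It requires transformers, accelerate, and bitsandbytes:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Cap how much of each device the model may use; layers that exceed a cap
# spill to the next device in line (GPU -> CPU -> disk), placed by accelerate.
max_memory = {0: "3GiB", "cpu": "5GiB"}  # assumed caps for a 4GB GPU / 6GB RAM laptop

if torch.cuda.is_available():  # the tutorial assumes a GPU is present
    tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-7b")  # assumed checkpoint
    model = AutoModelForCausalLM.from_pretrained(
        "huggyllama/llama-7b",
        device_map="auto",              # accelerate splits layers per max_memory
        max_memory=max_memory,
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # llm.int8()
        offload_folder="offload",       # layers that fit nowhere else go to disk
    )
    inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```

With `device_map="auto"`, accelerate measures each layer and fills the GPU first, then CPU RAM, then the offload folder, which is what makes a 7B-parameter model fit on this hardware at all.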
DeepCamp AI