How to Fine-tune Mixtral 8x7B MoE on Your Own Dataset

Brev · Intermediate · 📄 Research Papers Explained · 2y ago
In this video, we show you how to fine-tune Mixtral, Mistral's 8x7B MoE (Mixture of Experts) model, on your own dataset. The walkthrough follows the same steps as our earlier video on fine-tuning Mistral 7B (standard Mistral) on your own dataset, but you'll use the Mixtral notebook linked below.

Notebook: https://github.com/brevdev/notebooks/blob/main/mixtral-finetune-own-data.ipynb
How QLoRA works: https://brev.dev/blog/how-qlora-works
Fine-tune Mixtral MoE on a HuggingFace dataset: https://youtu.be/zbKz4g100SQ
More AI/ML notebooks: https://github.com/brevdev/notebooks/
Join the Discord: https://discord.gg/NVDyv7TUgJ
Connect with me on 𝕏: https://x.com/HarperSCarroll
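The approach in the notebook is QLoRA: the base Mixtral weights are loaded in 4-bit precision and only small low-rank adapter matrices are trained on top, so the 8x7B model fits on a single large GPU. Below is a minimal sketch of that setup with the Hugging Face transformers, peft, bitsandbytes, and datasets libraries; the dataset file (my_data.jsonl with a "text" field), the target modules, and the hyperparameters are illustrative assumptions, not the exact values used in the brev.dev notebook.

```python
# Minimal QLoRA fine-tuning sketch for Mixtral 8x7B (illustrative, not the exact notebook).
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "mistralai/Mixtral-8x7B-v0.1"

# Load the base model in 4-bit NF4 precision (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Attach low-rank adapters; which modules to target is an assumption here.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)

# Tokenize your own dataset; "my_data.jsonl" is a placeholder for your file.
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
dataset = load_dataset("json", data_files="my_data.jsonl", split="train")
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=dataset.column_names,
)

# Train only the adapter weights; hyperparameters below are example values.
trainer = Trainer(
    model=model,
    train_dataset=dataset,
    args=TrainingArguments(
        output_dir="mixtral-qlora-out",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        max_steps=500,
        bf16=True,
        logging_steps=10,
    ),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

After training, only the LoRA adapter weights need to be saved; they can later be merged into (or loaded alongside) the base model for inference.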

Related AI Lessons

The ABCs of reading medical research and review papers these days
Learn to critically evaluate medical research papers by accepting nothing at face value, believing no one blindly, and checking everything
Medium · LLM
#1 DevLog Meta-research: I Got Tired of Tab Chaos While Reading Research Papers.
Learn to manage research paper tabs efficiently and apply meta-research techniques to improve productivity
Dev.to · AI
How to Set Up a Karpathy-Style Wiki for Your Research Field
Learn to set up a Karpathy-style wiki for your research field to organize and share knowledge effectively
Medium · AI
The Non-Optimality of Scientific Knowledge: Path Dependence, Lock-In, and The Local Minimum Trap
Scientific knowledge may be stuck in a local minimum, hindering optimal progress, and understanding this concept is crucial for advancing research
ArXiv · cs.AI