How to Fine-tune Mixtral 8x7B MoE on Your Own Dataset

Brev · Intermediate · 📄 Research Papers Explained · 2y ago
In this video, we show you how to fine-tune Mixtral, Mistral's 8x7B MoE (Mixture of Experts) model, on your own dataset. The walkthrough follows the same steps as our earlier video on fine-tuning Mistral 7B (standard Mistral) on your own dataset, but you'll use the Mixtral notebook linked below.

Notebook: https://github.com/brevdev/notebooks/blob/main/mixtral-finetune-own-data.ipynb
How QLoRA works: https://brev.dev/blog/how-qlora-works
Fine-tune Mixtral MoE on a HuggingFace dataset: https://youtu.be/zbKz4g100SQ
More AI/ML notebooks: https://github.com/brevdev/notebooks/
Join the Discord: https://discord.gg/NVDyv7TUgJ
Connect with me on 𝕏: https://x.com/HarperSCarroll
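The approach in the notebook is QLoRA: the base Mixtral weights are loaded in 4-bit precision and only small low-rank adapter matrices are trained on top, so the 8x7B model fits on a single large GPU. Below is a minimal sketch of that setup with the Hugging Face transformers, peft, bitsandbytes, and datasets libraries; the dataset file (my_data.jsonl with a "text" field), the target modules, and the hyperparameters are illustrative assumptions, not the exact values used in the brev.dev notebook.

```python
# Minimal QLoRA fine-tuning sketch for Mixtral 8x7B (illustrative, not the exact notebook).
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "mistralai/Mixtral-8x7B-v0.1"

# Load the base model in 4-bit NF4 precision (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Attach low-rank adapters; which modules to target is an assumption here.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)

# Tokenize your own dataset; "my_data.jsonl" is a placeholder for your file.
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
dataset = load_dataset("json", data_files="my_data.jsonl", split="train")
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=dataset.column_names,
)

# Train only the adapter weights; hyperparameters below are example values.
trainer = Trainer(
    model=model,
    train_dataset=dataset,
    args=TrainingArguments(
        output_dir="mixtral-qlora-out",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        max_steps=500,
        bf16=True,
        logging_steps=10,
    ),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

After training, only the LoRA adapter weights need to be saved; they can later be merged into (or loaded alongside) the base model for inference.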

Related AI Lessons

The ABCs of reading medical research and review papers these days
Learn to critically evaluate medical research papers by accepting nothing at face value, believing no one blindly, and checking everything
Medium · LLM
#1 DevLog Meta-research: I Got Tired of Tab Chaos While Reading Research Papers.
Learn to manage research paper tabs efficiently and apply meta-research techniques to improve productivity
Dev.to · AI
How to Set Up a Karpathy-Style Wiki for Your Research Field
Learn to set up a Karpathy-style wiki for your research field to organize and share knowledge effectively
Medium · AI
The Non-Optimality of Scientific Knowledge: Path Dependence, Lock-In, and The Local Minimum Trap
Scientific knowledge may be stuck in a local minimum, hindering optimal progress, and understanding this concept is crucial for advancing research
ArXiv · cs.AI