Fit More and Train Faster With ZeRO via DeepSpeed and FairScale
📰 Hugging Face Blog
Use ZeRO via DeepSpeed and FairScale to fit more and train faster with large ML models
Action Steps
- Understand ZeRO (the Zero Redundancy Optimizer) and how it partitions optimizer states, gradients, and parameters across devices during ML model training
- Explore the DeepSpeed and FairScale libraries, which implement ZeRO
- Apply ZeRO to existing ML models to reduce memory usage and speed up training
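To make the last step concrete, here is a minimal sketch of a DeepSpeed ZeRO configuration. The field names follow DeepSpeed's documented JSON config schema; the specific values (batch size, stage, comm flags) are illustrative assumptions, not tuned recommendations.

```python
import json

# Hedged sketch: a minimal DeepSpeed ZeRO stage-2 config.
# Stage 2 partitions optimizer states and gradients across GPUs;
# stage 3 would additionally partition the model parameters.
ds_config = {
    "train_batch_size": 32,          # illustrative value
    "fp16": {"enabled": True},       # mixed precision for extra memory savings
    "zero_optimization": {
        "stage": 2,
        "allgather_partitions": True,
        "reduce_scatter": True,
        "overlap_comm": True,        # overlap communication with backward pass
    },
}

# Write the config so a launcher or trainer can pick it up.
with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```

With Hugging Face Transformers, a file like this is passed to the `Trainer` via `TrainingArguments(deepspeed="ds_config.json")` and the job is launched with the `deepspeed` CLI rather than plain `python`.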
Who Needs to Know This
Machine learning engineers and researchers can use this technique to train larger models and improve training throughput; data scientists can apply it when scaling models to real-world problems
Key Insight
💡 ZeRO eliminates redundant copies of optimizer states, gradients, and parameters across GPUs, making it possible to train trillion-parameter models
Share This
🚀 Train larger ML models faster with ZeRO via DeepSpeed and FairScale! 💻
DeepCamp AI