Training at Scale
Train large models with mixed precision, gradient checkpointing, and distributed strategies.
After this skill you can…
- Use FP16/BF16 mixed precision training
- Apply gradient accumulation for large batches
- Set up DDP and FSDP on multi-GPU clusters
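The first two outcomes above can be illustrated together. The following is a minimal sketch, not a reference implementation: it assumes PyTorch is installed, runs on CPU with BF16 autocast so it needs no GPU, and uses a toy linear model with random data. The helper name `train_steps` and the `accum_steps` parameter are hypothetical names chosen for this example. With FP16 on a GPU you would typically add a loss scaler as well; BF16 usually does not need one.

```python
import torch
from torch import nn


def train_steps(model, optimizer, batches, accum_steps=4, device_type="cpu"):
    """Run micro-batches under BF16 autocast and step the optimizer
    once every `accum_steps` micro-batches (gradient accumulation)."""
    loss_fn = nn.MSELoss()
    optimizer_steps = 0
    optimizer.zero_grad(set_to_none=True)
    for i, (x, y) in enumerate(batches, start=1):
        # Forward pass in reduced precision; parameters stay in FP32.
        with torch.autocast(device_type=device_type, dtype=torch.bfloat16):
            pred = model(x)
        # Compute the loss in FP32 and divide by accum_steps so the summed
        # gradients match the average over one large batch.
        loss = loss_fn(pred.float(), y) / accum_steps
        loss.backward()  # gradients add up in .grad across micro-batches
        if i % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad(set_to_none=True)
            optimizer_steps += 1
    return optimizer_steps


torch.manual_seed(0)
model = nn.Linear(8, 1)  # toy model, for illustration only
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# 8 micro-batches of 16 samples: effective batch size 64 per optimizer step.
batches = [(torch.randn(16, 8), torch.randn(16, 1)) for _ in range(8)]
steps = train_steps(model, optimizer, batches, accum_steps=4)
print("optimizer steps:", steps)
```

Eight micro-batches with `accum_steps=4` produce two optimizer steps, each equivalent to a batch of 64 samples. For multi-GPU training, the same loop body is usually kept and the model is wrapped in `torch.nn.parallel.DistributedDataParallel` or FSDP, which requires a process launcher and is not shown here.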
Prerequisites
Watch (10 videos)
Lightning Talk: Optimized PyTorch Inference on aarch64 Linux CPUs - Sunita Nadampalli, Amazon (AWS)
→ Optimize PyTorch inference on aarch64 Linux CPUs
→ Use Arm compute library for optimization
DeepSpeed: Efficient Training Scalability for Deep Learning - Tunji Ruwase, Snowflake
→ Train large-scale deep learning models
→ Optimize compiler for efficiency
Keras 3 Distributed Training: Scaling Models with JAX using DataParallel, and ModelParallel
→ Train large deep learning models
→ Use Keras 3 for distributed training
Optimize PyTorch: Build and Accelerate Layers
→ Apply optimizations like mixed precision
→ Boost training throughput
Pushing the Performance Envelope: An Optimization Study for 3... Suvaditya Mukherjee & Shireen Chand
→ Train 3D generative models with PyTorch
→ Optimize performance of Variational Autoencoders
Deep Learning with PyTorch Live Course - Training Deep Neural Networks on GPUs (Part 3 of 6)
→ Train neural networks on GPUs
→ Implement GANs with PyTorch
Stock Price Prediction using GRU | Deep Learning Project in Tamil | Gated Recurrent Unit
→ Train a GRU model for stock price prediction
→ Optimize deep learning models for time series forecasting
Tips N Tricks #6: How to train multiple deep neural networks on TPUs simultaneously
→ Train multiple neural networks on TPUs
→ Optimize hyperparameters for TPU training
Enabling Efficient Trillion Parameter Scale Training for Deep Learning Models // Tunji Ruwase
→ Train deep learning models at scale
→ Optimize model training for performance
EP8: Training Models at Scale | AWS for AI Podcast
→ Scale AI model training
→ Optimize AI infrastructure for large models
DeepCamp AI