Moonwalk: Inverse-Forward Differentiation
📰 ArXiv cs.AI
Moonwalk introduces Inverse-Forward Differentiation, a technique that reduces memory usage when training deep neural networks by avoiding the storage of intermediate activations during the forward pass
Action Steps
- Revisit the structure of gradient computation in backpropagation
- Identify the need to store intermediate activations as a limitation
- Apply Inverse-Forward Differentiation to compute gradients without caching those activations
- Implement Moonwalk in deep learning frameworks to enable training of deeper networks
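The steps above can be illustrated with a toy sketch. Moonwalk's full method combines forward-mode differentiation with invertible network layers; the sketch below shows only the core activation-saving idea, using hypothetical affine layers (y = a·x + b) whose inverses are known in closed form, so layer inputs can be reconstructed during the backward pass instead of being stored during the forward pass. All names and values here are illustrative, not from the paper.

```python
# Toy invertible layers: y = a*x + b, inverse x = (y - b) / a.
# Standard backprop caches every layer input; here we keep only the
# final output and recover each input via the layer inverse.

layers = [(2.0, 1.0), (0.5, -3.0), (4.0, 0.25)]  # (a, b) per layer

def forward(x):
    for a, b in layers:
        x = a * x + b
    return x  # no intermediate activations stored

def grad_via_inverses(x0):
    y = forward(x0)          # memory: O(1) in depth, just the output
    grad = 1.0               # d(output)/d(output)
    for a, b in reversed(layers):
        x_in = (y - b) / a   # reconstruct this layer's input
        grad *= a            # chain rule: dy/dx = a for an affine layer
        y = x_in             # step back one layer
    return grad

# The analytic derivative is the product of the slopes: 2.0 * 0.5 * 4.0
print(grad_via_inverses(1.5))  # -> 4.0
```

The trade is extra compute (one inverse evaluation per layer in the backward pass) for memory that no longer grows with network depth, which is what enables training deeper networks.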
Who Needs to Know This
ML researchers and engineers benefit directly, since Moonwalk enables training of deeper networks within the same memory budget; software engineers can implement the technique in deep learning frameworks
Key Insight
💡 Inverse-Forward Differentiation computes gradients without storing the intermediate activations produced during the forward pass
Share This
🚀 Moonwalk: Inverse-Forward Differentiation reduces memory usage in deep neural networks
DeepCamp AI