Batch vs Mini-Batch vs Stochastic Gradient Descent Explained | Deep Learning 9
About this lesson
In this video, we're going to talk about the different ways Gradient Descent is actually used in machine learning: Batch Gradient Descent, Stochastic Gradient Descent, and Mini-Batch Gradient Descent. The idea is the same in all three; what changes is how much data we use before updating the weights.

Batch Gradient Descent computes the gradient over the entire dataset before each update, so every update is expensive, but training is very stable and the loss curve moves smoothly. Stochastic Gradient Descent does the opposite and updates after every single data point, which makes each update cheap but the gradient estimate extremely noisy, so the loss curve jumps around. And finally, there's Mini-Batch Gradient Descent, which is the version used in most real applications: it processes the data in smaller batches, like 32 or 64 samples, so it makes progress faster than full batch and is much more stable than noisy SGD.

By the end of this video, you'll know exactly how these three differ and why mini-batch became the standard choice in machine learning.

Links to related videos:
Neural Networks: https://youtu.be/sE6OaMndGZg
BackPropagation: https://youtu.be/nAMkcgxKwfA
Activation Functions: https://youtu.be/Kz7bAbhEoyQ
Vanishing/Exploding gradients: https://youtu.be/CzNFuL_5uig
Data Normalization: https://youtu.be/W2vqsTg-rDU

📚 Welcome to the Channel! If you're passionate about learning complex concepts in the simplest way possible, you're in the right place. I create visual explanations using animations to make topics more intuitive and engaging, especially in Algorithms, AI, machine learning, and beyond.

🎥 Animations created using Manim: Manim is an open-source Python library for creating mathematical animations. Learn more or try it yourself: 🔗 https://www.manim.community

Let's Connect:
GitHub: https://github.com/ByteQuest0
Reddit: https://www.reddit.com/r/ByteQuest/
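To make the difference between the three variants concrete, here is a minimal sketch in plain Python. It is not taken from the video: the `train` function, the toy dataset, the learning rate, and the epoch count are all assumptions chosen for illustration. The only thing that changes between the three runs is `batch_size`, that is, how many samples contribute to the gradient before the weights are updated.

```python
import random


def train(xs, ys, batch_size, lr=0.1, epochs=200, seed=0):
    """Fit y = w*x + b by gradient descent on mean squared error.

    batch_size == len(xs) -> batch gradient descent
    batch_size == 1       -> stochastic gradient descent
    anything in between   -> mini-batch gradient descent
    """
    rng = random.Random(seed)
    n = len(xs)
    w, b = 0.0, 0.0
    updates = 0
    indices = list(range(n))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a new order each epoch
        for start in range(0, n, batch_size):
            batch = indices[start:start + batch_size]
            # Average the gradient of (w*x + b - y)^2 over the current batch.
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            # One weight update per batch.
            w -= lr * grad_w
            b -= lr * grad_b
            updates += 1
    return w, b, updates


# Toy data (hypothetical): 128 noise-free points on the line y = 2x + 1.
xs = [i / 128 for i in range(128)]
ys = [2 * x + 1 for x in xs]

for name, size in [("batch", len(xs)), ("mini-batch", 32), ("stochastic", 1)]:
    w, b, updates = train(xs, ys, batch_size=size)
    print(f"{name:>10}: w={w:.3f} b={b:.3f} updates={updates}")
```

For the same number of epochs, the full-batch run performs one weight update per epoch, the mini-batch run performs four (128 samples / 32 per batch), and the stochastic run performs 128. That update count is the trade-off described above: fewer, more accurate steps versus many cheap, noisy ones.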
DeepCamp AI