'How neural networks learn' - Part III: Generalization and Overfitting

arXiv Insights · Advanced · 📄 Research Papers Explained · 7y ago
In this third episode of "How neural nets learn" I dive into a range of academic research that tries to explain why neural networks generalize as well as they do. We first look at the remarkable capability of DNNs to simply memorize huge amounts of (random) data. We then see how this picture becomes more subtle when training on real data, and finally dive into some beautiful analysis from the viewpoint of information theory.

Main papers discussed in this video:
First paper on Memorization in DNNs: https://arxiv.org/abs/1611.03530
A closer look at memorization in Deep Networks: https://arxiv.org/abs/1706.05394
Opening the Black Box of Deep Neural Networks via Information: https://arxiv.org/abs/1703.00810

Other links:
Quanta Magazine blogpost on Tishby's work: https://www.quantamagazine.org/new-theory-cracks-open-the-black-box-of-deep-learning-20170921/
Tishby's lecture at Stanford: https://youtu.be/XL07WEc2TRI
Amazing lecture by Ilya Sutskever at MIT: https://youtu.be/9EN_HoEk3KY

If you want to support this channel, here is my Patreon link: https://patreon.com/ArxivInsights
You are amazing!! ;)

If you have questions you would like to discuss with me personally, you can book a 1-on-1 video call through Pensight: https://pensight.com/x/xander-steenbrugge