ML Interpretability: feature visualization, adversarial examples, interpretability for language models

Umar Jamil · Beginner · 🛡️ AI Safety & Ethics · 1y ago
In this video, I will introduce Machine Learning Interpretability, a vast topic that aims to understand the inner mechanisms by which machine learning models make their predictions, in order to debug them and make them more transparent and trustworthy. I will start by reviewing deep learning and the back-propagation algorithm, which are necessary for understanding adversarial example generation and feature visualization in computer vision classification models. In the second part, I will show how we can leverage the knowledge built in the first part of the video and apply it to …
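To give a flavor of the adversarial example generation mentioned above, here is a minimal sketch of the fast gradient sign method (FGSM) applied to a toy logistic-regression classifier. The video does not specify this exact setup; the weights, input, and epsilon below are hypothetical values chosen only to illustrate how a gradient of the loss with respect to the *input* (computed via back-propagation) can be used to flip a prediction.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """One FGSM step: x_adv = x + eps * sign(dL/dx)."""
    p = sigmoid(np.dot(w, x) + b)
    # gradient of binary cross-entropy loss w.r.t. the input x
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

# hypothetical toy classifier and input, for illustration only
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([1.0, 0.5])   # w.x + b = 1.5 > 0, classified positive
y = 1.0                    # true label

x_adv = fgsm(x, y, w, b, eps=0.9)
print(sigmoid(np.dot(w, x) + b) > 0.5)      # original prediction → True
print(sigmoid(np.dot(w, x_adv) + b) > 0.5)  # adversarial prediction → False
```

The small, structured perturbation moves the input in the direction that increases the loss, which is enough to change the classifier's decision while leaving the input visually (or numerically) close to the original.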
Watch on YouTube ↗