ML Interpretability: feature visualization, adversarial examples, interpretability for language models
In this video, I will introduce machine learning interpretability, a vast topic that aims to understand the inner mechanisms by which machine learning models make their predictions, in order to debug them and make them more transparent and trustworthy.
I will start by reviewing deep learning and the back-propagation algorithm, which are necessary for understanding adversarial example generation and feature visualization in computer vision classification models. In the second part, I will show how we can leverage the knowledge built in the first part of the video and apply it to …
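To make the connection concrete, here is a minimal sketch (not taken from the video) of how the input gradients computed by back-propagation can be turned into an adversarial example, in the spirit of the fast gradient sign method; the choice of a PyTorch ResNet, the function name, and the epsilon value are illustrative assumptions.

```python
# Minimal FGSM-style sketch (illustrative, not from the video): back-propagation
# gives the gradient of the loss with respect to the *input* image, and stepping
# along its sign produces an adversarial example.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1").eval()  # any pretrained classifier works

def fgsm_example(image: torch.Tensor, label: torch.Tensor, epsilon: float = 0.01) -> torch.Tensor:
    """Return a perturbed copy of `image` (shape 1x3xHxW) that raises the loss for `label`."""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()                                    # d(loss)/d(input) via back-propagation
    return (image + epsilon * image.grad.sign()).detach()
```

Feature visualization relies on the same machinery, except that the input is optimized to maximize a chosen neuron's activation rather than the classification loss.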