Presenting... Determined AI!

Connor Shorten · Advanced ·🧬 Deep Learning ·5y ago

Key Takeaways

The video presents Determined AI, a platform for deep learning experimentation, and walks through its features, including hyperparameter optimization, cluster sharing, and resource management, for models written in TensorFlow, PyTorch, or Keras. The platform is shown handling large-scale experimentation, experiment tracking, and visualization, with a web UI for cluster management and resource sharing.

Full Transcript

Thank you so much for watching Henry AI Labs. On this YouTube channel we've reviewed several deep learning research papers. So after processing all this, do you have an idea for a new transformer neural network? What about a technique to improve GANs or Q-learning? Maybe you want to try out deep learning with a new dataset. Well, if you're looking to get into deep learning research, you'll need to have some serious engineering power behind your experiments. Without a platform like Determined AI, setting up large-scale deep learning experiments can be a serious headache. You want to be testing out ideas and writing papers, not debugging distributed training code, facing obscure errors, or having your entire experiment crash altogether.

So let's take a step back and talk about what you need to do to run a deep learning experiment and present a new idea in deep learning. At the time of recording this video, the "two transformers" GAN (TransGAN) is making a lot of noise. It is an exciting new approach to the generative adversarial network framework. The idea is to use an all-attention generator and then a vision transformer architecture, like the one in the 16-by-16 image patches paper from Google: they're using the vision transformer for the discriminator and a transformer in the generator. So it's an exciting idea for integrating transformers into the GAN framework. Maybe viewers of this video could come up with ideas like this, and they just need the proper coding frameworks in order to test them out.

So what does it require to test out these ideas? Well, in order to say that this two-transformer GAN is better than a generative adversarial network baseline, say DCGAN with deep convolutional architectures in the generator and the discriminator, you need to test those two different models. The next thing to do would be to ablate all the different hyperparameters of the transformer GAN. When you have a vision transformer, it's splitting the image up into patches, so there is this hyperparameter of the patch size: you might split the image into 8 by 8 patches, 16 by 16 patches, or 32 by 32 patches. Then, further, what happens with model configuration: having more layers, or having larger intermediate features, like the width of the feature maps in the intermediate layers of deep neural networks? There are also things like batch size, learning rate, and augmentation strength. There are tons of hyperparameters that are going to influence the performance of a deep learning system, and it is important to have some kind of framework to search through these hyperparameters in order to report your new ideas in deep learning.

So to come back to this idea: imagine you've come up with a variant of the layer normalization used in transformer blocks, or you have some idea for structuring a multi-task learning curriculum. Well, the performance of deep learning models is somewhat shrouded in mystery. Many parameters of the learning algorithm itself, known as hyperparameters, can heavily influence training. These include the learning rate, batch size, model size, optimizer choice, and augmentation strength, to name a few. Before you can declare that your new algorithm is effective and present your results, you need to search through these values to make sure you've properly tested the old versus the new. This can also be useful for squeezing out that extra bit of performance if you're doing some kind of machine learning competition and you want the highest accuracy or AUC score. Coding and organizing a hyperparameter sweep system from scratch is really at least a couple of months' work; make that a few months if you're working with a team and sharing computing resources. So don't waste your time, and check out Determined AI.

In the rest of this video I'll be explaining why I'm so excited about Determined AI. As a deep learning researcher myself, I've had to come up with a solution to this problem of hyperparameter optimization as well. One of my favorite algorithms for this is Hyperband. Hyperband presents a strategy to randomly allocate computing resources to different hyperparameter configurations. This is an angle on hyperparameter search that surprisingly hadn't really been considered before. About two years ago, I listened to Liam Li explain cutting-edge research on random search and reproducibility for neural architecture search at ICML 2019; I've linked the slides for this talk in the description of the video. Well, today Liam, alongside many other superstar researchers in hyperparameter optimization, is working at Determined AI to build this platform with all the cutting-edge bells and whistles presented in hyperparameter optimization research. So as someone who really values this deep learning research, I really trust this team to build out this infrastructure and implement these hyperparameter optimizations, so that we can quickly test out ideas with algorithms like Hyperband, Bayesian optimization, or reinforcement learning and evolutionary search algorithms, and then all these other things like hierarchical neural architecture search, which looks at different ways of parameterizing the search space itself. So I'm really excited about this team, and I really believe in their ability to build such a platform.

As a final note before getting into a preview of the Determined AI platform: Henry AI Labs will be making videos that walk through the Determined AI examples, similar to the Keras Code Examples series, to help you get started with this as soon as possible. The rest of this video is about providing a little more information about Determined AI and some walkthroughs of the platform itself.

[Music]

We're going to start this off with two images that I think explain the Determined platform really well. These two images are commonly found: this one is from the Determined AI documentation, and a lot of their blog posts have this picture. I took it from the "Determined AI Intro" video on the Determined AI YouTube channel, which is linked in the description of this video. In orange is what's covered by the Determined training platform, and the other boxes are things that generally live in the deep learning and machine learning ecosystem and are also relevant for developing these pipelines. First you have your data preparation; this is where your data lives, like your S3 buckets and so on. On the bottom here are your runtimes: you could have AWS, with EC2 or spot instances; you could have GCP (Google Cloud); or you could have your own local cluster running on Kubernetes, or your own local machine, like your own data science workstation. All these different runtimes can be managed with the Determined AI platform. Model deployment is also not currently a part of the Determined AI stack.

So now let's talk about what is contained in each of these orange boxes, the first of which is cluster sharing and resource management. I've attended a Lunch and Learn with Determined AI, which is their outreach program for teaching people how to use the platform and implement cutting-edge ideas. In this case it was DETR, the object detection system with transformers from Facebook: they implemented it and showed how to run large-scale experimentation with it on the Determined AI platform. In this example they managed 128 GPUs, so they're showing you how to manage this massive computing system with several different people going through the Lunch and Learn course. That's another one of the videos on the Determined AI YouTube channel, the Lunch and Learn on DETR, which I highly recommend having a look at if you're curious about what kind of content is currently on the channel. Cluster sharing and resource management is super useful if you're working with a team, say you're trying to work with 16 GPUs and each team member uses some number of GPUs for their experiment. It really abstracts this away and, with the web UI and everything, makes it a lot easier to facilitate cluster sharing and resource management.

The next offering is distributed training itself. In the current versions of TensorFlow, and running Keras with the TensorFlow backend, you may be used to things like tf.distribute.Strategy, the MirroredStrategy, and that kind of syntax. Determined AI is going to have an API to handle distributed training for you in a similar way to these one-liners, which handle distributed training without you really having to worry about the details of it.

Experiment tracking describes having a nice interface to visualize the results of different deep learning experiments. Say you have a few different model configurations, where the search space really isn't as large as in something like hyperparameter search.

The next thing we'll get into is the hyperparameter search that's offered, and we're even going to show you a quick look at the CIFAR-10 example they provide at the end of this video, with the web UI, to show you what that looks like. They implement Hyperband, this random resource allocation, right off the shelf with your hyperparameter search, and it results in a really interesting API that you can use for doing this kind of search. We'll look at exactly how you define the model and how you set up your configuration.yaml file, which is very similar to what you saw if you watched the Keras Tuner video in the Keras Code Examples series on Henry AI Labs.

In development is neural architecture search on the Determined AI platform. Seeing the expertise of this team in the general research area of neural architecture search, and some of the exciting things that have come out of it, I expect a neural architecture search feature of its own to be an incredibly powerful feature for this kind of deep learning training platform. It's highlighted as in development and coming soon, and I just think it will be incredibly powerful and exciting to use. Finally, we also have the deep learning training data accelerator, visualization and debugging during model training, and batch inference.

Here's another picture of these different services; I think seeing it in another way will help some of these things click for you a little better. You see data storage and management, which was the data preparation column in the previous image, in this box in this image. Then we have TensorFlow, PyTorch, and Keras. With respect to managing these different runtimes and sharing computing resources with our team, we see how we can send PyTorch code, TensorFlow code, and Keras code through this Determined AI management layer, and all you have to do to monitor these different experiments is inspect the configuration.yaml file; we'll look into that more deeply when we look at the web UI itself. So again we have experiment tracking, automated hyperparameter search, cluster sharing and resource management, distributed training, model deployment optimization, and model serving, and then we have the final column with the actual model deployment services that are outside the scope of Determined AI.

The first question you might have is: what do I use Determined AI with? Well, it can run on AWS EC2. It also works with spot instances, which are a cheaper way of using these EC2 instances, but you might get timed out. So it's kind of like Google Colab, where you have these lower-cost resources. I'm not sure exactly how well that works, because I don't personally have much experience with spot instances, but it's something you can use Determined AI with. You can also use it with the Google Cloud compute platform, and with a personal workstation, or, if you have a cluster running on Kubernetes, you can interface all of these with Determined AI. The way this works is that you have your compute, like AWS, and on your local machine you write the model definition and the configuration for the experiment, and you send that to the cloud computing service through the Determined AI interface. So you write this code locally and then send it off to the computing runtime to run the experiments.

Now let's dive deeper into what makes me personally so excited about the Determined AI platform, which is AutoML and hyperparameter optimization. Again, when we're talking about hyperparameter optimization, we're looking to find the best-performing values of the learning rate; the batch size, whether a batch size of 32, 256, 512, and so on; model configurations, say eight layers versus four layers; the hidden dimension, say hidden feature vectors of 512 dimensions compared to 1024 or 256; and then the data augmentation strength and all the other hyperparameters of deep neural network training. Again, deep neural network performance is really sensitive to these hyperparameters, and it's really important to search through them to find the best performance, report proper baselines, compare models properly, and so on.

Neural architecture search is another really exciting idea. It's a bit more ambitious than hyperparameter optimization generally: you're trying to find the optimal computation blocks to stack together to form these deep neural networks. Say you have the GPT transformer decoder model: it's built by having one transformer decoder block stacked on top of itself several times. In neural architecture search, deciding how many times to stack these blocks, and the overall configuration of stacking blocks together, is known as macro architecture search, whereas micro architecture search is more fine-grained: trying to find the optimal block that gets stacked on top of itself several times. This has a really large search space, so it definitely requires tools like Hyperband, Bayesian optimization, evolutionary search, and so on in order to find an optimal neural architecture purely from search algorithms.

So without further ado, here's a preview of what's going to follow on Henry AI Labs as we walk through these experiments. This is one of the experiments provided in the Determined AI documentation: the CIFAR-10 PyTorch adaptive search. This is the web UI for hyperparameter search. We can see the search space quickly by clicking on "View Configuration". This is what makes working with TensorFlow, PyTorch, and Keras so easy: it's all unified in the config.yaml files, and we just click on "View Configuration" in our Determined AI interface to see the hyperparameters we're searching through. In this case we see the learning rate, with the scale to search through; the dropout parameters for layers 1, 2, and 3; the global batch size; and the learning rate decay. The min, the max, and the step count for these hyperparameters are all configured in this configuration file that the Determined AI platform uses to run these experiments.

Here we see how we've run several different experiments with these hyperparameter configurations for our CIFAR-10 classifier. This is Hyperband in action: they haven't all been trained for the same amount of training time. Looking at the duration, some models have only been trained on 2941 batches, for five minutes of training time, but their validation metric is so high compared to the other experiments that, well, why bother continuing to train this configuration? It's not going to perform as well as the others anyway. So this is a quick preview of the hyperparameter search interface: we have the configuration file, the checkpointing, the best validation loss, and the curves over time for our trial 41, which is the one that performed the best in the end. We see how this unequal resource allocation is implemented for us across these batches, and we don't have to bother with any of this complex unequal resource allocation code ourselves, because we can rely on the Determined AI platform.

So Determined AI is implementing these tools for hyperparameter optimization. There's an extensive body of literature on AutoML and hyperparameter optimization in deep learning research. We have different search strategies like random search, grid search, Bayesian optimization, evolutionary search, reinforcement learning search, and differentiable search; probably at least five research papers have explored each of these techniques, though mostly Bayesian search. There are also cool ideas around resource allocation, like the Hyperband we just saw, which builds on the idea of early stopping: there's no sense in training a model further if it's not getting any better. And then we have ideas like hierarchical neural architecture search, which looks at different ways of parameterizing these black-box search spaces. I'm just showing this slide quickly to give an overview: there are all these different tools being developed in the hyperparameter optimization literature, and it's a very active branch of research for deep learning researchers to pursue. All of this is being implemented and integrated in the Determined AI platform. So if you're experimenting with GANs or Q-learning or transformer designs, you don't need to bother with this code yourself. It's abstracting this away and offering it to you as a service, already implemented, and it will give you a serious boost in performance to have this kind of tool for your experiments.
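
The early-stopping idea behind the unequal resource allocation described above can be illustrated with a small sketch. This is plain successive halving, the building block that Hyperband repeats over several brackets; it is not Determined's searcher, and everything in it (the function names, the two-hyperparameter search space, the toy loss function, and the halving factor `eta`) is a hypothetical stand-in chosen for illustration.

```python
import math
import random


def sample_config(rng):
    """Draw one random configuration from a hypothetical search space."""
    return {
        "learning_rate": 10 ** rng.uniform(-5, -1),  # log-uniform over [1e-5, 1e-1]
        "layer1_dropout": rng.uniform(0.2, 0.5),
    }


def toy_validation_loss(config, batches_trained):
    """Stand-in for real training: the loss falls as the budget grows,
    and its floor depends on how good the configuration is."""
    lr_penalty = abs(math.log10(config["learning_rate"]) + 3)  # best near 1e-3
    dropout_penalty = abs(config["layer1_dropout"] - 0.25)
    return lr_penalty + dropout_penalty + 1.0 / math.sqrt(batches_trained)


def successive_halving(num_configs=16, min_batches=100, eta=2, seed=0):
    """Train many configurations briefly, keep the best 1/eta of them,
    multiply the budget of the survivors by eta, and repeat."""
    rng = random.Random(seed)
    survivors = [sample_config(rng) for _ in range(num_configs)]
    budget = min_batches
    history = []  # one (number of trials, batches per trial) pair per round
    while True:
        history.append((len(survivors), budget))
        ranked = sorted(survivors, key=lambda c: toy_validation_loss(c, budget))
        if len(ranked) == 1:
            return ranked[0], history
        survivors = ranked[: max(1, len(ranked) // eta)]
        budget *= eta


best, history = successive_halving()
for num_trials, batches in history:
    print(f"{num_trials:2d} trials trained for {batches} batches each")
```

Most configurations only ever receive the smallest budget, which matches what the web UI shows: trials with poor validation metrics stop after a few thousand batches, while the best trial trains the longest.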

Before we go back to the Determined web UI to look at some of the other features, cluster management and resource sharing in particular are what I want us to focus on right now. When I was watching the Lunch and Learn DETR demonstration, it was a really great example of how a group can share these computing resources. It shows how they had this 128-GPU system, with the different tutorial participants following along and running their experiments. You can see it in this picture; it's a bad picture because I just screenshotted it from the Lunch and Learn DETR video on the Determined AI YouTube channel. It's just a quick example of how all these different experiments run on the same cluster.

Helping people interested in Determined AI get familiar with this API for defining your models, setting up your configuration files, and setting up these different hyperparameter searches is going to be the focus of the upcoming videos on Henry AI Labs, but here's a quick preview so you have a sense of it. We have this class CIFARTrial; it inherits from the PyTorchTrial object, where we have self.context, through which we access our hyperparameters, like self.context.get_hparam, and it returns this layer 3 dropout, the values of which are configured in our configuration.yaml file. This one is from the adaptive search; this one is just the constant configuration, which runs a single loop through it. You see how we just have a layer 1 dropout of 0.25, compared to defining a range of values like 0.2 to 0.5 and so on. So, just to give you a quick overview of how we do this: we define the model, and there's an API for how we're supposed to structure our code in order to send it off to the Determined system; then we have these YAML files that organize the hyperparameters really well. Don't worry about this too much, because it's going to be the focus of the upcoming tutorials, which will help you get up to speed with using this and really familiar with the syntax.

Thank you so much for watching this introduction to Determined AI, presented by Henry AI Labs. The upcoming videos will show you more examples from the Determined AI documentation and walk through setting up the model-building code, setting up the configuration files, and then sending them off to our cluster to run the experiments. I also highly recommend checking out the Determined AI YouTube channel; there are also some Medium posts linked in the description of this video. Overall, going through these examples should be very similar to the Keras Code Examples series, but in the end I think you'll get a lot more value out of this, because you need some kind of serious engineering power and some understanding of these deep learning training infrastructure systems, MLOps, this kind of idea, in order to really progress with your deep learning experimentation. So overall, thank you so much for watching. I'm really excited about this upcoming series explaining the Determined AI examples. If there's anything I left out of the video and you have a question, if you think I've made a mistake along the way, or if there's anything you want to request with respect to these Determined AI tutorials, please just ask in the comments of the YouTube video. Thanks for watching, and please subscribe to Henry AI Labs for the remainder of the Determined AI series and more deep learning and AI videos.
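
The contrast the transcript draws between a constant configuration (layer 1 dropout fixed at 0.25) and a searched range (0.2 to 0.5) can be sketched in a few lines. The dictionaries below are only loosely modeled on the fields mentioned in the video; the key names, value ranges, and the `sample` helper are assumptions for illustration and should not be read as Determined's configuration schema.

```python
import random

# Hypothetical search space: each entry says how one hyperparameter is drawn.
SEARCH_SPACE = {
    "learning_rate": {"type": "log", "base": 10, "minval": -5.0, "maxval": -1.0},
    "layer1_dropout": {"type": "double", "minval": 0.2, "maxval": 0.5},
    "global_batch_size": {"type": "categorical", "vals": [32, 256, 512]},
    "learning_rate_decay": {"type": "const", "val": 1e-6},
}

# Hypothetical constant configuration: every value is pinned, so a single
# trial runs with exactly these settings.
CONST_SPACE = {
    "learning_rate": {"type": "const", "val": 1e-4},
    "layer1_dropout": {"type": "const", "val": 0.25},
    "global_batch_size": {"type": "const", "val": 32},
    "learning_rate_decay": {"type": "const", "val": 1e-6},
}


def sample(space, rng):
    """Turn a search-space description into one concrete configuration."""
    config = {}
    for name, spec in space.items():
        kind = spec["type"]
        if kind == "const":
            config[name] = spec["val"]
        elif kind == "double":
            config[name] = rng.uniform(spec["minval"], spec["maxval"])
        elif kind == "log":
            # Sample the exponent uniformly, so values spread evenly on a log scale.
            config[name] = spec["base"] ** rng.uniform(spec["minval"], spec["maxval"])
        elif kind == "categorical":
            config[name] = rng.choice(spec["vals"])
        else:
            raise ValueError(f"unknown hyperparameter type: {kind}")
    return config


rng = random.Random(0)
print(sample(CONST_SPACE, rng))   # identical on every call
print(sample(SEARCH_SPACE, rng))  # a different draw on each call
```

Keeping the ranges in data rather than in the model code is what lets the same model definition serve both a single fixed run and a full search.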

Original Description

I'm really excited to present this video on Determined AI! Determined has been teaching me how to use their platform and showing me what they are building. I am so excited about this with advancing my Deep Learning experimentation skills and I hope you all find value out of this as well. More particularly, I think the Determined team is really onto something with their Hyperparameter Optimization tools. The remainder of this series will have a similar objective as the Keras Code Examples series. I highly recommend checking out some of the content links below to learn more about Determined.

Content Links:
This is the video that got me up and running with my first Determined Experiment: https://www.youtube.com/watch?v=htObOwwnhQk&t=8s
Liam's ICML 2019 Talk on HPO/NAS: https://slideslive.com/38917538/random-search-and-reproducibility-for-neural-architecture-search?ref=speaker-16967-latest
Determined YouTube: https://www.youtube.com/channel/UCbi8fmzzTaTBCqZZ2GbbS-A
Determined Overview (Blog Post): https://medium.com/pytorch/determined-a-batteries-included-deep-learning-training-platform-9038ce5d4e4b
Determined website: https://determined.ai/
TransGAN (mentioned briefly): https://arxiv.org/pdf/2102.07074.pdf

Chapters
0:00 Beginning
0:47 TransformerGAN
2:17 DL Experimentation for You
4:59 Determined AI - Feature Overview
9:44 Runtimes for Determined
10:40 Hyperparameter Optimization
12:15 Determined Web UI!
15:20 Resource Sharing
16:03 HPO Interface

Key Takeaways

Define models using the Determined AI API
Set up configuration files using the Determined AI API
Set up hyperparameter searches using the Determined AI API
Run experiments using the Determined AI system
Train models using the Determined AI system
Run the CIFAR-10 PyTorch adaptive search example for hyperparameter search
See Hyperband in action, with its unequal allocation of training resources across trials
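
The first few takeaways rest on one pattern from the video: the model-definition class never hard-codes a tunable value, it asks a context object for it by name. The sketch below imitates that pattern with plain Python stand-ins. `StubContext` and `ToyTrial` are hypothetical classes written for this example and do not depend on the Determined library; only the `get_hparam` method name follows what is shown in the video.

```python
class StubContext:
    """Hypothetical stand-in for the context object the platform injects."""

    def __init__(self, hparams):
        self._hparams = dict(hparams)

    def get_hparam(self, name):
        # Fail loudly when the configuration forgot a hyperparameter.
        if name not in self._hparams:
            raise KeyError(f"hyperparameter {name!r} is not defined in the config")
        return self._hparams[name]


class ToyTrial:
    """Model definition that looks up every tunable value by name."""

    def __init__(self, context):
        self.context = context

    def describe_model(self):
        # A real trial would build layers here; this sketch only reports
        # the values it would use.
        return {
            "dropout_rates": [
                self.context.get_hparam("layer1_dropout"),
                self.context.get_hparam("layer2_dropout"),
                self.context.get_hparam("layer3_dropout"),
            ],
            "learning_rate": self.context.get_hparam("learning_rate"),
        }


context = StubContext(
    {
        "layer1_dropout": 0.25,
        "layer2_dropout": 0.25,
        "layer3_dropout": 0.5,
        "learning_rate": 1e-4,
    }
)
print(ToyTrial(context).describe_model())
```

Because the class only knows hyperparameter names, a search algorithm can hand it a different context for each trial without any change to the model code.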

💡 Determined AI is a platform for deep learning experimentation that lets users run hyperparameter searches, share and manage cluster resources, track experiments, and visualize results.
