Genetic Algorithms - Learn Python for Data Science #6

Siraj Raval · Beginner ·🧠 Large Language Models ·9y ago

Key Takeaways

The video demonstrates how to use genetic algorithms and the TPOT library in Python to build a gamma radiation classifier, automating the selection of the best machine learning model and hyperparameters. TPOT uses genetic programming to optimize the machine learning pipeline, and the video shows how to use it to train a model that classifies whether a measured signal is gamma radiation.

Full Transcript

Hello world, it's Siraj. In this video, we're going to use genetic programming to identify if some energy is gamma radiation or not. I'm getting angry. Gamma ray now, I wish.

Data science is a way of thinking about discovery. A data scientist needs to decide the right question to ask, like "Who's the best candidate to vote for in the US election?", then decide what data set to use, like the tweet history of candidates and past endorsements of each candidate, and lastly decide what machine learning model to use on the data to discover the right answer. Life goes on. With the right data, computing power, and machine learning model, you can discover a solution to any problem, but knowing which model to use can be challenging for new data scientists. There are so many of them. That's where genetic programming can help.

Genetic algorithms are inspired by the Darwinian process of natural selection, and they're used to generate solutions to optimization and search problems. They have three properties: selection, crossover, and mutation. You have a population of possible solutions to a given problem and a fitness function. Every iteration, we evaluate how fit each solution is with our fitness function. Then we select the fittest ones and perform crossover to create a new population. We take those children and mutate them with some random modification, and repeat the process until we get the fittest, or best, solution.

So take this problem, for instance. Let's say you want to take a road trip across a bunch of cities. What's the shortest possible path you could take to hit up each city once and then return back to your home city? This is popularly called the traveling salesman problem in computer science, and we can use a genetic algorithm to help us solve it. Let's look at some high-level Python code. We have the number of generations set to 5,000 and the population size set to 100. So we start by initializing our population using our size parameter. Each individual in our population represents a different solution path. Then, for each generation, we compute the fitness of each solution and store it in our population fitness array. Now we'll perform selection by only taking the top 10% of the population, which are our shortest road trips, and produce offspring from them by performing crossover, then mutate those offspring randomly and repeat the process. As you can see in the animation, eventually we will get an optimal solution using this process, unlike Apple Maps.

All right, so how does this all fit into data science? Well, it turns out that choosing the right machine learning model and all the best hyperparameters for that model is itself an optimization problem. We're going to use a Python library called TPOT, built on top of scikit-learn, that uses genetic programming to optimize our machine learning pipeline. So after formatting our data properly, we need to know what features to input to our model and how we should construct those features. Once we have those features, we'll input them into our model to train on, and we'll want to tune our hyperparameters, or tuning knobs, to get the optimal results. Instead of doing this all ourselves through trial and error, TPOT automates these steps for us with genetic programming, and it'll output the optimal code for us when it's done so we can use it later.

So we're going to create a classifier for gamma radiation using TPOT after installing our dependencies, and then analyze the results. TPOT is built on the popular scikit-learn machine learning library, so we'll want to make sure that we have that installed first. Then we'll install pandas to help us analyze our data and NumPy to perform math calculations.

Our first step is to load our data set. We use pandas' read_csv method and set the parameter to the name of our saved CSV file. This is data collected from a scientific instrument called a Cherenkov telescope that measures radiation in the atmosphere, and these are a bunch of features of whatever type of radiation it picks up. Thanks, Putin. Since the class object is already organized, we'll shuffle our data to get a better result. The iloc function of the telescope variable is pandas' way of getting the position in the index, and we'll generate a sequence of random indices the size of our data using the permutation function of NumPy's random submodule. Since all the instances are now randomly rearranged, we'll just reset all the indices so they are ordered, even though the data is now shuffled, using the reset_index method of pandas with the drop parameter set to True. We'll now let our tele variable know what our two classes are by mapping both of them to an integer with the map method, so g, or gamma, is set to zero and h, or hadron, is set to one. Let's store those class labels, which we're going to predict, in a separate variable called tele_class, and use the values attribute to retrieve them.

Before we train our model, we need to split our data into training and validation sets. We use the train_test_split method of scikit-learn that we imported to create the indices for both. The parameters will be the size of our data set. We want both sets to be arrays, so we'll set the stratify parameter to our array type. Then we'll define what percent of our data we want to be training and testing with these last two parameters. We have a 75/25 split now in our data, and we're ready to train our model.

We'll initialize the tpot variable using the TPOT class with the number of generations set to five. On a standard laptop with four gigs of RAM, it takes 5 minutes per generation to run, so this will take about 25 minutes. This is so TPOT's genetic algorithm knows how many iterations to run for, and we set verbosity to two, which just means show a progress bar in the terminal during the optimization process. Then we can call our fit method on our training data to let it perform optimization using genetic programming. The first parameter is the training feature set, which we'll retrieve from our tele variable along the first axis for every training index. The second parameter is our training class set, which we'll retrieve from our tele variable like so. We can compute the testing error for validation using TPOT's score method, with the validation feature set as the first parameter and the validation class set as the second. We'll export the computed Python code to the pipeline.py file using this method and name it in the parameter as a string.

Let's demo this thing. After training, we'll see that after five generations TPOT chose the gradient boosting classifier as the most accurate machine learning model to use. It also chose the optimal hyperparameters, like the learning rate and number of estimators, for us. Yeah, boy.

So, to break it down: with the right amount of data, computing power, and machine learning model, you can discover a solution to any problem. Genetic algorithms replicate evolution via selection, crossover, and mutation to find an optimal solution to a problem. And TPOT is a Python library that uses genetic programming to help you find the best model and hyperparameters for your use case.

The winner of the coding challenge from the last video is Peter Mitrano. He added some great deep dream samples to his repository and even deep dreamed my own video. Badass of the week. And the runner-up is Kyle Jordan. Good job stitching all the deep dreamed frames together with one line of code. The challenge for this video is to use TPOT and a climate change data set that I'll provide to predict the answer to a question you decide. This will be great practice in learning to think like a data scientist. Post your GitHub link in the comments and I'll announce the winner next time. For now, I've got to stay fit to reproduce, so thanks for watching.
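
The traveling salesman walkthrough in the transcript might look roughly like the sketch below. This is not the code shown in the video: the function names, the ordered-crossover and swap-mutation operators, the `seed` argument, and the default of 200 generations (the video uses 5,000) are assumptions chosen to keep the example short. It does follow the described loop: score every tour, keep the shortest 10%, breed children by crossover, mutate them, and repeat.

```python
import math
import random


def tour_length(tour, cities):
    """Total length of the closed tour (it returns to the starting city)."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def ordered_crossover(parent_a, parent_b, rng):
    """Keep a slice of parent_a in place; fill the rest in parent_b's order."""
    n = len(parent_a)
    i, j = sorted(rng.sample(range(n), 2))
    kept = parent_a[i:j + 1]
    kept_set = set(kept)
    rest = [city for city in parent_b if city not in kept_set]
    return rest[:i] + kept + rest[i:]


def mutate(tour, rng, rate=0.2):
    """With probability `rate`, swap two randomly chosen cities."""
    tour = tour[:]
    if rng.random() < rate:
        a, b = rng.sample(range(len(tour)), 2)
        tour[a], tour[b] = tour[b], tour[a]
    return tour


def evolve(cities, generations=200, pop_size=100, elite_frac=0.1, seed=0):
    """Return the shortest tour found (a heuristic, not a guaranteed optimum)."""
    rng = random.Random(seed)
    n = len(cities)
    # Each individual is one candidate path: a random ordering of city indices.
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        # Fitness: shorter tours are fitter, so sort ascending by length.
        population.sort(key=lambda t: tour_length(t, cities))
        # Selection: keep only the top fraction (10% by default).
        elite = population[: max(2, int(elite_frac * pop_size))]
        children = []
        while len(elite) + len(children) < pop_size:
            mom, dad = rng.sample(elite, 2)
            children.append(mutate(ordered_crossover(mom, dad, rng), rng))
        population = elite + children
    return min(population, key=lambda t: tour_length(t, cities))
```

Because the selected tours are carried over unchanged in this sketch, the best tour never gets worse from one generation to the next. A genetic algorithm is still a heuristic, though: it tends toward short tours without guaranteeing the shortest one.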

Original Description

In this video, we build a Gamma Radiation Classifier and use Genetic Programming to pick the best Machine Learning model + hyper-parameters FOR US in 40 lines of Python.

Challenge for this video: https://github.com/llSourcell/genetic_algorithm_challenge
Peter's winning code: https://github.com/PeterMitrano/deep_dream_challenge
Kyle's Runner up code: https://github.com/ljlabs/deep_dream_challenge/blob/master/Dream_in_video.py
Great chapter on Genetic Algorithms: http://natureofcode.com/book/chapter-9-the-evolution-of-code/
Link to TPOT: https://github.com/rhiever/tpot
Join the Wizards Slack Channel: https://wizards.herokuapp.com/

Please like + subscribe + comment!
Please support me on Patreon!: https://www.patreon.com/user?u=3191693

Follow me:
Twitter: https://twitter.com/sirajraval
Facebook: https://www.facebook.com/sirajology
Instagram: https://www.instagram.com/sirajraval/

Signup for my newsletter for exciting updates in the field of AI: https://goo.gl/FZzJ5w
Hit the Join button above to sign up to become a member of my channel for access to exclusive content!
Join my AI community: http://chatgptschool.io/
Sign up for my AI Sports betting Bot, WagerGPT! (500 spots available): https://www.wagergpt.xyz

This video teaches how to use genetic algorithms and the TPOT library in Python to automate the selection of the best machine learning model and hyperparameters for a gamma radiation classifier. The viewer will learn how to use genetic programming to optimize the machine learning pipeline and classify gamma radiation.

Key Takeaways
  1. Import the necessary libraries (TPOT, scikit-learn, pandas, NumPy)
  2. Load and preprocess the dataset
  3. Split the data into training and validation sets
  4. Initialize the TPOT object and set the number of generations
  5. Call the fit method on the training data to perform optimization using genetic programming
  6. Compute the validation score using TPOT's score method
  7. Export the computed Python code to a pipeline.py file
💡 Genetic algorithms can automate the selection of a machine learning model and its hyperparameters, which can save time and may improve accuracy.
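
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the video's exact script: the file name `telescope.csv`, the label column name `Class`, the helper names `prepare` and `run_tpot`, and the fixed `seed` are hypothetical, and `TPOTClassifier` with `verbosity` and `export` assumes a classic (pre-1.0) TPOT release. Note that `stratify` takes the class labels, so both splits keep the same proportion of gamma and hadron events.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split


def prepare(telescope, label_col="Class", seed=0):
    """Shuffle rows, encode labels, and build a stratified 75/25 split."""
    rng = np.random.RandomState(seed)
    # Shuffle by position, then renumber the rows 0..n-1.
    shuffled = telescope.iloc[rng.permutation(len(telescope))]
    tele = shuffled.reset_index(drop=True)
    # g (gamma) -> 0, h (hadron) -> 1
    tele[label_col] = tele[label_col].map({"g": 0, "h": 1})
    labels = tele[label_col].values
    # Split row positions rather than the data itself.
    train_idx, valid_idx = train_test_split(
        np.arange(len(tele)),
        stratify=labels,
        train_size=0.75,
        test_size=0.25,
        random_state=seed,
    )
    return tele, train_idx, valid_idx


def run_tpot(tele, train_idx, valid_idx, label_col="Class",
             out_path="pipeline.py"):
    """Fit TPOT, score it on the validation rows, and export the pipeline."""
    # Imported lazily so prepare() works without TPOT installed.
    # Assumption: classic TPOT API (pre-1.0 releases).
    from tpot import TPOTClassifier

    features = tele.drop(columns=label_col).values
    labels = tele[label_col].values
    model = TPOTClassifier(generations=5, verbosity=2)
    model.fit(features[train_idx], labels[train_idx])
    score = model.score(features[valid_idx], labels[valid_idx])
    model.export(out_path)  # writes the best pipeline as Python code
    return score


# Hypothetical usage ("telescope.csv" is a placeholder file name):
# tele, train_idx, valid_idx = prepare(pd.read_csv("telescope.csv"))
# print(run_tpot(tele, train_idx, valid_idx))
```

Keeping the data preparation separate from the TPOT call makes the quick steps easy to check before starting the optimization, which the video estimates at about 25 minutes for five generations.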
