Introduction - Learn Python for Data Science #1

Siraj Raval · Beginner · 🛠️ AI Tools & Apps · 9y ago

Key Takeaways

The video series introduces Python for data science. This first episode covers installing Python, setting up the environment, and building a gender classifier with the scikit-learn library. It also shows a short speech recognition script as an example of Python's readability.

Full Transcript

Hello world, it's Siraj, and welcome to the Learn Python for Data Science series. In this video, we're going to set up our Python environment and write a 10-line script that can classify anyone as male or female given just their body measurements.

Data science is the study of data, and a data scientist is someone who solves problems by studying data. So pretty much all science is data science: we observe, we make predictions, we test, and we update our ideas. So if we were given a data set of meteorite landings over the past 10 years, we could come up with questions that we think the data might help us solve, like what area is most likely to get hit, or how does atmospheric pressure affect meteorite trajectory? Then we could write a little code that trains a machine learning model on that data and predicts the answer. We can use an existing model, and there are a lot of them, or build our own.

Traditionally you needed a PhD for this stuff, but with the world's data doubling every 2 years and machine learning algorithms getting more powerful, anyone can become a data scientist. You just need time and motivation. If you have those two things, you'll be able to complete a bunch of data science projects and upload them to your GitHub. GitHub is the new resume. It's not about how many degrees you have, it's about what you can do. Machine learning democratizes scientific discovery. Okay, I'm talking to you. Yes, you, sitting right there. You can be a data scientist. Anyone can.

And the tool we're going to use to help us learn from our data is the Python programming language. I'm going to teach you Python, but not just by talking about syntax. You'll learn by doing. In each episode we're going to focus on a different data science project. I'll give you a coding challenge at the end that extends that project, and you'll learn Python along the way.

I'm picking Python for two reasons: it's designed for readability and it's general purpose. Check out this speech recognition app. It uses a library called Sphinx to read an
audio file, convert it to text, and print it out. That's just five lines of code, and we can still read what it's doing since every word is descriptive and compact. Now let's look at a similar app in C++. That's about 100 lines. I love, I love, so beautiful.

To build our gender classification app, there are four steps: we'll install Python, set up our environment, install our dependencies, and write the Python script.

Let's start by installing Python. If you're on a Mac or Linux machine, Python comes pre-installed. If you're on Windows, it doesn't. Yo, what the... Regardless, you'll want to download the latest version of Python, 3.5.2 as of today. On Mac, you can download the installer package and go through the necessary steps to install it. Then you'll be able to compile your scripts from terminal using the python keyword, like so. On Linux, you can download the source, then in terminal type in three commands to install it. You'll then be able to run Python scripts using the python keyword. On Windows, you can go download the installer. Make sure "Add python.exe to Path" is set to be installed on your local hard drive. Then once it's finished, you can run Python right from command line.

Now that we have Python installed, let's set up our environment. The text editor we'll be using for this course is Sublime Text, since it's super simple to use. But what about Emacs? No. Both Mac and Windows have an installer that you can use to install it. For Linux, you can install it via the apt-get package manager with these three commands. Once we have it installed, we can type our Python code in there and compile it with terminal by pointing our Python interpreter to our script. That's it. We only need terminal and our text editor to run our scripts.

So we've got our environment set up. Let's move on to installing our dependencies. Dependencies are packages that our code depends on. We call them at the top of each script we write with the import statement. Any programmer can write a package to, say, figure out who shot Harambe in a
thousand lines of code, upload it to the Python package server, then we could download it and call it with a single line of code. All code is part of a greater whole. It's all linked together in a grand chain of dependencies. It's like building a house: in order for you to be able to build the roof of a house, it'd be nice if you already had the dependencies. The Python package manager, pip, helps us install dependencies, and we'll use it right from command line. You can install pip for Python 3 using these commands for whichever operating system you're using. The only dependency we'll be using in this video to build our gender classifier is scikit-learn, a machine learning package with a bunch of pre-built models for us to use.

Dope. We have our dependencies installed and now we're ready to write our script. We'll start by importing it first, as we should for all dependencies. We're going to use a specific submodule of scikit-learn called tree that will let us build a machine learning model called a decision tree. A decision tree is like a flowchart that stores data. It asks each labeled data point it receives a yes or no question: does it contain X or not? If the answer is yes, the data moves in one direction. If the answer is no, it moves in the other. It'll build every node in the tree the more data points it receives. Then when we have a new unlabeled data point, we can feed it to the tree. It'll ask it a series of questions until it labels it. That label is our classification. The more data we train it on, the more accurate the classification.

Let's start by creating our data set programmatically. We'll write our first variable, X, as a list of lists. A variable is a value that can change, and we'll store our list of lists in it. A list is a data type in Python that can store a sequence of values. Here each value is a list itself that contains three numbers that represent the height, weight, and shoe size of a person. We'll write 11 of these, so our data set size is only 11 people. We'll write one more
variable called Y to store a list of labels. Each label is a gender and is associated with the list of body metrics in the previous list. We'll write them as strings, which is a data type used to represent text instead of numbers.

Now that we have our data set, we'll want to define a variable to store our decision tree model. Let's call it clf, short for classifier, and it'll store our decision tree classifier. We can reference our tree dependency directly by calling it here, then initialize the decision tree by calling the decision tree method on the tree object. Now that we have our tree variable, we can train it on our data set. We'll call the fit method on the classifier variable, which takes two arguments. We'll store our X and Y variables as the arguments, and the result will be stored in the updated clf variable. The fit method trains the decision tree on our data set.

Let's test it by classifying the gender of someone given a new list of body metrics. We'll create a variable called prediction to store the result and call the predict method of our decision tree to predict the gender given these three values in a list. Then we can print it out to terminal via the print command. We can run the script in terminal by saving it as demo.py and running it via the python demo.py
command.

So, to break it down: data scientists solve problems using data, and because easy-to-use machine learning libraries and abundant data are now available everywhere, you can become one. Python is a programming language for both beginners and experts and emphasizes readability. And a decision tree is a model that classifies data by creating branches for every possible outcome.

The challenge for this video is to use any three different classifiers from the scikit-learn package on this same data set, compare their results, then print the name of the best one. Sometimes you have to try a few models to see what gives you the most accurate predictions. Post your GitHub link in the comments. I'll pick a winner within one week and mention them in the next video. Please share this video if you liked it, and subscribe for more programming videos. For now, I've got to drink some Soylent, so thanks for watching.
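
The script described in the transcript can be sketched roughly as follows. This is a minimal sketch, not necessarily the exact code from the video: the 11 rows are illustrative values, the units (centimetres, kilograms, EU shoe size) are assumptions, and random_state=0 is added only so that repeated runs behave the same way.

```python
from sklearn import tree

# Each row is [height, weight, shoe size] for one person.
# Illustrative values; units (cm, kg, EU size) are an assumption.
X = [[181, 80, 44], [177, 70, 43], [160, 60, 38], [154, 54, 37],
     [166, 65, 40], [190, 90, 47], [175, 64, 39], [177, 70, 40],
     [159, 55, 37], [171, 75, 42], [181, 85, 43]]

# One label per row of X, in the same order.
Y = ["male", "male", "female", "female", "male", "male",
     "female", "female", "female", "male", "male"]

# Create the decision tree model and train it on the data set.
clf = tree.DecisionTreeClassifier(random_state=0)
clf = clf.fit(X, Y)

# Classify a new, unlabeled person.
prediction = clf.predict([[190, 70, 43]])
print(prediction)
```

Saved as demo.py, this would be run with python demo.py, assuming scikit-learn has already been installed with pip.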

Original Description

Welcome to the 1st Episode of Learn Python for Data Science! This series will teach you Python and Data Science at the same time! In this video we install Python and our text editor (Sublime Text), then build a gender classifier using the scikit-learn library in just about 10 lines of code. Please subscribe & share this video if you liked it!

The code for this video is here: https://github.com/llSourcell/gender_classification_challenge
I created a Slack channel for us, sign up here: https://wizards.herokuapp.com/
Download Python here: https://www.python.org/downloads/
Download Sublime Text here: https://www.sublimetext.com/3
Some great simple scikit-learn examples here: https://github.com/chribsen/simple-machine-learning-examples
The official scikit website: http://scikit-learn.org/
Highly recommend this online book as supplementary reading material: https://learnpythonthehardway.org/book/

Wondering when to use which model? This chart helps, but keep in mind deep neural nets outperform pretty much any model given enough data and computing power, so use these when you don't have access to loads of data and compute: http://scikit-learn.org/stable/tutorial/machine_learning_map/

Thank you guys for watching! Subscribe, like, and comment! That's what keeps me going. Feel free to support me on Patreon: https://www.patreon.com/user?u=3191693

Follow me:
Twitter: https://twitter.com/sirajraval
Facebook: https://www.facebook.com/sirajology
Instagram: https://www.instagram.com/sirajraval/

Signup for my newsletter for exciting updates in the field of AI: https://goo.gl/FZzJ5w
Hit the Join button above to sign up to become a member of my channel for access to exclusive content!
Join my AI community: http://chatgptschool.io/
Sign up for my AI Sports betting Bot, WagerGPT! (500 spots available): https://www.wagergpt.xyz

Playlist

Uploads from Siraj Raval · 39 of 60

1 What is Bitcoin?
2 5 Ways to Use Bitcoin
3 BTC Fever - Siraj [Music Video]
4 5 Reasons to Build Decentralized Apps
5 The Interplanetary File System
6 How to Build a Dapp in 3 min
7 Life Before Smartphones
8 4 Ways to Use Smart Contracts
9 3 Dapps You HAVE to See
10 Char's Life as a BitTorrent Engineer
11 4 Reasons AlphaGo is a Huge Deal
12 Build a Neural Net in 4 Minutes
13 Sentiment Analysis in 4 Minutes
14 The Hackathon Life
15 Your First ML App - Machine Learning for Hackers #1
16 Build an AI Composer - Machine Learning for Hackers #2
17 Build a Game AI - Machine Learning for Hackers #3
18 Build a Movie Recommender - Machine Learning for Hackers #4
19 Build an AI Artist - Machine Learning for Hackers #5
20 Build a Chatbot - ML for Hackers #6
21 Build an AI Reader - Machine Learning for Hackers #7
22 Build an AI Writer - Machine Learning for Hackers #8
23 Build a Chatbot w/ an API - ML for Hackers #9
24 One-Shot Learning - Fresh Machine Learning #1
25 Generative Adversarial Nets - Fresh Machine Learning #2
26 Tone Analysis - Fresh Machine Learning #3
27 Generate Rap Lyrics - Fresh Machine Learning #4
28 Build an Autoencoder in 5 Min - Fresh Machine Learning #5
29 Build a Self Driving Car in 5 Min - Fresh Machine Learning #6
30 Build an Antivirus in 5 Min - Fresh Machine Learning #7
31 TensorFlow in 5 Minutes (tutorial)
32 Build a Recurrent Neural Net in 5 Min
33 Build a Simulation in 5 Min
34 Build a TensorFlow Image Classifier in 5 Min
35 Tensorboard Explained in 5 Min
36 Generate Music in TensorFlow
37 Build a Game Bot (LIVE)
38 Deep Learning Frameworks Compared
39 Introduction - Learn Python for Data Science #1
40 Build a Neural Network (LIVE)
41 Twitter Sentiment Analysis - Learn Python for Data Science #2
42 Recommendation Systems - Learn Python for Data Science #3
43 Predicting Stock Prices - Learn Python for Data Science #4
44 Pong Neural Network (LIVE)
45 Deep Dream in TensorFlow - Learn Python for Data Science #5
46 Visualizing Data with D3.js (LIVE)
47 Genetic Algorithms - Learn Python for Data Science #6
48 Enter Siraj [Music Video]
49 Build a Web Scraper (LIVE)
50 Why is P vs NP Important?
51 How to Make a Neural Network (LIVE)
52 How to Make an Amazing Tensorflow Chatbot Easily
53 How to Make an Amazing Video Game Bot Easily
54 How to Make a Tensorflow Neural Network (LIVE)
55 How to Make a Simple Tensorflow Speech Recognizer
56 Joel Shor - Really Quick Questions with an Awesome Google Engineer
57 How to Make a Path Planning Algorithm Easily (LIVE)
58 The Best Way to Prepare a Dataset Easily
59 Catherine Olsson - Really Quick Questions with an OpenAI Engineer
60 How to Make a Tic Tac Toe Neural Network Easily (LIVE)

This video series teaches Python and data science, starting with the installation of Python and setting up the environment. It then builds a gender classifier using the scikit-learn library. The series covers machine learning concepts such as decision tree classification and data analysis.
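
Decision tree classification, mentioned above, can be pictured as a flowchart of yes/no questions. The following is a minimal hand-written sketch of that idea: the function name classify and every threshold in it are invented for illustration, whereas a trained model would learn its own questions from the data.

```python
def classify(height_cm, weight_kg, shoe_size):
    """Walk a tiny hand-written decision tree and return a label.

    All thresholds here are hypothetical; they only illustrate how each
    yes/no answer sends a data point down one branch or the other.
    """
    if shoe_size <= 40:          # question 1: small shoe size?
        if height_cm <= 165:     # question 2: shorter than 165 cm?
            return "female"
        # question 3: lighter than 67 kg?
        return "female" if weight_kg <= 67 else "male"
    return "male"


print(classify(190, 90, 47))
print(classify(154, 54, 37))
```

Each if statement plays the role of one node in the tree, and each return statement is a leaf that carries the final label.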

Key Takeaways
  1. Install Python
  2. Set up the environment
  3. Install dependencies using pip
  4. Import dependencies using import statement
  5. Create a dataset programmatically
  6. Define a decision tree model
  7. Train the decision tree on the dataset
  8. Compare results of different classifiers
  9. Select the best classifier for accurate predictions
💡 The video demonstrates how to use Python and the scikit-learn library to build a simple machine learning model for gender classification, highlighting Python's ease of use and readability.
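
Steps 8 and 9 above, comparing classifiers and selecting the best one, can be sketched as follows. This is one possible approach, not the video's solution: the three classifiers, the illustrative data set, and the use of accuracy on the training data itself are all assumptions. With only 11 samples and no held-out test set, these scores are optimistic and should be read as a rough comparison only.

```python
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Illustrative data: [height, weight, shoe size] per person (assumed units).
X = [[181, 80, 44], [177, 70, 43], [160, 60, 38], [154, 54, 37],
     [166, 65, 40], [190, 90, 47], [175, 64, 39], [177, 70, 40],
     [159, 55, 37], [171, 75, 42], [181, 85, 43]]
Y = ["male", "male", "female", "female", "male", "male",
     "female", "female", "female", "male", "male"]

# Three classifiers chosen for illustration; any others would do.
classifiers = {
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "KNeighbors": KNeighborsClassifier(n_neighbors=3),
    "GaussianNB": GaussianNB(),
}

# Train each model, then score it on the same data it was trained on.
scores = {}
for name, model in classifiers.items():
    model.fit(X, Y)
    scores[name] = accuracy_score(Y, model.predict(X))

# Pick the name with the highest accuracy (first one wins on a tie).
best = max(scores, key=scores.get)
print(scores)
print("Best classifier:", best)
```

A fairer comparison would score each model on data it has not seen, for example by holding out part of a larger data set.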
