How to Simulate NBA Games in Python

Ken Jee · Beginner ·📰 AI News & Updates ·7y ago

Key Takeaways

The video demonstrates how to simulate NBA games using Python 3.6, specifically the 2017-2018 NBA Finals between Golden State and Cleveland, using Monte Carlo simulation and the NBA team game stats (2014 to 2018) data set from Kaggle. It uses pandas to load and filter the data, matplotlib to plot histograms, and Python's random module to draw the samples; numpy is imported but, as the transcript notes, not actually used in this video.

Full Transcript

What's up, Ken here from Playing Numbers. Today I'm showing you how to simulate NBA game outcomes using Python 3.6. More specifically, I'll be using a Monte Carlo simulation for this analysis.

Now, I bet you're wondering what a simulation is. In the most basic terms, a simulation is randomly sampling from a distribution. So if we randomly sample, let's say, from team points here enough times, we'll actually just recreate that distribution somewhere else. By itself, that doesn't really tell us a whole lot, but when we compare one distribution with another distribution, it gives us more to work with. So let's say we're randomly sampling from team points and opponent points: we can determine from our samples what percent of the time team points is actually higher than opponent points, and that tells us a little bit more information. If we're adding another distribution to that, or anything like that, it continues to give us more information, and the mathematical complexity of those problems increases. If we're using simulation, we don't have to worry about the mathematical complexity. We just have to run the simulation more times and we get closer and closer to that limit. So simulation is a great way to simplify problems and get really, really good results regardless.

For this example, I'm going to be using the NBA team game stats from 2014 to 2018 data set from Kaggle. The link will be in the description below. In our analysis here, I'm going to simulate the NBA Finals from the 2017-2018 season, where Golden State defeated Cleveland.

Now let's just jump straight into the code. We import these modules. The most important ones here are going to be pandas and the random module from Python. I don't actually use numpy in this video, but it's used for the more advanced version of this code that I've written on my GitHub, PlayingNumbers. In that version, the code is flexible enough to simulate any of the teams in the data set, rather than just this one example. I don't use great programming paradigms here, but I'm writing this code in a certain way to illustrate a point. We also use matplotlib to visualize the histograms, which is very, very important in simulation.

So we're just going to read in the data there. Let's take a look at the columns that we're going to use. In this analysis we're only concerned with four columns. The first is the team, so I care about Golden State and Cleveland. The next is the date: we only want the 2017-2018 season, because that's going to be most representative of what happened in the Finals, which is what we're trying to estimate. We're also looking at team points and opponent points, so that's just the number of points a team scores as a distribution and the number of points that are scored against them as a distribution.

Right here we're going to break the data into two data frames, one for Golden State and one for Cleveland. In this line right here, I just use a lambda function to filter out all games that are not from the 2017-2018 season.

Now let's take our first stab at looking at the actual total point distributions. In blue we can see Golden State's distribution of points, and in orange we can see Cleveland's distribution of points. It looks like Golden State has a slightly higher average point total per game than Cleveland does here, but both of these distributions appear to be normally distributed, which is exactly what we want in this type of simulation. We look at points against and we see that same almost normal distribution, and we also see that Cleveland, it appears, has slightly more points scored against them than Golden State.

So now we just tabulate those things into variables. We take the team point averages for Cleveland and Golden State and save those in two variables. We take the standard deviations, which are also very important. Whenever you make any type of normal distribution, you really only need two things, the first being the mean and the second being the standard deviation, so those are kind of the magic components for us actually running the simulations here. We also look at the mean and standard deviations of the opponent points against. As you can see, it looks like our quick analysis from the histogram is right: Golden State does in fact score slightly more points on average than Cleveland.

Now, just as an example before we get into the simulation code: what we're going to be doing is randomly sampling, again, from a specific distribution. This Gaussian is a normal distribution with a mean of the average number of points Golden State scores and the standard deviation of that distribution. If we run it enough times, it should reach a limit of an average of 113, and that appears to be fairly close. It might be a little lower until we run it a certain number of times and we get a very realistic distribution.

With that thought in mind, let's actually look at the first real component of our simulation code. The game sim simulates just one game, and the games sim just runs the game sim over and over again and tabulates the results of the simulations. It just keeps track of what happens in the game sim.

For the game sim, we want to simulate, one, the Golden State score and, two, the Cleveland score, and then we compare them. When we simulate a score, we take a sample from the random distribution of Golden State's points and we average that with a sample from the random distribution of the number of points that Cleveland allows. In my opinion that's a fairly good estimator, because you're looking at how those teams specifically would match up in terms of how good Golden State's offense is and how good Cleveland's defense is. And we do the exact opposite for Cleveland: we take a random sample from their total points distribution and a random sample from Golden State's defensive distribution.

Now we compare those two variables that we created. If Golden State wins this matchup we get a 1, if Cleveland wins the matchup we get a minus 1, and if it's a tie, which I know can't happen, we get a 0. I just built that in here because then we won't have any holes in the data. If you'd like to, you can go forward and create some tiebreaker criteria.

Now let's just run a couple of example games. In that scenario Golden State won; in that scenario Cleveland won. So you can see that it's not just going to be one outcome over and over again.

Now let's run a couple of actual game simulations. If we run this ten times, it appears that Golden State won seven of those times and Cleveland won three. Now let's run it a hundred times, and the more we run it, the closer to the limit we actually get. So we're at 63 percent. At a thousand, we're at 55. At ten thousand, 55.91. So it looks like we're rounding out right around 55, 56 percent. If we were to do this analysis in perpetuity, if we ran it an infinite number of times, it looks like Golden State would win between 55 and 57 percent of the time.

That tells us a lot of information. If we were interested in sports betting, for example, we might be able to evaluate whether the line is good on a certain night based on what our calculated winning percentages are. You can also use this in fantasy sports, and we can use this for all other types of analysis. But again, specifically in sports, simulation can be really, really fun and interesting. If you're so inclined, you can definitely build on this model and add in more features. You can look at specific positions, or even at the shot level, if you're really interested in getting your hands dirty.

Hopefully this is a great starting place for a lot of people. If you have any questions or comments, please leave them in the section below, and if you'd like me to keep producing content, or if you like videos like this, please subscribe and I'll try to produce more interesting videos of this nature. Thank you so much again, and have a great one.
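
The two functions described in the transcript can be sketched roughly as follows. This is a minimal sketch, not the code shown in the video: the names `game_sim` and `games_sim`, the seeded `random.Random` instance, and the rounding to whole points are assumptions made for illustration. Apart from the 113-point Golden State scoring average mentioned in the transcript, every mean and standard deviation below is a placeholder.

```python
import random

# Placeholder (mean, standard deviation) pairs. Only the 113-point Golden State
# scoring average comes from the transcript; every other number is made up.
GS_PTS = (113.0, 12.0)    # points Golden State scores
GS_OPP = (107.0, 12.0)    # points Golden State allows
CLE_PTS = (110.0, 12.0)   # points Cleveland scores
CLE_OPP = (109.0, 12.0)   # points Cleveland allows


def game_sim(rng):
    """Simulate one game: 1 if Golden State wins, -1 if Cleveland wins, 0 for a tie."""
    # Each score averages a draw from the team's scoring distribution with a
    # draw from the opponent's points-allowed distribution.
    gs_score = (rng.gauss(*GS_PTS) + rng.gauss(*CLE_OPP)) / 2
    cle_score = (rng.gauss(*CLE_PTS) + rng.gauss(*GS_OPP)) / 2
    # Rounding to whole points (an assumption) is what makes a tie possible.
    gs_score, cle_score = round(gs_score), round(cle_score)
    if gs_score > cle_score:
        return 1
    if gs_score < cle_score:
        return -1
    return 0


def games_sim(n_games, seed=None):
    """Run game_sim n_games times and return the share of each outcome."""
    rng = random.Random(seed)
    counts = {1: 0, -1: 0, 0: 0}
    for _ in range(n_games):
        counts[game_sim(rng)] += 1
    return {
        "gs_win": counts[1] / n_games,
        "cle_win": counts[-1] / n_games,
        "tie": counts[0] / n_games,
    }


if __name__ == "__main__":
    # Larger runs should settle toward a stable win share.
    for n in (10, 100, 1_000, 10_000):
        print(n, games_sim(n, seed=42))
```

Passing a seed makes a run repeatable, which helps when comparing results across different numbers of games. With the real season averages substituted for the placeholders, the shares should be closer to the figures quoted in the transcript.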

Original Description

In this video I show you how to simulate NBA Games using Python 3.6. As an example I simulate the NBA Finals from the 2017-2018 season where Golden State played Cleveland.

Data: https://www.kaggle.com/ionaskel/nba-games-stats-from-2014-to-2018
Github: https://github.com/PlayingNumbers/NBASimulator

#DataScience #SportsAnalytics #Basketball #Simulation #KenJee

⭕ Subscribe: https://www.youtube.com/c/kenjee1?sub_confirmation=1
🎙 Listen to My Podcast: https://www.youtube.com/c/KensNearestNeighborsPodcast
🕸 Check out My Website - https://kennethjee.com/
✍️ Sign up for My Newsletter - https://www.kennethjee.com/newsletter
📚 Books and Products I use - https://www.amazon.com/shop/kenjee (affiliate link)

Partners & Affiliates
🌟 365 Data Science - Courses ( 57% Annual Discount): https://365datascience.pxf.io/P0jbBY
🌟 Interview Query - https://www.interviewquery.com/?ref=kenjee

MORE DATA SCIENCE CONTENT HERE:
🐤 My Twitter - https://twitter.com/KenJee_DS
👔 LinkedIn - https://www.linkedin.com/in/kenjee/
📈 Kaggle - https://www.kaggle.com/kenjee
📑 Medium Articles - https://medium.com/@kenneth.b.jee
💻 Github - https://github.com/PlayingNumbers
🏀 My Sports Blog - https://www.playingnumbers.com

Check These Videos Out Next!
My Leaderboard Project: https://www.youtube.com/watch?v=myhoWUrSP7o&ab_channel=KenJee
66 Days of Data: https://www.youtube.com/watch?v=qV_AlRwhI3I&ab_channel=KenJee
How I Would Learn Data Science in 2021: https://www.youtube.com/watch?v=41Clrh6nv1s&ab_channel=KenJee

My Playlists
Data Science Beginners: https://www.youtube.com/playlist?list=PL2zq7klxX5ATMsmyRazei7ZXkP1GHt-vs
Project From Scratch: https://www.youtube.com/watch?v=MpF9HENQjDo&list=PL2zq7klxX5ASFejJj80ob9ZAnBHdz5O1t&ab_channel=KenJee
Kaggle Projects: https://www.youtube.com/playlist?list=PL2zq7klxX5AQXzNSLtc_LEKFPh2mAvHIO

This video teaches how to simulate NBA games using Python and Monte Carlo simulation, allowing viewers to estimate winning percentages and analyze team performance. It provides a practical example of data analysis and visualization in sports. The simulation can be used for fantasy sports analysis and can be built upon to add more features.
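
How much an estimated winning percentage can wobble at a given number of runs can be gauged with the standard error of a proportion, sqrt(p(1 - p) / n). The sketch below is an illustration rather than part of the video: it assumes the simulated games are independent, takes p = 0.56 from the 55 to 57 percent range quoted in the transcript, and uses the usual 1.96 multiplier for an approximate 95% interval. The function name `standard_error` is hypothetical.

```python
import math


def standard_error(p, n):
    """Standard error of a win share p estimated from n independent simulated games."""
    return math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    p = 0.56  # taken from the 55 to 57 percent range quoted in the transcript
    for n in (10, 100, 1_000, 10_000):
        half_width = 1.96 * standard_error(p, n)  # approximate 95% half-width
        print(f"{n:>6} runs: {p:.2f} +/- {half_width:.3f}")
```

At 10 runs the half-width is about 0.31, so seeing 70 percent there is consistent with a true value near 56 percent. At 10,000 runs it shrinks to about 0.01.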

Key Takeaways
  1. Import necessary modules
  2. Read in data from Kaggle
  3. Filter out games not from 2017-2018 season
  4. Create data frames for Golden State and Cleveland
  5. Tabulate team point averages and standard deviations into variables
  6. Randomly sample from a normal distribution to model team scores
  7. Compare the scores of Golden State and Cleveland to determine the winner
  8. Run the simulation multiple times to get a more accurate estimate of the winning percentage
💡 The Monte Carlo simulation can be used to estimate winning percentages and analyze team performance in sports, providing a valuable tool for fantasy sports analysis and decision-making.
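
Steps 1 through 5 above can be sketched with pandas as follows. This is a sketch under stated assumptions, not the code from the video: the column names `Team`, `Date`, `TeamPoints`, and `OpponentPoints`, the team codes `GSW` and `CLE`, and the October 1, 2017 cutoff are assumptions about the Kaggle file, and the rows are invented so the block runs without the download. The video filters with a lambda function, while this sketch uses a boolean mask instead.

```python
import pandas as pd

# Invented rows standing in for the Kaggle CSV; with the real file this
# would be a pd.read_csv(...) call instead.
rows = [
    {"Team": "GSW", "Date": "2017-03-01", "TeamPoints": 120, "OpponentPoints": 100},
    {"Team": "GSW", "Date": "2017-10-17", "TeamPoints": 121, "OpponentPoints": 122},
    {"Team": "GSW", "Date": "2018-01-04", "TeamPoints": 124, "OpponentPoints": 114},
    {"Team": "GSW", "Date": "2018-03-08", "TeamPoints": 110, "OpponentPoints": 107},
    {"Team": "CLE", "Date": "2017-02-11", "TeamPoints": 125, "OpponentPoints": 109},
    {"Team": "CLE", "Date": "2017-10-17", "TeamPoints": 102, "OpponentPoints": 99},
    {"Team": "CLE", "Date": "2018-01-15", "TeamPoints": 108, "OpponentPoints": 118},
    {"Team": "CLE", "Date": "2018-03-19", "TeamPoints": 124, "OpponentPoints": 117},
    {"Team": "BOS", "Date": "2018-02-02", "TeamPoints": 119, "OpponentPoints": 110},
]
df = pd.DataFrame(rows)
df["Date"] = pd.to_datetime(df["Date"])

# Assumed cutoff: games on or after this date count as the 2017-2018 season.
SEASON_START = pd.Timestamp("2017-10-01")


def season_frame(data, team):
    """Keep one team's games from the 2017-2018 season, four columns only."""
    mask = (data["Team"] == team) & (data["Date"] >= SEASON_START)
    return data.loc[mask, ["Team", "Date", "TeamPoints", "OpponentPoints"]]


def summarize(frame):
    """Mean and sample standard deviation of points scored and points allowed."""
    return {
        "mean_pts": frame["TeamPoints"].mean(),
        "sd_pts": frame["TeamPoints"].std(),      # pandas default is ddof=1
        "mean_opp": frame["OpponentPoints"].mean(),
        "sd_opp": frame["OpponentPoints"].std(),
    }


gs_df = season_frame(df, "GSW")
cle_df = season_frame(df, "CLE")
gs_stats = summarize(gs_df)
cle_stats = summarize(cle_df)

if __name__ == "__main__":
    print(gs_stats)
    print(cle_stats)
```

The four numbers per team are the inputs the sampling step needs: a mean and a standard deviation for points scored, and the same pair for points allowed.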
