Data Science SQL Interview Question Walkthrough | SQL Sundays #9

Tina Huang · Beginner ·📰 AI News & Updates ·5y ago

Key Takeaways

Walks through a SQL interview question on estimating Airbnb growth

Full Transcript

hi welcome to another sql sunday where i go through a walkthrough of a real data science interview question let's get started all right so this question over here is growth of airbnb so let's read the question estimate the growth of airbnb each year using the number of hosts registered as the growth metric the rate of growth is calculated by taking the number of hosts registered in the current year minus the number of hosts registered in the previous year divided by the number of hosts registered in the previous year times 100. okay i'll put the year number of hosts in the current year number of hosts in the previous year and the rate of growth round area grew to the nearest percent and ordered the result in the ascending order based on the view assume that the data set consists only of unique hosts meaning that there are no duplicate hosts listed okay so let's actually check out the table so this is a really big table um a lot of stuff going on over here but really though like what we actually want to find ultimately is the year the number of hosts in the current year number of posts in the previous year and the growth rate so we actually don't need most of the columns over here so this is what the table looks like um it has like a bunch of stuff in it so really what we need here is just the ids and we know that a data cell only can some unique hosts so each of these ids would be one single host and we also need the year that they registered posted this column that's pretty much it yeah we actually don't need any of the other columns here even though there's a lot that's being shown okay so some of the assumptions that i have um looking at this table now and just looking at the question um so my my first assumption was answered already it was like the data sentence is only unique host okay so they are only unique hosts um [Music] the other assumption that i want to make here is because we're looking at rate of growth that means in the first year the rate of growth there would not be rate of growth because there would be no previous year so if i were an actual interview i would probably ask the interviewer like what would you like me to do about that scenario like should i just leave it as no leave it as is or should i you know get rid of that rule or do something like that so the assumption that i'll be making here is just that i will have a row um that will have some blanks in it and just leave it leave it as it is that's pretty much the only assumption that i have so before i get started and actually dive into the query i'm gonna go through my general strategy of first writing out exactly how i'm going to approach this question in english and then go and write the query let's first write down what it is that we ultimately want so we want the ear and the number of hosts i guess i call it num host current and the num hosts previous and we also want the growth rate so that's the ultimate table that we want okay so now how i would approach this question okay so first i'm gonna have a cte over here um that is gonna just get the year and the number of posts for that year um and then after that i'm gonna do a self point of this table over here and that's going to be based on so self join that's going to be on um so it's like the previous year is equal to current year minus one right um and it's this is gonna be like a left join because we want that first row as well okay and then after that i do the join and then what i would do is get the current year um and the number current num what did i call it num hosts current and the num hosts previous and i also want to get the growth rate which would be calculated by um current year so it would be like num host current minus the num host previous uh divided by the num host sees around that multiply by 100 so that would mean a growth rate and then ultimately i want to get the iran to round the growth rate two nearest percent um and order by the current year okay so this looks like the correct approach to me so i'm just going to go ahead and actually write the query oh i spelled this wrong order okay so to actually write this create quick query um i would do like from this table airbnb search details um and i want to select the year which is not called year it is host hostess it's called host since and i want to actually extract the year from that because this is like a a date um so i want to extract oops ear from post-its i'm just going to call that ear um and then i want to get the number post so that would be i think it's id i'm pretty sure it's id yeah id so and then i want to get the um count id as numbers register that year and we don't need to do like distinct over here um because it says that it's unique already okay and then we need to group by one which is the year over here uh this is going to be cte we're going to call this table registers let's call it registers as like that um and then we're going to do the joins now so from registers let's call that current year um join left join registers previous year um and we are going to be joining on it's left join previous zero is equal to occurring or minus one so oops current year dot year is equal to previous year dot here minus one no plus one wait oh no did it did the other way around oh okay i'm just going to do it the way that i wrote it although it doesn't really matter that much previous year is equal to current year dot year minus one okay so i did that and then i want to get the current year numpost current number closed previous as well as the growth rate okay um [Music] so for that i would get the current year dot year and then i want to get the current year dot num host uh we can call that what did i call it num post current and then we can get the previous year dot num host as num hosts pbs and we want to get the growth rate so that is the current year dot num hosts minus the previous year dot numbers divided by the previous year dot numbers multiply by 100 did i put a space in here no i didn't okay um and then we want to put parentheses around this and then also just multiply that by 1.00 so that it won't be integer division um and then we're going to call that um okay and then next we have to round the growth rate and an order by current year okay so to do that we can just do round over here that should do by the percent and then we want to order by the year so order by one which is the current year so let me just go over this query and make sure that i did everything correctly pull this down a little bit uh let's see so registers here okay from airbnb search details i want to do extract the year from host since the year county ids okay so i get year and number of hosts and then um i'm doing the join over here so from register's current year left my registers previous year on previous year is equal to current year minus one that makes sense and then i'm going to get the current the year of the current year and then get the number of hosts for the current year and number host for the previous year and then finally the um growth rate which is current year number of host minus previous year times 1.00 divided by the previous year um and then multiply that by 100 as the growth rate and then we round by the growth rate okay so we did that and then finally we order by the current year so ordered by one okay so that looks correct to me fingers crossed let's see if it works hey nice okay yep yep this is what it looks like just to double check look at their expected output um 200 200 2009 uh navajo's current previous year growth rate cool i mean that looks correct awesome so in terms of efficiency and optimization here uh so we do have a self join over here you know what i think we can actually also just do this with a window function yeah i think we can actually do that with a window function as well why do i actually leave that to you guys with homework write in the comments below um how you how you would approach this question using a window function and also if you think that it'll be more efficient or less efficient than the way that i've done it here using a self join so before i wrap up this video i do have an announcement my sequel for data science and tech interviews course in partnership with 365 data science has launched this week it's a step-by-step guide on how to ace the sql portion of the interview complete with 10 full mock interviews so you may actually be wondering what's the difference between that course and the sequel sundays and all the other free content i already have on this channel let me explain if you follow this free video where i outline how i pass my own fang interview and practice the sequel sundays in the style that i explained in that video you would be mostly there i mean that's how i did it myself and i didn't even have the sequel sundays or anything like that and i was clearly fine this course though is for people who are looking for a little bit more and a resource that i personally wish that i had when i was interviewing it outlines the exact steps to do to learn enough sql to pass the interviews but i think the biggest value is my guided walkthrough and coaching through 10 full mock interviews unlike the sequel sunday mock interview walkthroughs the 10 full mock interviews has an actual interviewer that asks follow questions gives clarifications and provides feedback i also walk you through how to install pg admin which is for developing and using a postgresql database and then i provide you the data and sql script that adds the interview questions data into the database so you can actually answer the interview questions yourself it's basically as close to the real interview as you can get my goal is for you to feel like you have done 10 full sql interviews before your actual interview so that your actual one will feel just like another mock and finally you will also get my direct support if you have any questions i honestly wish i could answer every single question on youtube of course but unfortunately if i did that i would literally not do anything else in my day so i will be prioritizing answering all questions from the course i've linked a coupon code in the description so you can get it for 9.99 on udemy for a limited amount of time if you're interested please do check it out and i'll see you guys in the next live stream or video

Original Description

Hi friends! Here's a SQL Sunday as part of the SQL Sundays series in celebration of the launch of my SQL for tech and data science interviews course! I know I know it's Tuesday :x Can we let this one slide? SQL for tech and data science interviews course with 10 mock interviews: https://www.udemy.com/course/sql-for-tech-and-data-science-interviews/?couponCode=DF947818F15A561AE84E ($12.99 promotion until July 18th, 2021!) [expired] https://www.udemy.com/course/sql-for-tech-and-data-science-interviews/?couponCode=A92E9C5BFB043A8FF039 ($9.99 promotion until June 6th, 2021!) discord: https://discord.gg/5mMAtprshX livestreaming google calendar: https://calendar.google.com/calendar/u/1?cid=cDBtOGgxOG1waW92bTJxYzdpZmkzNmgwODhAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ ______________________________________________________________________ Timestamps 00:00 intro 00:08 question 01:51 assumptions 02:47 answer 10:43 check answer 11:53 announcement ______________________________________________________________________ Other videos you might be interested in How to learn data science in 2021: https://www.youtube.com/watch?v=Axu4tJl8gbM The resume that got me into FAANG as a data scientist: https://www.youtube.com/watch?v=vx-x-yXXE9I ______________________________________________________________________ Subscribe: https://www.youtube.com/channel/UC2UXDak6o7rBm23k3Vv5dww/?sub_confirmation=1 ______________________________________________________________________ SQL for tech and data science interviews course with 10 mock interviews: https://www.udemy.com/course/sql-for-tech-and-data-science-interviews/?couponCode=A92E9C5BFB043A8FF039 ($9.99 promotion until June 6th, 2021!) Real SQL interview question walkthrough series: https://www.youtube.com/watch?v=Td-cmLfQ7uU&list=PLVD3APpfd1tuXrXBWAntLx4tNaONro5dA Check out StrataScratch for SQL interview prep: https://stratascratch.com/?via=tina ______________________________________________________________________ About me
Watch on YouTube ↗ (saves to browser)
Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from Tina Huang · Tina Huang · 55 of 60

1 How to choose between software engineering and data science | 5 Key Considerations
How to choose between software engineering and data science | 5 Key Considerations
Tina Huang
2 How I got Software Engineering and Data Science Internships | Computer Science Job Search Part 1
How I got Software Engineering and Data Science Internships | Computer Science Job Search Part 1
Tina Huang
3 How I Became a Data Scientist | Computer Science Job Search Part 2
How I Became a Data Scientist | Computer Science Job Search Part 2
Tina Huang
4 3rd Year Statistics,  Data Science, Computer Science Resume | Reviewing Your Resumes Ep. 1
3rd Year Statistics, Data Science, Computer Science Resume | Reviewing Your Resumes Ep. 1
Tina Huang
5 How to learn SQL for data science interview (the minimize effort maximize outcome way)
How to learn SQL for data science interview (the minimize effort maximize outcome way)
Tina Huang
6 3rd Year CS Resume (and asian drink) Review | Reviewing Your Resumes Ep. 2
3rd Year CS Resume (and asian drink) Review | Reviewing Your Resumes Ep. 2
Tina Huang
7 Are you a student? If yes, this is the best data science project for you!
Are you a student? If yes, this is the best data science project for you!
Tina Huang
8 The Ultimate Guide to Land a Data Science Internship | schedule and time management survival guide
The Ultimate Guide to Land a Data Science Internship | schedule and time management survival guide
Tina Huang
9 Upenn MCIT Program Details and Real Student Experiences - Dr. Arvind Bhusnurmath
Upenn MCIT Program Details and Real Student Experiences - Dr. Arvind Bhusnurmath
Tina Huang
10 Real Data Science SQL Interview Questions and Answers # 1 | Data Science Interview Questions
Real Data Science SQL Interview Questions and Answers # 1 | Data Science Interview Questions
Tina Huang
11 3 More Unique and Impactful Projects to get a Data Science Job
3 More Unique and Impactful Projects to get a Data Science Job
Tina Huang
12 Real Data Science SQL Interview Questions and Answers # 2 | Data Science Interview Questions
Real Data Science SQL Interview Questions and Answers # 2 | Data Science Interview Questions
Tina Huang
13 THANK YOU FOR 1000! | Proper intro | Random facts about myself
THANK YOU FOR 1000! | Proper intro | Random facts about myself
Tina Huang
14 A day in the life of a data scientist (FAANG data scientist remote)
A day in the life of a data scientist (FAANG data scientist remote)
Tina Huang
15 SQL Data Science Interview Questions and Answers (interview style walk-through) | Question 3
SQL Data Science Interview Questions and Answers (interview style walk-through) | Question 3
Tina Huang
16 Biology to Data Science (data professor's tips on how to get a data science research position)
Biology to Data Science (data professor's tips on how to get a data science research position)
Tina Huang
17 SQL Data Science Interview Questions and Answers (interview style walk-through) | SQL Sundays #4
SQL Data Science Interview Questions and Answers (interview style walk-through) | SQL Sundays #4
Tina Huang
18 Data Science SQL Interview Question Walkthrough | SQL Sundays #5
Data Science SQL Interview Question Walkthrough | SQL Sundays #5
Tina Huang
19 Data Science Resume Round-Up With @KenJee_ds  - Episode 2
Data Science Resume Round-Up With @KenJee_ds - Episode 2
Tina Huang
20 SQL Data Science Interview Question Walkthrough | SQL Sundays #6
SQL Data Science Interview Question Walkthrough | SQL Sundays #6
Tina Huang
21 Data Science vs Software Engineering Interview | 3 Key Differences
Data Science vs Software Engineering Interview | 3 Key Differences
Tina Huang
22 Data Science SQL Interview Question Walkthrough (real interview style) | SQL Sundays #7
Data Science SQL Interview Question Walkthrough (real interview style) | SQL Sundays #7
Tina Huang
23 The data science resume that got me my FAANG (MANGA?) job (entry level data scientist)
The data science resume that got me my FAANG (MANGA?) job (entry level data scientist)
Tina Huang
24 Data Science SQL Interview Question Walkthrough (real interview style) | SQL Sundays #8
Data Science SQL Interview Question Walkthrough (real interview style) | SQL Sundays #8
Tina Huang
25 Interview with a quant trader
Interview with a quant trader
Tina Huang
26 How I chose my masters degree (as an international student)
How I chose my masters degree (as an international student)
Tina Huang
27 The software engineering resume that got me into FAANG and Goldman Sachs (internship)
The software engineering resume that got me into FAANG and Goldman Sachs (internship)
Tina Huang
28 3 tips to avoid debt for a masters #SHORTS
3 tips to avoid debt for a masters #SHORTS
Tina Huang
29 A hard work day (ft. new NLP project) | FAANG data science isn't chill | vlog 1
A hard work day (ft. new NLP project) | FAANG data science isn't chill | vlog 1
Tina Huang
30 The comments sections are WILD | YouTube sentiment analysis - Data science project for beginners
The comments sections are WILD | YouTube sentiment analysis - Data science project for beginners
Tina Huang
31 Do you have what it takes to be a great data scientist?
Do you have what it takes to be a great data scientist?
Tina Huang
32 How to learn data science in 2022 (the minimize effort maximize outcome way)
How to learn data science in 2022 (the minimize effort maximize outcome way)
Tina Huang
33 A productive day as a data scientist | day in the life of a data scientist vlog #2
A productive day as a data scientist | day in the life of a data scientist vlog #2
Tina Huang
34 How to learn math for data science (the minimize effort maximize outcome way)
How to learn math for data science (the minimize effort maximize outcome way)
Tina Huang
35 Internship that made me rethink my career...(technology summer analyst at Goldman Sachs)
Internship that made me rethink my career...(technology summer analyst at Goldman Sachs)
Tina Huang
36 How to get a data science job
How to get a data science job
Tina Huang
37 cake and big sister advice 🖤
cake and big sister advice 🖤
Tina Huang
38 the most underrated data job in 2021
the most underrated data job in 2021
Tina Huang
39 My career changing computer science masters degree in 15 minutes (Upenn MCIT)
My career changing computer science masters degree in 15 minutes (Upenn MCIT)
Tina Huang
40 Data science interview tips (product and technical interviews)
Data science interview tips (product and technical interviews)
Tina Huang
41 Needed to learn javascript in 3 hours - would not recommend
Needed to learn javascript in 3 hours - would not recommend
Tina Huang
42 from management consultant to software engineer | Humans of MCIT
from management consultant to software engineer | Humans of MCIT
Tina Huang
43 Overview, Review and Study Tips - Google Data Analytics Professional Certificate
Overview, Review and Study Tips - Google Data Analytics Professional Certificate
Tina Huang
44 Overview, Review and Study Tips - Google Data Analytics Professional Certificate (condensed version)
Overview, Review and Study Tips - Google Data Analytics Professional Certificate (condensed version)
Tina Huang
45 Watch this video before applying to Georgia Tech OMSCS
Watch this video before applying to Georgia Tech OMSCS
Tina Huang
46 How to self study technical things
How to self study technical things
Tina Huang
47 What's the best certificate for data analysts? Google vs IBM Data Analyst Certificate
What's the best certificate for data analysts? Google vs IBM Data Analyst Certificate
Tina Huang
48 FAANG Data scientist reviews: Datacamp, Dataquest, 365 Data Science
FAANG Data scientist reviews: Datacamp, Dataquest, 365 Data Science
Tina Huang
49 How I would learn to code (if I could start over)
How I would learn to code (if I could start over)
Tina Huang
50 The quant trader interview guide
The quant trader interview guide
Tina Huang
51 We code a trading bot live! @jacobamaral
We code a trading bot live! @jacobamaral
Tina Huang
52 Why you should have a portfolio website
Why you should have a portfolio website
Tina Huang
53 60k cooking and Q&A (ft. Uncle Roger fried rice)
60k cooking and Q&A (ft. Uncle Roger fried rice)
Tina Huang
54 How to self study using MOOCS (Udemy, Coursera, and other online courses) | self study online
How to self study using MOOCS (Udemy, Coursera, and other online courses) | self study online
Tina Huang
Data Science SQL Interview Question Walkthrough | SQL Sundays #9
Data Science SQL Interview Question Walkthrough | SQL Sundays #9
Tina Huang
56 Watch me build my portfolio! | How to create a portfolio website
Watch me build my portfolio! | How to create a portfolio website
Tina Huang
57 How I stop myself from burning out
How I stop myself from burning out
Tina Huang
58 How I take notes - Tips for efficient note taking that speeds up learning
How I take notes - Tips for efficient note taking that speeds up learning
Tina Huang
59 How I design effective study plans for ANY SUBJECT (and stick with them) | trading, coding etc.
How I design effective study plans for ANY SUBJECT (and stick with them) | trading, coding etc.
Tina Huang
60 How I'm learning to trade (forex swing trading)
How I'm learning to trade (forex swing trading)
Tina Huang

Related AI Lessons

Chapters (6)

intro
0:08 question
1:51 assumptions
2:47 answer
10:43 check answer
11:53 announcement
Up next
Man dies after horror Gold Coast house fire; high-speed Sydney motorway pursuit | 9 News Australia
9 News Australia
Watch →