Real Estate Scouting Tool - Full Python Project
Key Takeaways
This video demonstrates how to build a real estate scouting tool in Python, utilizing Bright Data's Zillow dataset and web scraper API to access up-to-date data and historical price information for specific properties. The tool filters data to be only from Austin, Texas, in the last couple of days and displays properties on a map with information such as address, price, number of bedrooms, and Zillow estimate.
Full Transcript
What is going on, guys, welcome back. In this video today we're going to build a professional real estate scouting tool in Python, so let us get right into it.

All right, so let us take a look at the real estate scouting tool that we're going to build in this video today. This is the final result. Here we have Austin, Texas, and as you can see, I can zoom in and look at a bunch of different properties. I can zoom in closer, and then I can click on individual properties to get some information like the address, the price, the number of bedrooms and bathrooms if present, the living area, the gross rental yield, the Zillow estimate (so how much the property is worth according to Zillow), and how much the rent is estimated to be by Zillow. Then we also have a link to the actual property if we want to take a closer look, and I can also show the price history if present. What this application does is it tries to get the price history; if it does not exist, it's going to give me a "no historic data" message, which is oftentimes the case, but sometimes I will get a table that displays the price history of the property, which can also be interesting for investment decisions.

As you can see, a lot of these properties have a black color, which means that they're off market, but some of them are not. For example, this one here is red. The color is determined by the gross rental yield, so how much the rent is in relation to the estimated property value. In this case it's worse, just 4.66%. Here we have an orange one, which has 6.39%, and then maybe we can also find some green ones, which have a higher yield. That is the basic tool that we're going to build in this video today. We're going to work with a lot of data, and we're going to be able to find properties like this. Of course you can customize this; you don't have to build the exact same tool that I'm building here, you can also
change a couple of things. I'm going to tell you where and when you can change certain things to adjust it to your preferences and your needs, but this is what we're going to build in this video today.

All right, now in order to make this an actually useful real estate scouting tool, we need to make sure that we have a lot of data, that the data is up to date, and that we have access to some endpoint that can give us information about the price history, if it exists, on demand for specific properties. Of course you can build a prototype of this with a small data set just to play around with, but if you actually want to make this a useful tool, you need a lot of up-to-date data, and you need access to an API or an endpoint that also gives you historical information.

This is why I partnered up with Bright Data for this video today; they're offering exactly that. For those of you who don't know Bright Data, they are an all-in-one platform for proxies, web scrapers, data sets and so on. If you go into their control panel once you create an account, you will see they have proxies, they have Web Scraper APIs, and they have data sets which we can use here, which are up to date and can be filtered on their platform, and this is what we're going to do in this video today. We're going to use one data set from Bright Data, the Zillow data set, and we're going to filter it to be only Austin, Texas, for the last couple of days. Then we're also going to use their Web Scraper API to make requests about historical price information, if it exists, for a specific property. Again, as I said, you can start by building a prototype with a small sample; you can also get a sample from Bright Data's data sets for free. But if you want to build this into a useful tool that you actually use for real estate scouting, it makes sense to have a lot of data, it makes sense to have up-to-date data, and also an
access to an API that gives you historical information.

So what we're going to do first is create an account on Bright Data. I'm sure you can do this on your own, so I'm not going to show you how to do that. You probably have to set up some billing and payment if you want to use the actual data set, but as I said, you can also get a free sample. We're going to go to Web Data and then to the Zillow data set, so just look up Zillow, or click on Zillow if you see it right away. What you can do then is get a sample: you can download a sample for free as a CSV file if you just want to have the structure, so you can build the exact project here using a sample if you don't want to start by purchasing some data right away.

What you can do here in general is go to all records, click on filter, and then say that you're only interested in a specific subset. So I can click here, call it "my fancy subset", and then say what I'm actually interested in. For example, I could say city is equal to New York, or actually let's say something like San Francisco, and then I can say I'm also only interested if the number of bedrooms is lower than four. You can do whatever you want, but basically when you create that filter, it's going to show you how large the data set is. It's going to tell you that with this configuration you're going to get this amount of entries, and then you can calculate how many gigabytes that is and see if that's too much or too little. The basic idea is that you can filter a data set the way you want on this platform, and then you can download and purchase the subset if you want to. But again, for the sake of this video, you can start by just downloading the free sample to work with, and what I
did for this video is I filtered for Austin, Texas, so my data is Austin, Texas, of the last 5 days at the time I downloaded the data set. The basic idea is you either download the sample or you purchase the data set. When you purchase the data set, you're going to see it in My Data Sets, and you're going to see that it is ready to download, so you just go to the snapshot here, like this one, and you can download it as a CSV file. This is basically what I did. You can see in this case my snapshot, which is Austin, Texas, over a 5-day span, has 2.24 gigabytes, which is manageable. If you have a large data set, you can also export it to Amazon S3 or to other storage providers, so there are a lot of options to get the data. In this case I chose a small sample and downloaded it as a CSV file so that I can work with it locally. So that's the first step; this is what I already have.

What you also want to do later on for the API requests is go to account settings, go to user access, and create an API token so that you can use it for the API. We're going to talk about this later on, or actually I can show you how to do that right away. You go to Web Scraper API and look up Zillow, and in this case I went for price history. The basic idea is that you provide the URL of a property, which you also see in the data set, then you make an API call for that URL, and you get the historic price information if it's present. You can see how this is done: you can say, for example, I want the data as JSON input, so you just provide a couple of URLs. Here it's written as a curl request, but we're going to do it the same way using Python instead of curl, and the basic
idea is we authorize ourselves with the API token which we created, and then we just pass a list of URLs. As a result we're going to get a snapshot ID, and with this snapshot ID we can then use another curl request, or a request in general, to download the snapshot, for example as a CSV file; I can specify CSV here. We're going to do all of this in Python, and we're going to model all of this as a Flask application.

So let us get started. I'm going to start by creating a file called app.py, which is going to be our Flask application. For this video we're going to need a couple of packages, so open up your command line and install the following packages if you want to build this as well. Again, as I said, you can work with a free sample and do the same thing that I do here; it's just going to be less data and probably not exactly the data that you want. I think just the API request is not going to work, but you can follow along like this as well. So pip or pip3 install folium for the map visualization, numpy and pandas to work with the data, requests to make the requests, and flask to build the web application. Make sure you have these packages installed.

Then we can get started by importing a couple of core Python packages: import os, import json, import time, and from functools I'm going to also import cache. Why do I import all these things? Let me briefly explain. json to work with JSON objects, with certain dictionary-type objects in the data; os to see if certain files exist; time in order to wait, because when we create the snapshot we cannot download it right away. We need to wait for 5 seconds and try to get it; maybe we get the message that the snapshot is not ready yet, so we wait maybe for 10 seconds and load it again. Instead of crashing the script we can wait, so we use time for that, and we use
cache in order to not have to make the same request for the price history all over again, because otherwise, if I request the price history, get it, and then click away, I would have to do it again. This way I can just use the cached response. So this is why we need these core Python packages. We're of course going to use requests to send, in Python, the request that we saw in curl, and we're going to use numpy and pandas to work with the data set, so import pandas as pd and numpy as np. We're going to use folium for the map visualization: from folium.plugins we're going to use the MarkerCluster class, and from folium we're also going to use MacroElement. Now, I'm not actually sure if I need all of these, so maybe we're going to delete some of them, because I played around with creating the map and maybe we don't use this one anymore, but I'm going to see if that's the case later on. Then we of course also need to import Flask, so we say from flask import Flask for the application, and we're going to import render_template and request. These are the imports for our application, and now we can start by saying the application is Flask of __name__.

By the way, what you see here already is that I have the data set as a CSV file, so this is Zillow Austin TX. This is basically the CSV file I get from Bright Data when I choose my subset and download the snapshot. In your case this could be either your snapshot or the sample that you downloaded. I also have a file, token, which just contains my API token.

Now, at the beginning of the application, and this will take some time, we're going to load the data set. What you can also do is explore the data set if you want to. For example, I can go into my directory and start JupyterLab. If you want to do that before you get into coding
you can do that. So we can just create a notebook here, let's call it main, and explore the data. We can say import pandas as pd, and then data frame equals pd.read_csv of the Zillow Austin TX CSV, because if you don't know what the data is about, of course you need to first look at the columns, at the data types, at the different things that we can work with. We do have a lot of features in this data; the data set is quite comprehensive, which is of course good, but we also need to know what it is that we actually need from this data set.

If I look at the data frame here, I do have this irrelevant Unnamed column, and then I also have the zpid. This is important: it is a unique identifier for a property. I guess it stands for Zillow property ID, but I'm not sure about this. Then also city and state, obviously Austin, Texas, here, and then we also have stuff like the home status and the address, which is going to be important. Specifically here, this is why we need the json package: we have a dictionary with multiple fields, and we want to have the street address. Then we have stuff like bedrooms, bathrooms and so on.

I'm not going to do the data set exploration here because it would just take too long to look at everything, and I want to get into coding, but it makes sense to list the different columns that we have and to discuss which ones we're going to use. I'm going to briefly summarize it for you. As you can see, we have a bunch of columns, a lot of them, and we're not going to look at all of them. In particular, we're going to need the identifier for identifying specific properties when we want to get the price history. We're going to look at the address to get the street address, we're going to look at the price, and we're going to specifically look also at the Zestimate and the rent Zestimate. These are basically Zillow estimates: this is the
property value estimation and this is the rent estimation, so how much the property is most likely to be worth and how much the rent is most likely going to be. The problem with the price is that sometimes it's the rent and sometimes it's the property value, so we cannot always use it. Because of that we're going to use the estimates: this is always the property value and this is always the rent. Of course we're also going to need the longitude and latitude, the coordinates, to make the visualization possible. Are we going to use anything else? For the markers we're obviously going to use bedrooms and bathrooms, so that we have some information to display there.

Besides that, we're going to craft our own features. For example, the gross rental yield is just going to be the annual rent divided by how much we think the property is worth, and the annual rent is just the rent estimate times 12. So we're going to craft our own features here for the visualization. But of course, as I said in the beginning, you can customize this to be whatever you want. Maybe you want the color of the markers to depend on the number of bedrooms, or maybe you want it to depend on something related to taxes. There are a lot of features in here, and you can work with them whatever way you like. I'm going to show you one example of how you can craft this tool, but you can of course adjust this and have it completely customized; you just have to use different features, different ranges and so on. As I said, we're going to use the property ID, the address, the bedrooms, the bathrooms, the price, the longitude and latitude, and also the estimates. So that's basically it.

All right, so I'm going to start by loading the data frame into the application. We're going to say pd.read_csv, and we're going to load the Zillow Austin TX
CSV file, and then our application is basically only going to have two endpoints. You could structure this in a different way if you want to, but my idea here is to have an index endpoint where I can just look at the map, and another endpoint where I can ask for the price history.

Our index endpoint will behave differently depending on whether we already have a map or not. The first time we run this, it's going to take quite some time to create the map, but once the map is created we can basically just load it immediately. It would probably make sense to have some update algorithm, but in this case I'm just going to say that once we have the map, we're going to keep the map. Of course, if you change the data set, if you for example get another snapshot from Bright Data, it makes sense to also delete the map so that it gets rebuilt.

But let's get started. We are going to say app.route with the slash route; this is going to be the index function. And here, this is why we need the os package, we're going to say if os.path.exists, and we're going to look for the following path: we want to know if templates slash property map.
HTML exists. Now, this is going to be our map in the end; this is going to be what we craft with folium. If we don't have it, we want to create it; otherwise we just want to load it. So if it does exist, we're going to say return render_template and then just the property map HTML. The reason we don't provide templates here is because templates is the default template directory of Flask. If you want to change that, you can say template_folder equals something else, but by default it's templates, so if you provide the property map HTML, it's automatically going to look in templates for it. So if it exists, just load it, just return it; otherwise we need to go through the process of creating it.

This is going to be quite comprehensive, so we're going to write a lot of code here for a single video. We're going to start by dropping all the rows where we have NaN values, so missing values, for the coordinates, because we don't want to visualize any property when we don't know where it is. We only want to visualize the properties that we can actually put on the map with coordinates. So we're going to say data frame is equal to data frame dropna, so drop the NaN values, considering the subset of longitude and latitude. Now what's the problem here? This should actually work. I mean, it does work, so I'm not sure why it doesn't think it works. Maybe we can do global df. I think this shouldn't make a difference, and I think it also works if we don't do that, but let's just do it. So we're going to drop the rows that have NaN values for the coordinates.

Then we're going to craft a couple of features. First of all, we need to make sure that the rent Zestimate, the normal Zestimate, and the price are actually numeric. So we're going to say that the data frame's rent Zestimate is going to be equal to pd.to_numeric of df
rent Zestimate, and we're going to say errors is equal to coerce. Now I'm going to do the same thing for the Zestimate and for the price; obviously we need to change the column names here as well, and then we have these features as numerical features.

Based on these features we're now going to craft additional features. We're going to say that the annual rent is equal to the rent estimate times 12, so df rent Zestimate times 12, because the rent estimate is per month, so times 12 for 12 months gives us the annual rent. The gross rental yield is then just the annual rent divided by the estimated property value. You can also do it with price if you want to, but the problem is that price is sometimes also rent, so the easy way is to just use the Zestimate. So I'm going to say the gross rental yield is equal to the annual rent divided by the Zestimate, times 100, and this will give us a number like 6 point something or 4 point something, just a gross rental yield.

Now, this might lead to some divisions by zero or something like that, so we want to replace infinite values with NaN values. I'm going to say gross rental yield is equal to gross rental yield, and we're going to call replace on the following list: np.inf and negative np.inf will be replaced by np.nan. So we want to have NaN values there if the value is infinite, because otherwise we can get some bad coloring, and we don't want that. We're just going to say that if it's NaN, we have a special color for it.

This brings us to the next function. The color of our markers, so of the individual property buttons that we can click, is going to be determined in this example by the gross rental yield. Again, this is a place where you can customize it the way you want to. Maybe you want
to say the color should depend on something else. If you want, for example, to make the color depend on the Zestimate, you can do that; just make sure you have reasonable thresholds. What I'm going to do here is define a function get_marker_color, and this get_marker_color will get two arguments: first the gross yield, and second whether the property is off market. This is a feature of the data set; I think I forgot to mention it. There should be a feature somewhere here called is off market. There you go, is off market, and this tells us if the property is off market or not, so if we can buy it or not. If it's off market, I want to have a special color. Maybe you don't want that; maybe you want to consider the gross yield regardless of whether the property is off market, maybe you want to contact the owners and say, hey, do you want to sell it? But I'm going to say that if the property is off market, I return the color black.

Otherwise, if the gross yield is NaN, so pd.isna of gross yield, I want to return gray. In all the other cases I'm going to focus on the value. If the yield is less than five, so less than 5%, I return red; if the yield is less than eight but not less than five, I return orange; and otherwise, so at 8% or above, I return green. Whether these values make sense or not, you need to define that. I'm not a real estate agent, I'm not a real estate expert; you will set this up the way you want to, with your own thresholds and your own criteria. The basic idea is you get a function, you pass some of the features into that function, and then you return a color depending on your analysis. You can have your own complex logic here, comparing 50 different features and 100 different cases and then returning 100
different colors; that's up to you. But this is the basic idea: we have a function get_marker_color that we pass information to, and this function will then give us the color of the individual markers.

Now we get to the map visualization. The map visualization starts with us centering the map at a certain position. For this I'm going to define a map center, which is going to be a list of two coordinates, very simple: we take the mean of the latitude and the mean of the longitude, so we just center at the mean position of all the data points. We could also go with a median; if you have multiple states, a median might be more reasonable, but we're going to go with the mean here.

All right, then we create our map. m is going to be equal to folium.Map, and we pass the location we want to focus on, which is the map center, and we want to start with a zoom level of 12. Again, this is also something you can adjust depending on how far you want to zoom out; if you have multiple states and multiple cities, it might make sense to decrease this number. Then we're going to create our marker cluster. This is going to be the collection of our markers, a MarkerCluster object, and we're going to add this marker cluster to m, to our map.

Now we're going to go through the individual data points of our data set and create the various markers. So I'm going to say for index and row in data frame iterrows, so I iterate over the rows and get the information that I need. I'm going to say price is equal to the price field of the row, and it's basically the same thing for the rest: I'm going to have the price, the address, the bedrooms, the bathrooms, and the living area.
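The yield calculation and the color thresholds described above can be sketched without any third-party packages. This is only an illustration under stated assumptions: the video works on pandas columns with pd.to_numeric and replace, whereas the helpers below (to_number and gross_rental_yield are names made up for this sketch) operate on single values, and "gray" is assumed to be the color returned for missing yields.

```python
import math


def to_number(value):
    """Coerce a raw value to float; unparseable input becomes NaN.

    Mirrors the idea of pd.to_numeric(..., errors="coerce") for one value.
    """
    try:
        return float(value)
    except (TypeError, ValueError):
        return math.nan


def gross_rental_yield(rent_zestimate, zestimate):
    """Annual rent (monthly estimate * 12) as a percentage of estimated value."""
    rent = to_number(rent_zestimate)
    value = to_number(zestimate)
    if math.isnan(rent) or math.isnan(value) or value == 0:
        return math.nan  # missing data or division by zero: no usable yield
    result = rent * 12 / value * 100
    # Treat infinite results like missing ones, as the transcript does.
    return result if math.isfinite(result) else math.nan


def get_marker_color(gross_yield, is_off_market):
    """Thresholds from the video: <5 red, <8 orange, otherwise green."""
    if is_off_market:
        return "black"
    if math.isnan(gross_yield):
        return "gray"  # assumed color for properties without a yield
    if gross_yield < 5:
        return "red"
    if gross_yield < 8:
        return "orange"
    return "green"
```

For example, a property with a rent estimate of 2000 per month and a value estimate of 400000 has a yield of about 6%, so it would be drawn in orange unless it is off market. The thresholds are the speaker's own and are meant to be replaced with your criteria.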
this is also one of the features that we're going to use here, living area, and we're going to use the gross yield which we calculated. This is all the information that we're going to display in the popups as text, so this doesn't have to be related to styling, but when we click on a button we want to see this information. Again, feel free to include all the information that you need. So gross yield is of course going to be the gross rental yield, then we're going to say that the Zestimate is equal to the Zestimate, and the rent Zestimate is equal to the rent Zestimate. Finally we also want to have the property URL, so that we can have a link to the property, and the ID, so the zpid, which I think again stands for Zillow property ID. This is a unique identifier, so this is what you can use to identify the same property across multiple data sets if they are Zillow data sets.

All right, so now we're going to say if pd.isna, so if we don't have a price, or actually we're going to do it the other way around: if not isna, we're going to format the price a certain way. We want to have a dollar symbol and then the price with two decimal places. Otherwise we're just going to say that the price is missing, so the price formatted is going to be equal to the string N/A. We're going to do a very similar thing for the Zestimate, so here, here and here we change this to Zestimate. Now of course we don't want to have this as a string; this is a mistake. We want to have the field that we got, so the series object. So we're checking if this is NaN, a non-existent value or an invalid value, and we also want to do the same thing for the rent and for the gross yield, so we want to
make sure that the numbers that we're displaying are either N/A, which I think usually stands for not available, so we display this if they're non-existent, or otherwise displayed properly. So rent Zestimate, rent Zestimate and rent Zestimate, and finally the last one is the gross rental yield. Gross yield is what I called it, I think, so gross yield formatted, gross yield, and gross yield formatted. All right, so that should be it: the numerical values that we're interested in are either N/A or not.

Actually, we can also do the same thing as a one-liner here, since we don't really need to format them: we can say bedrooms is equal to int of bedrooms if not pd.isna of bedrooms, else we just say N/A. Now we can copy the same thing for bathrooms and for living area. There you go. All right, so I think that should be it.

Now we're going to get into the HTML stuff. What we want to have is a div box, basically, which... let me just see if that makes sense, I need to double check with my prepared code. No, actually we don't need this anymore, because I had a different approach when I was preparing the video. The basic idea now is that we get the street address from the address, because remember, the address (do I still have it opened up here?) is a dictionary object, and we want to get this street address field for the popup. So we're going to do that, and then we're going to create the popup text. The popup text will be HTML code, and we're also going to add JavaScript code in there, because in this popup we want to have a button that requests the price information from our Flask application and from the Web Scraper API, so we need to actually have some functionality in there as well. But let's get started with the address. So the address is going to be equal, or rather the address
dictionary, let's call it, is going to be equal to json.loads (so loads with an s, not json.load), and here we pass the address text that we loaded. Then we're going to say that the street address is equal to address dict at the street address key. Now, I'm not actually sure if it's stupid that I'm doing it like this, because there is actually a field street address, so chances are that we have this information already. You can try it out; I'm not going to experiment now because this works for me, but chances are you can just get this value using the street address column.

Now we're going to design the popup text. The popup text is going to be a multi-line string, and we're going to have a bunch of fields here. We're going to have b for bold; this is now just styling, so this is going to be in bold text. We say address, and, this needs to be a formatted string, so an f-string, we have the street address here, and then we want a line break. Like this we're now going to add all the different fields: price, bedrooms, bathrooms, living area, gross rental yield, the Zestimate and the rent Zestimate. We also replace the values here, or actually we need to use the formatted ones. So we use the price formatted; bedrooms, bathrooms and living area work like this; for gross yield we want to use the formatted one as well; and here we want to use Zestimate formatted and rent Zestimate formatted.

We also want to have an a tag here that goes to the following location: the row URL, or actually I think I got this as property URL, so we say property URL here. The important thing is that we say target equals _blank. Why? Because we want to open this in a new tab; we don't want to open this in the iframe, which is the
popup. So the popup is an iframe; I don't want to click on the popup and then be redirected to the property inside the popup. I want to open this in a new tab, which is why we use this target _blank. And here I'm going to say Zillow link and another line break.

Then finally we have our button. The button will have the ID button-index. This is important because the index here comes from iterrows, to have a unique identifier, even though I'm not 100% certain that this even matters, because we have these different iframes and I don't know if they even relate to one another in the HTML code. But we're going to go with button-index here, and we're going to say on click we want to execute the following function: show loading and redirect is what I call it. We pass the index and the Zillow property ID, the zpid, as a string, so we add single quotations around it, not around the index but around the zpid. And then we say show price history as the button text.

Now what is the idea behind this? What are we trying to do here? The basic idea is that when I click the button, I want to use JavaScript code to show a loading animation, but I also want to use the JavaScript code not to send a request but to redirect to the API call, so to actually go to the page. The idea here is not to get the information from the endpoint and then display it, like a JavaScript request would; the idea is to redirect the iframe, which is the popup window, to a Flask endpoint which renders an HTML file and shows the table with the price history. So we actually want to go there, we don't want to make a request, which is why I call this redirect. So we show the price history like this, and then we want to have a div with ID equal to (I need to use double quotations here) loading-index again.
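The missing-value formatting and the address parsing described above could look roughly like this. It is a sketch rather than the code from the video: the video checks values with pd.isna, while this version uses only the standard library, the helper names are invented for illustration, and the dictionary key street_address is an assumption about how the address column is structured (check your own CSV for the exact key).

```python
import json
import math


def is_missing(value):
    """True for None or NaN, the two 'not available' cases handled here."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def format_money(value):
    """Dollar sign plus two decimal places, or 'N/A' when the value is missing."""
    return "N/A" if is_missing(value) else f"${value:.2f}"


def format_percent(value):
    """Yield with two decimals and a percent sign (the format is an assumption)."""
    return "N/A" if is_missing(value) else f"{value:.2f}%"


def format_count(value):
    """Whole number for bedrooms, bathrooms, living area; 'N/A' if missing."""
    return "N/A" if is_missing(value) else int(value)


def street_address(address_json, key="street_address"):
    """Parse the JSON text of the address column and pull out one field.

    The key name is hypothetical; pass the one your data set actually uses.
    """
    return json.loads(address_json).get(key, "N/A")
```

With these helpers, every value that goes into the popup is either a clean string or "N/A", so the HTML template never has to deal with NaN directly.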
again, I'm not sure if this actually makes a difference, but we're going to set the style, this is important now, to display none. In this div box we want to have an image, and the image source is something I got from Wikipedia. I'm going to copy this from my prepared code because I don't want to type it out. It is a loading animation GIF; you can also use any other GIF, but this is just a rotating circle from Wikimedia Commons. Then I say the alternative text is loading, the width is going to be equal to 50, and the height is also going to be equal to 50. So that's just an image tag showing a loading animation. It doesn't show in the beginning; the idea is that you click the button and the function that we're calling will display it and hide the button, which is why we need the IDs. We need to hide the button, show the div box, and then we want to be redirected once the response arrives. So we add a script tag. This is going to be a JavaScript script; I don't think we necessarily need to specify that, so let's see if it works like this. In it we define the function show loading and redirect, which takes the index and the zpid, and we open it using two curly brackets. The reason we use two curly brackets and not one is that otherwise, inside the f-string, this would be considered Python code, so we use two to make sure it is actually a JavaScript curly bracket. Then we say document.getElementById. By the way, I don't really enjoy coding in JavaScript, CSS, and HTML, so this is very, very basic core JavaScript code. Those of you who know how to code JavaScript well, who know how to use jQuery
well, do it the way you want. This is just me coding a basic popup with some information, a button, and a link. You can also be creative here and add some more stuff. We want to get the button with the index, so I'm going to say idx here, get the style of this element, and set its display to none, so we basically remove the button. We do the opposite for the loading element: here we set display to visible, or actually to block. Then we set window.location.href, so we change the location of the current window, to the following: http, localhost, port 5000, slash, the price history endpoint, and to this we add the zpid as a string. The reason I do this is that the application I'm running here runs on localhost, port 5000. If you run it on some other port, or separately somewhere else, you need to adjust this to the actual Flask application and the actual port. Price history is just the name of the endpoint that I'm going to choose. So we have one endpoint here, the index route, and we're going to have a second one with the route price history that takes a parameter, the zpid. So what this function does is remove the button, show the loading animation, and redirect the iframe, not the entire application but the iframe content, to that endpoint. In that endpoint we're going to display a table containing the price information, if it exists. All right, what else do we want to do? We want to set the color for this specific marker. Remember, we're in a loop iterating over all the rows; all of this is done for every single data point. So the color for this specific data point is going to be the result of calling our get marker color function,
passing the row's gross rental yield and the row's is off market value. Again, if you have your own logic with different parameters and different branches, you just pass whatever your function needs and you get a color depending on your logic. Now, finally, we say folium.Marker: we want to create a marker, and its location is obviously going to be the latitude and longitude of this row. The popup shown when you click on it shall be a folium.Popup element containing a folium.IFrame, which contains the popup text and has a width of 300 and a height of 250; again, you can adjust this if you want to. Then we want to have an icon, so how should this marker look? We use a folium.Icon: the color is the color that we get from our function, the icon is home, just because it's a property (you can of course change this), and the prefix is equal to fa. I think this is because we're using these specific fa (Font Awesome) icons, which is why we need to specify it. Then we add this marker to our marker cluster, and this is basically it. Let's briefly repeat this: we drop all the NaN values, we make sure we have numbers, we calculate two new features, we define how to get the marker color, we create a map centered at the mean location, we create an empty marker cluster, and then we go through all the individual rows of the data frame to extract some information and format it properly. Then we create the popup. When you click on it, this is what you're going to see: all the information displayed here, a link to the Zillow URL, and a button that shows a loading animation when we click it and redirects to another endpoint. We add all of this together, and in the end, after this loop, so once we have all the data points
covered, we say m.save, and we save this to the templates folder as the properties map HTML file. This is what we're checking for in the beginning, remember: if it exists, just render it; otherwise, the first time, we have to create it. As I said, if you update the data frame, if you get a new CSV file after a month or something, then you also want to delete the HTML file so that the map is generated again on the next run. Then we return render_template with the property map HTML file. The larger your data set, the more time this will take. It already takes quite some time here, maybe a minute or so on my system with this 2 GB data set; with a larger one it's going to take longer, but as I said, you only need to do it once, at least until you swap the data set. So this is how we create and display our map. The second part is the price history. For this we say app.route, slash, price history, slash, and then our int parameter, which is the zpid. So we take the string from the URL and it is typecast automatically into an integer, which the zpid is. We're also going to cache this function. The reason, again, is that we want to make an API request, but if we made one already then we don't need to do it again; we're not necessarily going to have a different price history 10 seconds afterwards. If you want to get fancy, you can also implement your own temporal cache that expires after a day or something, but we're just going to go with a simple cache here. I'm going to call this function price history, and it obviously takes the zpid from the URL. In here we basically want to do what the curl request here is doing, but in Python. So we have this endpoint, api.brightdata.com/datasets/v3/trigger, and then the data set ID, which is for the Zillow data set. Of
course, if you have a different application with a different data set, it's going to be a different data set ID; in the case of Zillow it's this ID here. So we're going to model this in Python now. We start by saying that the URL, the property URL that we want to get the price history for, comes from the data frame row where the data frame's ID is equal to the zpid that is passed in; from this we take the URL values and index zero, so the first one. Of course, if you pass a zpid that does not exist, you're going to get a problem, but this should not happen. So that's the URL. Then we also want to have the API URL. This is again something I'm going to copy from my prepared code, but you can also just copy and paste it from the curl example here. That is the endpoint we're going to target. Now we define the headers: we say Authorization, and here we need to use our API key, so we use a formatted string with Bearer, because this is going to be a Bearer token. We should actually load the token before that, so token is going to be equal to opening the token file in reading mode and reading the content. Make sure you don't have any line breaks in there; you just have one single line with the API key and that's it. Here we now use that token, and the content type should be application/json. What we do next is say that the data we want to pass is just a list with a single entry, which maps url to our URL. Again, this is what the example here does as well. You can do this with multiple properties too, but since we're clicking on only one property at a time, we just need a single URL. That's the structure that you see
here as well. So this is our data, and then we just have to say that the response is equal to requests.post. Make sure you use requests, not request, because request is from Flask; we're using the requests package, not the Flask request object. So we post to the API URL with the headers that we defined and the JSON data that we created. Now, this response will give us a snapshot ID. It will not give us the data, it will not give us the price history; it gives us a snapshot ID, and we need to do a second request to actually download the data. So I'm going to say snapshot ID is equal to response.json(), and from this we get the snapshot ID. And this is why we need time: I would like to sleep for 5 seconds, because usually the snapshot is not available immediately, so instead of trying immediately we wait for 5 seconds and then make the next request, which downloads the snapshot. For this we have a different API endpoint, so we use this one here and download the snapshot with the ID that we got. So we say API URL is equal to, and I'm going to copy this again from my prepared code, api.brightdata.com/datasets/v3/snapshot, followed by our snapshot ID, this is important, this is what we get from the response, and the format is equal to CSV. Then the headers are going to be the same, but we don't need the content type JSON because this is going to be a GET request; we just need to provide the token. The response this time is going to be equal to requests.get, so we're sending a GET request to the API URL with the headers. Then we can get multiple kinds of responses. One response that we can get is Snapshot is empty. This is the literal string written there by Bright Data, and if we get this string, it means that there is no historic data
for this property. Another message we could get is Snapshot is not ready yet, try again in 10 seconds. If we get this one, it means that it's not done loading yet, so either there is no data, or there is data and it's being loaded, but we need to wait and try again. The other response we can get is data, so the actual CSV data. So we say here: if Snapshot is empty, written exactly like this, is in the response text, we return no historic data. Otherwise, and I do this immediately as a while loop, I don't even use an if statement here: while Snapshot is not ready yet, comma, try again in 10 seconds, and it needs to be exactly like this because this is the exact message we're waiting for, is in the response text, we just say time.sleep for 10 seconds and try again, so basically we copy the request and do it again. And if at some point it says Snapshot is empty, we do the same as before, because chances are it will say it's not ready yet, you try again, and then it still turns out to be empty. In this case we still need to return no historic data. Otherwise we retry over and over again, and at some point we hopefully get some data. Then we say with open, temp.csv, in writing mode, or actually in writing bytes mode, as f, f.write of response.content, because .content is a byte stream, so we need to write bytes. This saves the CSV data into a file temp.csv. Now we say price history data frame is equal to pd.read_csv of temp.
csv, and then price history data frame is equal to price history data frame with just the date and price columns. Then I want the date column of the price history data frame as pd.to_datetime, because this is usually a string, so we turn it into a date object. And then I actually want to turn it into a nice representation, so I set the date column to price history data frame date .dt.strftime, and the format should be a basic year, month, day: %Y-%m-%d. The m and d need to be lowercase; the Y needs to be uppercase. Finally we can return render_template with a price history HTML file, which we don't have yet, and we pass the price history data frame to it as a parameter to process. All right, and the final thing: if __name__ equals __main__, then we say app.run with debug equals True. Of course, if you deploy this you're going to do it differently; I have a couple of videos on my channel showing how to deploy an application properly. But that is now our code. The last thing missing is the templates directory, and in it we want to create a price history HTML file. We're going to keep this simple; it's not going to be complicated. In this file we import Bootstrap. I'm going to copy and paste this as well; we just use a content delivery network for Bootstrap. You can type this out if you want to, or you can copy it from somewhere else, but it's basically stackpath.
bootstrapcdn.com/bootstrap/4.5.2/css/bootstrap.min.css. Depending on when you're watching this video, there will be a different version, so it doesn't really make sense to give you a specific line here; just get Bootstrap from somewhere. We don't really need a title for this. Then, in this HTML document, we want to return a table. The table is going to have the classes, these are Bootstrap classes, table, table-striped, and table-hover, and it contains a thead, so a table header, whose class is thead-dark. In it we have a table row element, and we say for column in price history data frame .columns, closed with endfor, and inside the loop we have a table header field, th, containing the column. Then in the table body, so tbody, we display the data. How do we do that? We say for row in price history df .values, closed with endfor, and here we again have a table row, and for cell in row we display a table data element, td, with the cell. So that's basically what we return to the popup. Now, that is it, I think. We wrote quite a bit of code, actually. As I said, we don't need this, and we actually also don't need this. This is the whole thing now. With that much code written, I'm sure I made a mistake, but the idea now is to run this application. Running it will take some time, first of all because we have to load the data frame, which already takes some time; second of all, we're going to have to onc
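The snapshot polling described in the transcript (download the snapshot, return no historic data if it is empty, wait and retry while it is not ready, otherwise treat the body as CSV) can be sketched as follows. This is a minimal sketch, not the code from the video. The names poll_snapshot and fetch_text, the max_attempts limit (the video retries indefinitely), and matching only the first part of the not-ready message are assumptions made for illustration; fetch_text stands in for the requests.get call so the example runs without network access or an API key, and the canned CSV row is made up.

```python
import time
from typing import Callable, Optional

# Status messages as quoted in the transcript; the exact wording and
# capitalisation returned by the real API are assumptions here.
EMPTY_MESSAGE = "Snapshot is empty"
NOT_READY_MESSAGE = "Snapshot is not ready yet"


def poll_snapshot(
    fetch_text: Callable[[], str],
    sleep: Callable[[float], None] = time.sleep,
    wait_seconds: float = 10,
    max_attempts: int = 30,
) -> Optional[str]:
    """Call fetch_text until it yields data or reports an empty snapshot.

    Returns the CSV text, or None when there is no historic data.
    fetch_text is a hypothetical stand-in for the GET request that
    downloads the snapshot and returns the response body as text.
    """
    for _ in range(max_attempts):
        text = fetch_text()
        if EMPTY_MESSAGE in text:
            return None  # the "no historic data" case
        if NOT_READY_MESSAGE not in text:
            return text  # anything else is treated as CSV data
        sleep(wait_seconds)  # not ready yet: wait, then ask again
    raise TimeoutError(f"snapshot not ready after {max_attempts} attempts")


if __name__ == "__main__":
    # Canned, made-up responses instead of real API calls,
    # so the sketch runs offline.
    canned = iter([
        "Snapshot is not ready yet, try again in 10 seconds",
        "date,price\n2024-01-01,350000\n",
    ])
    print(poll_snapshot(lambda: next(canned), sleep=lambda seconds: None))
```

Checking for the empty message inside the loop covers the case mentioned in the transcript where a snapshot first reports that it is not ready and only later turns out to be empty. Passing the fetch and sleep functions in as parameters is a design choice for testability; in the Flask view you would pass a small function that performs the GET request and returns response.text.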
Original Description
Today we build a professional real estate scouting tool in Python. It uses a large Zillow data set and has access to historical data via an API. Both provided by Bright Data.
Bright Data: https://brdta.com/neuralnine
◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾◾
📚 Programming Books & Merch 📚
🐍 The Python Bible Book: https://www.neuralnine.com/books/
💻 The Algorithm Bible Book: https://www.neuralnine.com/books/
👕 Programming Merch: https://www.neuralnine.com/shop
💼 Services 💼
💻 Freelancing & Tutoring: https://www.neuralnine.com/services
🌐 Social Media & Contact 🌐
📱 Website: https://www.neuralnine.com/
📷 Instagram: https://www.instagram.com/neuralnine
🐦 Twitter: https://twitter.com/neuralnine
🤵 LinkedIn: https://www.linkedin.com/company/neuralnine/
📁 GitHub: https://github.com/NeuralNine
🎙 Discord: https://discord.gg/JU4xr8U3dm