What is Predictive Analytics? | Data Analytics | Community Webinar

Data Science Dojo · Beginner ·🧠 Large Language Models ·3y ago

Key Takeaways

The video discusses the foundations of predictive analytics, its importance in making proactive decisions, and its applications in various fields, including business, healthcare, and finance, using tools such as Excel, Python, and machine learning algorithms.

Full Transcript

So we have a very wide variety in our audience, and we are probably having a second, more techy part later. This one is for a general audience. We are going to discuss how we can use predictive analytics to describe events, understand what comes next, and leverage data. I'm dividing this conversation into three parts: first we are going to talk about this very crazy, hectic moment we are living in now, then how we can make sense of the data using predictive models, and finally how we can use code and also check our predictive analytics models. Fair enough? Okay. About me: I am a data scientist. I have many roles, I wear many hats. Where I work the most is probably in a tech company. I am also working in academia as a professor and as a researcher. I'm a consultant in development, and as I mentioned, I focus on smart cities: how we can improve or democratize access to basic rights using data and technology. So let's start with the first part. Again, feel free to stop me at any point. This is going to be a conversation, not a readout. We know that this is a very crazy and complex moment, right? COVID-19, inflation in almost every country, unemployment rising, a recession. And if that weren't enough, we have a war that started in February. Actually we have many wars, right? Russia and Ukraine is not the only war we are having now. That said, if we look at the bright side, in the last two decades, starting in 2000 and probably even including the pandemic, we have fewer people dying from infections than from aging, fewer people dying from hunger than from obesity, and fewer people dying from violence than from accidents. Right? That is from Yuval Noah Harari, and that is thanks to technology.
Actually, we had a vaccine in 2020 thanks to predictive analytics, right? This company that we mentioned before brought out a vaccine leveraging technology and predictive analytics. So again, we are not just looking at models for forecasting; we are looking at models for finding cures. That is why this is so important. Socrates would say, the only thing that I know is that I know nothing. It is complex because we know what we know. We probably know what we don't know; that is the little that we know that we don't know. The complexity comes when we don't know what we don't know. It sounds funny, it sounds like a joke, but it is really complex, because you can even go to jail for a dumb mistake, for example when you are driving in a country that you don't know. That's why we need to understand what we know, and that's how we can leverage predictive analytics to move this group here and also this group here. If not, we are going to have many blind spots. If you remember this episode, the one where everybody finds out, where they were saying 'they don't know that we know they know we know', it is interesting, because when we say 'everybody finds out' we can fall into the everybody fallacy: we are saying something assuming that it is part of our universe. But there is a whole entire universe that first we need to understand, and then we need to move those unknown unknowns to the known unknowns. That's why we need to move on from descriptive analytics. Don't get me wrong, understanding the picture of what happened yesterday is a lot, because we need to make decisions based on data, right?
We need to understand descriptive analytics, the known unknowns, having the answers that we would have every month. But that is very reactive, right? As we mentioned before, we are living in a very crazy, hectic moment where reality changes every day. That's why we need to understand not only what is happening now in our company but also what is happening out there. That's why we need to capture, ingest, process, store and finally consume data inside and outside our walls. We need to have the ability to drill down and keep asking why, why, why until we reach the atom of the information: understand the unknown unknowns with predictive analytics, and not just have answers but also have more questions. That is how we become more proactive, and that is the new normal. I stole this from Dojo: we can have the best ML model, we can have the best tools, but if we have dumb data or we don't have enough data, it is worthless. It's not just about the model; it's about the model and the data and the technology. So, first takeaways: understand the context, what is happening out there, and align our data to that. Use external data more efficiently. We need to go and capture everything that is out there: not just our history, our history of sales, but also what is happening with our competition, with the government, in the world, with COVID, with the lockdowns, with everything. Because if not, we are probably answering a question that is not the question we need to answer, right?
So we need to modernize our data architecture, accelerate the creation of data ecosystems, understand where the blind spots are, capture the necessary data for each question, and finally identify and analyze the main indicators. Right? So let's go to the second part: how we connect the dots so that everything makes sense. Like Steve Jobs: when he was presenting, he would tell three different stories and finally connect the dots so that everything made sense. The most used BI and analytics tool, as we know, is Excel, right? Everyone is downloading data, crunching it with Excel and then uploading it again to the same system they downloaded it from. And this guy, even though he is retired, keeps making money, because this is the tool that everyone (again, going back to Friends and Phoebe) is using and relying on, because that's where they feel comfortable. But using Excel is like eating something that has expired, because you don't know where that data came from or when it was downloaded. That's why we need to use real-time data. When we talk about the data life cycle, we connect to the source systems, we store the data, we do our analysis and we finally do our predictions. The point is that traditional prediction models would use traditional relational databases. But as we know, relational data, data in the form of rows and columns as we knew it in the past, is only about 10 to 20% of the data that is out there. The other 80 to 90% is unstructured data: data in the form of pictures, videos, JSON, WhatsApp messages, emails, you name it. So we are missing the big chunk of the picture. When we store data, we would do something like this. And when we do analysis, we do something like this.
And when we get to prediction, for some companies, for some people, it is utopia, right? The good news is that we can kill Excel and start creating prediction models, accelerate and automate the flow of the data, and eliminate the middleman. And we need to make sense of the data. No one is going to give you something in the same way that you found it in the past, because the bad news is that the world is not as linear as it used to be. This is a very good example. In the 2000s we had people who thought that real estate prices would keep growing and growing forever. Same with Bitcoin, right? So no one saw coming what we had in 2008. That's why we need to understand the context of what's going on in order to understand the patterns. If we look at Bitcoin, it is actually not different from Facebook, for example: if we compare them, both have the same shape. We are living in a bear market. Everything is falling, everything is down, and we are having this interesting moment in history where we can see patterns, we can see things happening, if we connect them with the context we are living in. Almost no one believed the world's largest banks would go bankrupt in 2008. Almost no one believed that big firms like GM would go bankrupt after that. And almost no one would have thought what almost all the governments would do in 2020. I grew up in Latin America. We have regimes in Latin America. There are countries like Argentina, where I grew up, that locked down an entire country for a year, and no one saw that coming, right? But also many people made millions connecting the dots, and keep making money. So that's something that we need to understand.
So everything needs to be put in context, right? Data-driven means answering the right question. Tim Ferriss would say, doing something unimportant well doesn't make it important: you are answering the wrong question well. So we don't need to have the answers; we need to understand which are the right questions. For that we need to connect our data with external data. We need to get data sources at more specific levels, understand how we can automate this, and have the ability to capture, filter, store, process and finally consume and predict. That's how we need to work with predictive analytics. It's not just creating models or downloading models from GitHub. So, when we talk about descriptive analytics (again, I know this is sometimes semantics), basically when we talk about the past we are talking about descriptive analytics. I'm a consultant, I see this every day: for some companies, understanding what happened yesterday is a lot, because they sometimes do everything with Excel, they don't know exactly how much they sold last month, and there are many challenges around that. That's why having information is a lot, but we need to work on the next level: optimization, predictive analytics, prescriptive analytics, cognitive, deep learning and so on. That's why analytics maturity has to do with the type of questions that we can ask. And also, when we move from Excel to predictive analytics, the degree of complexity is lower, so we can keep building more models easily.
We can rely on economies of scale and economies of scope, and we can keep growing and doing this more frequently with different models; it's a virtuous cycle, right? And at the same time the business value goes up. There's a rule that says the right timing for decision making is when you have between 40 and 60% of the information. When you have less than 40% you don't have enough data. When you have more than 60% you probably do have enough data, but by the moment you make your call it is probably late: you are answering well, as Tim Ferriss would say, the wrong question, or the question has changed. That's why you need to have systems in place, an architecture for doing this frequently, and rely on these predictive models. Leonardo da Vinci would say simplicity is the ultimate sophistication. After all the great inventions, the most innovative models are those that could simplify concepts. We don't need to reinvent them; sometimes we can leverage something that is out there. Again, I'm not showing code now, but we are probably having a hands-on session soon, playing with these models. Sometimes you can rely on the open source community, which has been growing very fast: in the last two or three years we have had a lot of breakthroughs based on the many things that the community is doing. That's why we need to simplify concepts and not reinvent the wheel. So takeaway number two: a culture of data is a culture of decision making. Whether we are data scientists, computer scientists, developers, DevOps, data engineers, whatever we do, we need to support our business so that it has that information for decision making. By the way, if we call ourselves data scientists, we need to have three pillars, right?
The first pillar of course is coding is computer science. But that's not enough. You cannot call yourself data scientist just because you are good at Python or you are good at R. You know, you need to have the second pillar that is math and stats. And the third pillar is having a a a domain or having that applied for business questions. If not, it's just something abstract that that you are doing some you are doing a theoretical work. That's why we need to have the at least something of the same, right? You don't need to be an expert in in stats or you don't need to be an expert, a business expert, but you cannot just call yourself a data scientist just because you took a boot camp on Python. Right? So we need the three pillars. Data culture, democratize access to data. Again the what we are doing at NYU what we what dojo is doing and what the the open source community is doing is is a lot right you can join competition you can join meetups but and keep this growing right data risk of course catalyst of culture and finally join talent and culture let's go to the last part and and Again, I would love to have more feedback questions and you can even open your mics, right? And we would love to have the same operation that that that we have with, you know, when we buy something via Amazon or we can or when we watch a show at Netflix or when we listen to something on Spotify. And that's something that we need to replicate with everything that we do, right? They they know exactly what we like because they they have our entire history, right? And they also have that is called what is called wisdom of crowds because they compare similar profiles with other people that like even even when we you start from scratch, right? That like the same songs or shows or whatever you like. So we need to replicate that with to our operation and with everything that we do, right? That that's the ultimate goal of predictive analytics applying for the business, right? 
To answer business questions. If not, we are going to keep crunching data with Excel until the point where we find out that our job doesn't need us anymore. Think about professions that we have today, for example a driver, a cab driver, an attorney, a banker. What they do is follow patterns; they are very good at that. A driver learns how to drive or which streets to avoid. What attorneys do is read a lot of material, a lot of history, a lot of books, until they find patterns. Same with the banker. If you think about it, it's the same thing that technology does, that ML does, that predictive analytics does: finding patterns. That's why these guys are very challenged. A car driver is not competing with one self-driving car; they are competing with a fleet of interconnected self-driving cars. That is a collective intelligence that is connected to the internet, and safer. So human intuition is actually pattern recognition. For example, let's say Sara goes to college. She studies medicine. After five years she becomes Dr. Sara, and she finally prescribes. What could possibly go wrong? If you are the first patient that Sara sees, she is probably going to practice with you. If you are lucky, she saw hundreds of patients before you and she's going to give you a good prescription. It is the same thing that predictive analytics and ML do. And of course at first, while you train your model, you are going to have false positives and false negatives. That's how it works: you have a business challenge, you train the model with historical data, then you test the model, and finally you move the model to production. Same here: the future of medicine is actually very interesting, right?
because everyone is going to have access to treatments, and it's going to be a democratic process. Today we have a lot of devices, a lot of sensors, a lot of wearables that we collect the data from. We have these models that can be used to predict different diseases, and finally analytics, predictive analytics, to show when we have challenges. By the way, this is not the future. This is already happening; this is what we see every day. So today we are having a very interesting moment. As we said, predictive analytics works by connecting the dots. We need to go and find our historical data, but that's not enough. We need to keep feeding our data sets with macro variables. We need to go to the government, to the industries, to the supply chains. We need to make sense of everything that is happening around us: COVID-19 data, because we are still having COVID-19 and lockdowns in some cities, plus the financial crisis. We can go back to other crises and former catastrophes and look for correlations in order to understand how this is impacting the current crisis. That's how it works. He who lives by the crystal ball soon learns to eat ground glass. So when we talk about predictive analytics, we have three components. The first one is the trend, where we see if something is going up, going down and so on. Then seasonality. Seasonality today is not just winter and summer as it used to be. We are having a very complex shape, not just talking about the weather per se, but about how trends change every day, right?
And finally noise. Arguably we can say that noise is all the information that we don't have. So when we ingest more data, when we collect more external data, when we capture data from other sources, there is going to be less noise. Predictive analytics can also support the many operations that we have in our company: customer acquisition, more related to marketing (targeting, segmentation, identification, onboarding); risk, more related to finance; debt; retention; and so on. We don't have much time, but look at this sequence of numbers: 20, 25, 30, 35, 40, 45, 50, something, and 60. Can anyone tell me what is missing here? Anyone brave? 55, correct. Yes. Very good. Same here. I'm going to help you. So basically what you guys did is detect patterns, right? We have the dependent variable (this should be Y, by the way) and we have the independent variable, and you found what the trend or the pattern is. Same here, same here. So basically you did the math and you understood how to connect the dots. That's how medicine works: give prescriptions for an understood challenge based on historical data. And if we go to data science, Python is probably the most used tool or language for coding. But today we also have something very interesting. Full disclosure: I'm a certified AWS engineer, I'm also a certified Azure engineer, and I also work with GCP. The clouds also offer solutions that are free, so you don't need to start by learning how to code.
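As a side note to the number-sequence exercise above, here is a minimal Python sketch of the same kind of pattern detection. The list, the `None` placeholder for the missing value, and the helper name `fill_missing` are assumptions made for illustration; they are not from the webinar, and the sketch only handles a constant step between neighbors.

```python
from collections import Counter

def fill_missing(sequence):
    """Fill missing values (None) assuming a constant step between neighbors."""
    # Collect the differences between neighbors that are both known.
    steps = [b - a for a, b in zip(sequence, sequence[1:])
             if a is not None and b is not None]
    # The most common difference is taken as the pattern (the "trend").
    step = Counter(steps).most_common(1)[0][0]
    filled = list(sequence)
    for i, value in enumerate(filled):
        if value is None:
            # Assumes the missing value is not the first element.
            filled[i] = filled[i - 1] + step
    return filled

# The sequence from the talk, with the unknown value marked as None.
sequence = [20, 25, 30, 35, 40, 45, 50, None, 60]
completed = fill_missing(sequence)
print(completed[7])
```

This is the same reasoning the audience did in their heads: learn the step from the known values, then apply it to the gap.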
You can focus on the other two pillars, which are math and stats and the business question. You also have on-prem solutions that you can install on your computer, like SAS, Alteryx or statistical packages. Sometimes the expectation is: collect the data, download something from GitHub, train the model and boom, you are rich. The truth is that, again, you need to learn math. It's not just downloading code, copying and pasting. That's the most interesting part of how we implement a predictive analytics model. Again, this is not part of our conversation, but when we talk about ML, the three most popular flavors are: supervised learning, where we use labeled data (that's why it is supervised: we train the model with historical data that we label, and we use that data to predict the next round); unsupervised learning, where we use unlabeled data; and reinforcement learning, where we learn by trial and error, like the robots, the games and so on. When we talk about predictive analytics, we are probably more focused on the first group. We use linear regression when we are predicting a number, for example how much Bitcoin is going to be worth, or what the temperature is going to be, and so on. Then we have binary classes. Again, sometimes it is a question of semantics. When we say binary, we can use a logistic regression, where we are predicting a value that is yes or no, or, I don't know, white or black, and so on. Or it can be a classification of a picture: is this a dog or not? Is this a table or not?
So that is a binary classification. And finally multi-class classification, which we can use for something like which day of the week it is going to rain, where we would have seven options, Monday, Tuesday and so on. Or we can classify a picture: when you upload a picture to Facebook, Facebook automatically detects your face and classifies it, because it has years of history, of pictures of you. We have different... Yeah, go ahead. Sergio, can I interrupt you for a minute? Sure. You have a question? This goes back to when you were talking about the different maths involved. Javier is asking which kind of math you would recommend studying: matrix algebra, stats, any other subjects? When we are talking about, for example, regression or logistic regression, that is linear algebra. So algebra is the first or the most interesting thing that we can study, or also calculus to start. There are many free books and many free courses. You don't need to pay a dime for this. Again, we all studied this in college, high school, elementary school, but we probably never paid attention to it, and I wish I had. The point is, I always say nerd is the new rockstar, right? Having this background gives you superpowers. But I would say three things. Stats first, because stats probably covers everything, and then, if you have time and you like it, calculus and algebra. Again, there are many free courses. All the universities have free courses on math and stats. You can take these courses, or you can go to YouTube, or you can read a book. There are many interesting books that make this fun. We all hate, or would have hated in the past, math and stats. But there are many books, for example Naked Statistics.
I bet that you can find that PDF somewhere. It is a very interesting book on stats that makes it very fun to learn or read something related to stats and math, right? Yeah, I think in our paid programs we would recommend a book called cartoon statistics or something like that. Yeah, I read that. Dojo has some programs or courses and material that you can probably research, right? Absolutely. Yeah. So again, the most used or popular flavors when we talk about supervised learning and predictive analytics are regression and classification. Classification: white, black, red, blue, dog, cat, these kinds of options. Or, again, with regression you can ask what the temperature is going to be tomorrow, and with classification, is it going to rain, yes or no? And so on. We also have ways to measure this. Again, this is not from machine learning. This comes from stats and math and different things that we already learned in the past. We can use mean absolute error, or we can use mean squared error, which is probably the measure that we use the most. When we talk about regression, what is probably used the most is forecasting: you are predicting how much you are going to sell next month, and there is a measure called forecast error. You measure the error by comparing what you predicted versus what you really sold, and from that you get the forecast error. The next month you retrain the model. So the model keeps learning, iteration after iteration, until the forecast error goes lower. Right?
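To make the regression measures above concrete, here is a minimal Python sketch of mean absolute error, mean squared error and a percentage forecast error. The sales figures are made up for illustration, and treating "forecast error" as the mean absolute percentage error (MAPE) is an assumption: the talk does not say which definition is meant, and companies use several.

```python
def mean_absolute_error(actual, predicted):
    """Average size of the miss, in the same units as the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_squared_error(actual, predicted):
    """Average squared miss; large misses are penalized more heavily."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def forecast_error_pct(actual, predicted):
    """Mean absolute percentage error (assumes no actual value is zero)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical monthly sales: what we really sold vs. what the model predicted.
actual_sales = [100, 120, 90, 110]
predicted_sales = [110, 115, 100, 105]

mae = mean_absolute_error(actual_sales, predicted_sales)
mse = mean_squared_error(actual_sales, predicted_sales)
mape = forecast_error_pct(actual_sales, predicted_sales)
print(f"MAE={mae:.2f}  MSE={mse:.2f}  forecast error={mape:.1%}")
```

Tracking one of these numbers month after month is what lets you see whether retraining actually lowers the forecast error.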
And, okay, we don't have much time, but when we talk about classification we have the confusion matrix, where you measure false positives versus false negatives and true positives versus true negatives. We have different ways to measure the model: you can measure accuracy, precision, recall and these different metrics that, when you study machine learning, you can keep practicing. Again, this is not machine learning; this is stats. And finally, understand how you can measure accuracy versus precision. There are probably some particular challenges where you can afford not to be so accurate and need to be more precise, and the same with the opposite. You can leave the accuracy and focus on the precision. And finally you have the ROC curve and the AUC (area under the curve), which is more complex but is what we use to measure our classification models. So we can have things like this, and so on. Probably the most popular data set is the flowers data set, which, again, you can Google and which is already on GitHub. So you can download the data set and the code and start playing with this model. Again, we are probably going to have a second part of this where we do a very hands-on session. And finally, when we talk about forecasting, for example, we have three components. The first component is the data, then we have the model, and finally the feedback. We have historical sales, for example, that we use to train the model. Once we train the model we have the prediction, and then we have the feedback that we use to retrain the model. So cycle after cycle the forecast error is going to be lower and the prediction is going to be more accurate. Right?
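Going back to the confusion matrix for a moment, the sketch below shows how its four cells lead to accuracy, precision and recall. The labels are invented for illustration (1 means positive, 0 means negative), and the helper name `confusion_counts` is hypothetical. Libraries such as scikit-learn provide equivalent functions; plain Python is used here only to keep the arithmetic visible.

```python
def confusion_counts(y_true, y_pred):
    """Return (true positives, false positives, false negatives, true negatives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

# Hypothetical ground truth and model predictions for ten cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)   # share of all cases that were right
precision = tp / (tp + fp)           # of the predicted positives, how many were real
recall = tp / (tp + fn)              # of the real positives, how many were found
print(tp, fp, fn, tn, accuracy, precision, recall)
```

In this made-up example the three metrics disagree, which is the point made in the talk: which one you can afford to give up depends on the business question.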
You have new patterns, you retrain the model and you get a lower forecast error, right? Same with classification. You can use classification to tag pictures, to tag different categories or to tag objects in a picture. Is this a car? Is this a bicycle? Is this a person? Is this a dog? It is something that we use a lot when we are working with self-driving cars, or with models that need to identify different objects in real time while they are working. That's why we don't yet have cars driving alone in the streets: we have the technology, but we probably need more accurate models. And finally, this is something that you probably know, but when we talk about predictive analytics we are focused on what we have at the top: we are going to be working with regression and classification. You have the link here, but scikit-learn is what we use the most for this. We can use it with Python, and there are equivalent packages for R. If the model doesn't learn, it is not machine learning, right? And in between we have very inaccurate models because, again, we don't have data. If we don't train the model with real data, with things that are happening out there, we are going to keep labeling cats as dogs. So, with regression, we make the prediction and we measure the distance between our prediction and the actuals. We have the dependent variable, with the same formula that we learned in high school, the formula of the line, and we split our data set into two parts: the first 70% is the training data set and the other 30% is used to test our prediction against real data. Right?
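The 70/30 split and the high-school line formula can be sketched in a few lines of Python. Everything below is an assumption for illustration: the data is synthetic (roughly y = 3x + 2 with a small wobble), the random seed is arbitrary, and the line is fitted with ordinary least squares written out by hand rather than with scikit-learn.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic data: a line plus a small deterministic wobble standing in for noise.
data = [(x, 3 * x + 2 + ((2 * x) % 5 - 2)) for x in range(20)]

# Shuffle, then keep 70% for training and 30% for testing.
shuffled = data[:]
random.Random(0).shuffle(shuffled)
split = len(shuffled) * 7 // 10
train, test = shuffled[:split], shuffled[split:]

slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])

# Distance between predictions and actuals on data the model has not seen.
test_mae = sum(abs(y - (slope * x + intercept)) for x, y in test) / len(test)
print(f"slope={slope:.2f} intercept={intercept:.2f} test MAE={test_mae:.2f}")
```

Shuffling before the split is fine for this kind of data, but not for time series, where the order of the observations matters.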
I use this a lot with time series, again for forecasting, probably the model that we use the most in the business world, which is basically predicting how much we are going to sell next month, let's say October. When October comes we compare our prediction with the number that we really sold, and from that we get the forecast error metric. For example, if we have a data set with 23 months, we will split it into 18 months for training and five months for testing. Same with classification. I think that we can stop here to discuss if we have questions, but anyway, we are going to have a target variable, and we are going to use math formulas to calculate how we come up with that prediction. For example, in a multivariate model we are going to have different variables and we are going to predict what else we can sell or what recommendation we can give to a customer, or whatever we are doing. When we work with predictive models, the first thing that we do, before going to something like this, is work with what we call a univariate model. That is basically only one variable: for example, we take our history of sales and we work with sales only. We are not going to add more variables. Once we have a very accurate model that we feel comfortable with, we can ingest more variables. Going back to forecasting, instead of having only historical sales we can add variables like, as I mentioned before, inflation, unemployment rate, cases of COVID, how many deaths we have, which cities have lockdowns. So instead of having just sales you are including more variables, so the model is more accurate. Right?
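The 18/5 split for a 23-month series can be sketched as follows. The sales numbers are synthetic, and the forecasting rule is the simple "drift" baseline (extend the average monthly change seen in training), chosen here only as a stand-in for whatever univariate model would really be used; neither comes from the webinar.

```python
# Hypothetical 23 months of sales: an upward trend with a small alternating bump.
sales = [100 + 2 * month + (3 if month % 2 else 0) for month in range(23)]

# Chronological split: the first 18 months train the model, the last 5 test it.
# No shuffling, because the model must never see the future it is asked to predict.
train, test = sales[:18], sales[18:]

# Univariate drift baseline: average monthly change over the training period.
drift = (train[-1] - train[0]) / (len(train) - 1)
forecast = [train[-1] + drift * step for step in range(1, len(test) + 1)]

# Forecast error as mean absolute percentage error on the held-out months.
forecast_error = sum(abs(a - f) / a for a, f in zip(test, forecast)) / len(test)
print(f"forecast error over {len(test)} test months: {forecast_error:.1%}")
```

Once a univariate baseline like this is in place, adding external variables such as inflation or lockdowns is worth it only if the forecast error on the held-out months actually drops.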
So you are reducing the forecast error once you add more information. You are not just relying on one variable, so your prediction or your recommendation is going to be more accurate and your forecast error is going to be lower. Last thing that I'm sharing: we have a very interesting skills gap between what we have and what we need. This is a completely different world that we have here. We, the nerds, sometimes don't understand the business questions. The business leaders don't understand the analytics needs. So it's like two different worlds that don't talk to each other. There's a book called Purple People that talks about how we should understand and leverage both worlds. Because if not, we are only focusing on one thing, which is coding, and we as data scientists are more than coders. That's why we need to understand what the world is asking for, what the biggest question is that we want to answer with data. There's a very interesting gap in the world: a shortage of 6.1 million people in computer science and math just in the USA, according to Boston Consulting Group. That's why you guys are here. I bet you have already studied at Dojo or you are interested in this. Try to balance the blue personality, which is usually related to data science or the sciences in general, and the red people, who are usually the business people or people who work in HR, psychology and things like that. There's a saying that we all need to be purple: even if we are more blue, we need to have something of red in order to close that gap. Finally, nerd is the new rockstar, right? This is a very interesting moment that we are living in, and data science is saving the world, if you will.
We are having a very interesting moment where technology is democratizing access to basic rights, right? And actually, we have a vaccine now thanks to data science. So thank you very much. You have my information there: my Twitter, my email, my LinkedIn. Find me on LinkedIn; I'm pretty active on social media. And again, we are probably going to have a more techy webinar soon. Do we have questions?

We do. Before we dive into questions, I want to go into our webinar for next week really quick.

Sure. Go ahead.

Yeah. And for those of you on Zoom, I know the slide Sergio was just sharing didn't have his LinkedIn URL. So I copied that and pasted it into the chat for you all, in case you want to ask him questions, connect with him, follow him, or anything like that. So, let's get this screen shared. All right. Hopefully we're looking at the correct screen. It should be an introduction to Power Virtual Agents. So tomorrow, September 15th, at 12:00 p.m. Pacific, the same time as today, we'll have one of our data scientists at Data Science Dojo, Adam Nadine. He'll be giving a quick introduction, and I shouldn't say quick, it's about 60 minutes, but an introduction to Power Virtual Agents, where we're going to familiarize ourselves with no-code chatbot experiences. We're going to create topics, questions, conditions, messages, and links, and test and deploy a real bot across different channels. So if you're interested in something that's hands-on with a no-code approach, be sure to join us tomorrow. The link to the event, I think, was just posted in the chat.

All right, and with that, let's get to some questions. I'm sure somebody is probably asking about the certificate. The link's been posted once.
We'll probably post it a couple more times just to make sure everybody has a chance to grab that. But let's start with Assad. Assad wanted to make sure that we kept this for the end, so thank you, Assad, for mentioning that. So, once you're done with all of the study, studying predictive analytics, maths, data science, where can you find projects to get experience? Would you recommend Kaggle as a starting point? Do you have any tips on building a portfolio to get into the field?

First, I like business problems that can be solved with data, right? Real problems, not just the Titanic data set that we all play with. I think that we are living in a very particular moment in history, with many challenges. We are living through a change of paradigm: people are working from home, going back. We have a completely different world than what we were used to until 2020, right? So that is a very good point to start from. I also like everything that is related to health. There are many breakthroughs appearing from startups, guys like us finding a solution to predict something related to health. There are many data sets out there that are not just what we find on Kaggle, but also what we can find from different hospitals, from the CDC, from different universities, related to health, related to COVID, related to cancer. I love those projects that are for the greater good, right? And also, I'm very committed to smart cities. As I mentioned before, there are many data sets that you can find from any city, you name it, right? I mean big cities, particularly in the US, right?
So you can go to any state, any municipality, any town, and you are going to find data sets on traffic, on health, on hospitals, on different macro variables, as I mentioned before, that you can start playing with. And that is very good. You know, I would love to see projects, if someone approaches me and says, look, I have something in my portfolio that I have done or that I have played with. To be honest, what I look for when I'm hiring is not just the models that people build or the solutions that people build, but the curiosity. If I find people trying to solve big problems, for me that is more than enough to understand that this is someone who has curiosity and someone that I would love to work with. Right?

Okay.

And shoot for the moon.

Yeah. And then, this is from Adnan: will data science make a major turn with quantum computing?

Yes, absolutely. Data science, but also, you know, we are going to have a very interesting impact on blockchain, for example, right? On Bitcoin, on crypto. Today, with crypto, you know that we have a delay, a gap, before a transaction is confirmed. But when we have quantum computers, we are going to need to change the algorithm from SHA-256 to something higher, because we are going to have a big change in everything. It's going to impact everything. I don't see that happening soon, but it's going to happen at some point. I'm going to say probably in 10 years. So yes, short answer: yes.

And then this is a question from one of our live streams, probably on LinkedIn. Do you have a recommendation on which order someone should start the revision of math? Should they start with statistics? Should they start with algebra? Where would you begin?

Start, again.
There are many friendly books if you are not, you know, an engineer or something like that, or if you hate stats and math. Again, if you hate stats and math, you're probably in the wrong field. But let's say that you don't like it: there are many friendly books, as I mentioned before, probably Naked Statistics, or many other books that you can find interesting. Harvard has a course on stats. MIT has a course on algebra. Google it and you are going to find it. I think that we at NYU also have a course on math. Again, I'm talking about free courses, right? Not just paid courses. There are many interesting courses at the big universities in the US that you can take. But short answer: start with stats. Actually, machine learning is not machine learning; machine learning is an evolution of stats. So start with stats, so you are going to understand machine learning, or it's going to be easier, right? And then probably algebra, and then probably calculus.

Perfect. Okay. And then, because of time, let's just do one more question. This is from Jin Woo: "I was wondering what would be a good way for someone with an engineering background, so not computer science, to get a deep learning job. I'm doing a side project using an RNN at my company and planning on pursuing an online degree in data science. Thank you."

I'm going to tell you a secret, my secret sauce. I have a bias for engineers, a positive bias, right?
It is a bias too, because every time that I find an engineer, I keep them, I steal them, because I know that an engineer knows about processes. Again, when we talk about data science, we are talking about three pillars, not just coding. Coding is just the first part, right? You need to understand stats and math, and that is what an engineer does, and you need to understand business problems, right? Because if not, you are just a technician, you are not a data scientist. So an engineer has, I would say, 80% of what data science needs. You have almost all the skills that a data scientist needs, right? So you are on the right path. You understand math, you understand processes, you understand business problems, you are a consultant. So you should go to a food show and you're going to find a job overnight.

Yeah, Jin Woo, I was about to say, shameless self-promotion: come talk to me and we can help with exactly that. And to add on to that, one of my favorite quotes from one of my co-workers here, one of our data scientists: we were talking about how, you know, data scientists don't need to be software engineers. They're not all just these coding experts, right? So a lot of times it's just learning how to Google, right? Asking the right questions. All right. So, thank you so much, Sergio, for being here with us today.

My pleasure. I'm going to shade it.

Original Description

Learn about the foundations of predictive analytics and how to get started with it. The COVID-19 pandemic left a digital, decentralized, and yet hectic and volatile world. Data and analytics are helping companies to thrive and navigate uncertainty but understanding what happened yesterday is not enough. We need to work proactively with predictive and prescriptive analytics to optimize our operations and compete in a changing world.

Table of Contents:
00:00 – Introduction
01:34 – From noise to wisdom
08:15 – Connecting the dots
20:10 – Predictive analytics
47:55 – QnA

Looking for more crash courses? Check out our playlist: https://www.youtube.com/playlist?list=PL8eNk_zTBST9SJS5gUw_HYGSk5MdyC89-

--
At Data Science Dojo, we believe data science is for everyone. Our data science trainings have been attended by more than 10,000 employees from over 2,500 companies globally, including many leaders in tech like Microsoft, Google, and Facebook. For more information please visit: https://hubs.la/Q01Z-13k0

💼 Learn to build LLM-powered apps in just 40 hours with our Large Language Models bootcamp: https://hubs.la/Q01ZZGL-0
💼 Get started in the world of data with our top-rated data science bootcamp: https://hubs.la/Q01ZZDpt0
💼 Master Python for data science, analytics, machine learning, and data engineering: https://hubs.la/Q01ZZD-s0
💼 Explore, analyze, and visualize your data with Power BI desktop: https://hubs.la/Q01ZZF8B0

--
Unleash your data science potential for FREE! Dive into our tutorials, events & courses today!

📚 Learn the essentials of data science and analytics with our data science tutorials: https://hubs.la/Q01ZZJJK0
📚 Stay ahead of the curve with the latest data science content, subscribe to our newsletter now: https://hubs.la/Q01ZZBy10
📚 Connect with other data scientists and AI professionals at our community events: https://hubs.la/Q01ZZLd80
📚 Checkout our free data science courses: https://hubs.la/Q01ZZMcm0
📚 Get your daily dos


This video introduces the concept of predictive analytics, its importance in making proactive decisions, and its applications in various fields, including business, healthcare, and finance. It covers the basics of machine learning, statistics, and data science, and provides examples of tools and techniques used in predictive analytics.

Key Takeaways
  1. Define predictive analytics and its importance
  2. Understand the difference between descriptive and predictive analytics
  3. Learn about machine learning algorithms and statistical models
  4. Apply predictive analytics to real-world problems
  5. Evaluate model performance using metrics such as mean error and mean square error
  6. Use tools such as Excel and Python to implement predictive analytics
  7. Explore applications of predictive analytics in various fields
  8. Discuss the future of predictive analytics and its potential impact on business and society
💡 Predictive analytics is a powerful tool for making proactive decisions and driving business success, and its applications are diverse and rapidly expanding.

Chapters (5)

0:00 Introduction
1:34 From noise to wisdom
8:15 Connecting the dots
20:10 Predictive analytics
47:55 QnA