Take Your Data Science Projects From Good to Great
Key Takeaways
Ken Jee discusses key steps to improve data science projects, including telling a cohesive story, paying attention to detail in data collection, engineering features creatively, and productionizing models.
Full Transcript
Hello everyone, Ken Jee here, back with another video for you. Today I'm talking about how you can take your data science projects from good to great. In my most recent video, I talked about the types of data science projects you can do to effectively learn the field and catch the eyes of companies. I'm expanding on that a little bit and focusing on the areas where you can really improve and get the greatest returns if you put in the effort. These are a couple of the key steps to having a cohesive data science project that will really stand out from those of other applicants, or to whoever you're showing it to. As usual, if you enjoy this video, please hit that like button, and if you want to see other videos similar to this, please subscribe and hit that notifications button to be alerted when I post a new video.

Something I think is really important when you're starting off a data science project is having a clear vision of what the project is and what you want to find out. The ability to tell a story about your data science project is, I think, one of the most important skills you can have, especially if you're explaining it to a potential employer. If you can relate your personal interest in the topic, or relate the company you're applying to, to this specific project, it's always going to come off better. You're going to be more motivated to work on it, and you're also going to be able to state the exact type of value that you set out to create and the exact type of value that you did create.

The next thing that I think can really set a good data science project apart is attention to detail with the data and a focus on data collection. It's always interesting to me when someone writes a web scraper or gets data from a unique source. Let's say I collected my own fitness and eating data and did a project on that. That, to me, is putting in the extra effort and showing that you're really invested in data science and integrating it with your life. I always
look for people who are passionate about the field and willing to take a couple of extra steps related to data collection, not necessarily data science. I think that reflects well on a candidate. It also says something about your personality and your character that you're willing to go an extra step that other people don't. It's great to get data from Kaggle or from some of these other sources, but again, I'm always impressed when someone goes out and gets the data themselves.

The next thing that I like to see in data science projects is creative feature engineering. You have a specific data set, and I like to see you go out and get another data set and append it to that. Let's say I have information on a specific group of people. If I have their zip codes, perhaps I can append some information related to their incomes, and that can add new information to my analysis. Also, again, if I have zip codes and I want to estimate whether they're going to attend a specific school, I can use the zip code to engineer a feature for how far they are from that school and use that as a feature as well. Just like the previous idea, that is going the extra mile, but this one can have perhaps even more of an impact on your model. I like to see how creative people can get here, because this is one of the areas in data science where you can be the most creative and think the most outside of the box. So definitely explore different feature engineering techniques. That goes as far as looking into principal component analysis or using some sort of clustering on your inputs. PCA is great, especially if you have a very large feature set, but it can also help you understand which features are related to each other. So keep those tools in your toolkit and make sure you use them, or at least consider using them, in your projects.

I also really like to see people get creative with the algorithms that they use in their
data science projects. A lot of people just go through the gamut: they try four or five different algorithms and see which one works the best. Well, you can also start looking into ensemble approaches or different ways to tune your models, and that is what it takes to go to the next level, in my opinion. It's not just that a random forest is going to be the best for this, but that a random forest combined with a multiple linear regression might actually generalize better. So it's about layering these models, understanding what the deficits are in specific models, and seeing if a combination of a couple of different ones can improve your accuracy, or whatever variable you're wanting to optimize. So again, I encourage exploring combinations of models, not just models by themselves.

After you have a completed model, it's always nice to see it packaged and usable by anyone at a company or through a website. I like to see it when people productionize their model. That could be as simple as putting your model into a Flask wrapper and making it an API endpoint through your website. Let's say I made a blood pressure predictor, and it took in someone's weight, their height, and a couple of other features. I could put that on my website, and you as a consumer could go in and put in your variables, and it would return your systolic and whatever the other type of blood pressure is [diastolic], as projected by the model. That, to me, is a cool implementation use case. It's something that would probably be used by people on your website, and it's also something that's very common in industry, where you're actually making your model useful to someone. That shows that you can go from square one, collecting the data, all the way through to productionization, and that's a great skill that frankly not all data scientists have, and a very useful one in my opinion.

The next thing that I think is important, getting away from the actual
model building or any of the technical components, is creating a model that is valuable to someone. For example, if I make a trading model and it can actually make me money, that model is inherently valuable to me, and it could be valuable to other people in the generation of wealth. If I create a model that helps high school students choose what college they should go to, that helps other people as well. And that is the goal of data science: to help other people, help a corporation make more money, help you make better decisions, help other people in general. If your model serves to do that and people will actually use it, that is the sign of a really, really good project.

And the sign, again, of a great project is if you can actually get someone else to use this model of yours. I really recommend people go to nonprofits, or their school, or someone who might have a problem that you could help solve, because that adds purpose to your project. If you can create a project that will help one of these organizations, then that is about as good as something can look on your resume.

Well, you heard the sirens. I think that means the police think my video is going too long, so let's end it here. As usual, thank you so much for watching. Please write in the comment section below what you think makes a great data science project. Good luck on your data science journey.
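The zip code idea from the feature engineering part of the transcript can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the video does not say how the distance would be computed, so the haversine (great-circle) formula is used here as one reasonable choice, and the names `ZIP_CENTROIDS`, `SCHOOL_LOCATION`, `haversine_km`, and `add_distance_feature`, along with every zip code and coordinate, are made up for the example. A real project would load zip code centroids from an actual reference data set.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical lookup table: zip code -> (latitude, longitude) of its centroid.
# Both the zip codes and the coordinates are invented for illustration.
ZIP_CENTROIDS = {
    "11111": (40.00, -75.00),
    "22222": (40.50, -74.50),
}
SCHOOL_LOCATION = (40.25, -74.75)  # made-up campus coordinates


def add_distance_feature(rows):
    """Return copies of rows with a distance_to_school_km field (None if the zip is unknown)."""
    out = []
    for row in rows:
        centroid = ZIP_CENTROIDS.get(row["zip"])
        dist = haversine_km(*centroid, *SCHOOL_LOCATION) if centroid else None
        out.append({**row, "distance_to_school_km": dist})
    return out


applicants = [
    {"id": 1, "zip": "11111"},
    {"id": 2, "zip": "22222"},
    {"id": 3, "zip": "99999"},  # not in the lookup, so the feature is None
]
features = add_distance_feature(applicants)
for row in features:
    print(row)
```

Unknown zip codes are left as `None` rather than guessed, so the missing values stay visible and can be handled explicitly before modeling.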
Original Description
In this video, I talk about what I think makes a great data science project. This expands on my past video about the types of data science projects that I recommend that you do.
#DataScience #DataScienceProjects
1) Tell a cohesive story and be able to explain why you are working on a project
2) Put the extra time in to collect your own data
3) Get creative with feature engineering techniques
4) Try ensemble methods with your models
5) Productionize your models
6) Make your models useful to others
7) Get others to use your projects
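As a rough sketch of item 4 above, the random forest plus multiple linear regression combination mentioned in the video could be tried with scikit-learn's `VotingRegressor`, which averages the predictions of its sub-models. The video does not name a library or a blending method, so this is one possible approach rather than the author's own; the synthetic data from `make_regression` stands in for a real project data set, and the variable names are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic regression data, used only so the sketch runs end to end.
X, y = make_regression(n_samples=300, n_features=8, noise=15.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "linear_regression": LinearRegression(),
    # The blend averages the predictions of its own freshly fitted sub-models.
    "blend": VotingRegressor(
        estimators=[
            ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
            ("lr", LinearRegression()),
        ]
    ),
}

# Fit each candidate and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = mean_absolute_error(y_test, model.predict(X_test))

for name, mae in scores.items():
    print(f"{name}: MAE = {mae:.2f}")
```

Whether the blend beats either model on its own depends on the data, which is why all three are scored on the same held-out split before choosing one.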
#KenJee
⭕ Subscribe: https://www.youtube.com/c/kenjee1?sub_confirmation=1
🎙 Listen to My Podcast: https://www.youtube.com/c/KensNearestNeighborsPodcast
🕸 Check out My Website - https://kennethjee.com/
✍️Sign up for My Newsletter - https://www.kennethjee.com/newsletter
📚 Books and Products I use - https://www.amazon.com/shop/kenjee (affiliate link)
Partners & Affiliates
🌟 365 Data Science - Courses (57% Annual Discount): https://365datascience.pxf.io/P0jbBY
🌟 Interview Query - https://www.interviewquery.com/?ref=kenjee
MORE DATA SCIENCE CONTENT HERE:
🐤My Twitter - https://twitter.com/KenJee_DS
👔 LinkedIn - https://www.linkedin.com/in/kenjee/
📈 Kaggle - https://www.kaggle.com/kenjee
📑 Medium Articles - https://medium.com/@kenneth.b.jee
💻 Github - https://github.com/PlayingNumbers
🏀 My Sports Blog - https://www.playingnumbers.com
Check These Videos Out Next!
My Leaderboard Project: https://www.youtube.com/watch?v=myhoWUrSP7o&ab_channel=KenJee
66 Days of Data: https://www.youtube.com/watch?v=qV_AlRwhI3I&ab_channel=KenJee
How I Would Learn Data Science in 2021: https://www.youtube.com/watch?v=41Clrh6nv1s&ab_channel=KenJee
My Playlists
Data Science Beginners: https://www.youtube.com/playlist?list=PL2zq7klxX5ATMsmyRazei7ZXkP1GHt-vs
Project From Scratch: https://www.youtube.com/watch?v=MpF9HENQjDo&list=PL2zq7klxX5ASFejJj80ob9ZAnBHdz5O1t&ab_channel=KenJee
Kaggle Projects: https://www.youtube.com/playlist?list=PL2zq7k