I Wish I Had Known THIS Before Starting in Data Science

Ken Jee · Beginner · 📊 Data Analytics & Business Intelligence · 7y ago

Key Takeaways

Ken Jee discusses his experiences and lessons learned in the data science field, highlighting the importance of data wrangling, effective communication, and understanding business objectives, as well as the role of tools like Python, Docker, and SageMaker in productionizing code.

Full Transcript

Hello everyone, Ken here, back with some more data science content for you. Today I'm talking about the things that I wish I had known before I started down the data science career path. I'm going to talk a bit about the expectations related to the job, as well as some of the tools that would have been useful for me to really get going full steam ahead. As usual, please hit that like button if this content is interesting to you, and please subscribe if you'd like to see more videos like this.

Now, the first thing I'd like to mention is related to data wrangling and data manipulation. That's a much larger part of the job than I expected going in. Most data scientists probably spend more than 50% of their time manipulating data and less than 50% of their time actually running algorithms, doing data exploration, etc. So if you're really good at it, that can pay huge dividends in the long run, and if you're quick and efficient at it, that can also really add to your value.

Another thing that was a little bit different from my expectation going in, at every data science job that I've had, is that I didn't have as much control over the projects I was working on as I was expecting. When you come out of school, your assignments usually focus on projects that you choose. With a lot of data science work, and really work at any company, it's more of a push: you're supposed to work on this because it aligns with business objectives. That's very common across pretty much every type of job. But if you want to be a go-getter, you can also push for projects that are interesting to you. I think it shows really good initiative to leadership that you want to work on specific initiatives, and if you can have success there, you might have more freedom going forward in your project selection.

Now, coming out of school, I really thought I knew everything. I was kind of a smartass. Then I started working, and I learned that basically everyone I worked with was really intelligent. They knew what they were doing, they knew a lot about data science, and they knew about stuff other than data. I think one of the greatest things about working in this field is that you get to meet a ton of really intelligent people and you get to learn from them, and that's a great asset to have. There are people who come from physics backgrounds and people who come from very technical computer science backgrounds, and taking their knowledge and building it into your skill set is how you grow in this career and how you grow as a person.

Now, on the other hand, you do work with people who aren't quite as technical. There are always business stakeholders who don't have a good grasp of data science, and it's an important skill to understand how to communicate with these people: how to take technical requirements, or what's going on inside of an algorithm, and explain them to basically anyone. It's not that business people are less intelligent; it's just that their focus is elsewhere and they haven't been studying this for a long period of time.

With learning from other people comes the feeling that data science is a lot bigger than you thought. When I first started, I kind of thought data science went like this: I have some data, I run it through an algorithm, I productionize it, and I'm done. But data science can be whatever you want it to be. It can involve simulation, you can bring in other computer science topics, you can bring in a bunch of different types of modeling, and it's constantly growing and changing. The techniques that are used on a day-to-day basis can even fluctuate, and there's a wealth of research going on continuously that is improving the field. So it's important to keep growing, to keep learning, and to see what's on the cutting edge. There are also new technologies constantly being released that can make your life easier and can change the way that you think about different problems.

Now, in terms of tools, I think there are a couple of things I really wish I had put more upfront effort into learning. There's the traditional stuff like Python, a bunch of different SQL things, and dealing with databases, relational, etc. But there's another element, productionization of code, that is generally overlooked. People use things like Docker and SageMaker, these are being more and more commonly used in the field, and it's important to have a grasp of them. You can write really good code, but if it's not generalizable, if you're not able to make it an endpoint, it's not super useful in the field. So I would focus on learning the different frameworks that are out there for productionizing code in data science.

While tools are really important, at the end of the day work revolves around other people, and in every company the culture and the way data science is done are very different. The way that projects are scoped, the way that projects are run, and the way that requirements are gathered really vary significantly by company. They vary by the size of the company, whether it's small or large, and by what stage the company is at, whether it's a startup, a high-growth company, or an established corporation. So learning how the companies you're applying to operate is extremely important, and you have to find a culture that really meets your style. Some people work on a bunch of different projects at the same time; some people have a very linear focus in their projects and the things they do on a day-to-day basis. So make sure you do your homework and really research the type of organization that you are applying to and getting into.

To that point, to a certain extent office politics do play a fairly large role in data science. Data science sits in between engineering and software and the business side, so there is a lot more communication and kind of jockeying for time than there is in some other roles. I think that's just an inevitability, but you have to be prepared for it. You're going to be asked to do a bunch of different things, you're going to be pulled different ways by different people, and it's really important to make sure that your manager and the people you're working for have your best interest in mind. I think you can find this out relatively quickly through the interview process, and that's one of the questions that I generally ask when I'm interviewing. I ask what the structure is like, who our main stakeholders are, how my prospective boss interacts with those stakeholders, and how my work gets shown. That's something to keep in mind and focus on, because in some companies office politics can be really bad, and in some they exist but are beneficial for you in your role.

One of the other biggest deviations between data science in education and in practice is the idea of statistical significance. In business, you're just trying to create positive expected value rather than to have certainty that something works. You want to evaluate what the return is compared to what the cost is. If the cost is that things stay exactly the same when your algorithm doesn't have any impact, that's generally fine if you want to try things that can possibly create value. Now, I'm not saying that statistical significance isn't important in some areas. If you're testing a new drug in the pharmaceutical industry, you really want to be sure that people aren't getting sick. But if you're focusing on different customer groups to target, and you can get a 1% better outcome, that might be worth putting into production. People really get wrapped up in this, but we forget that statistical significance nowadays, with the size of data that we have, is relatively arbitrary. We just want to be able to create a positive net effect, and that really goes a long way with the business. Some people get too theoretical, but at the end of the day, sometimes the best we can do is create a 1% lift, and that can create millions, even billions, of dollars of revenue, which is pretty cool.

Thank you so much. If you have any questions relating to any of the topics I talked about, please leave them in the comments section. As usual, have a great one, and thank you for watching.
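
The remark in the transcript that statistical significance depends heavily on the size of the data can be illustrated with a small sketch. It is not from the video: it assumes a pooled two-proportion z-test as the significance test, and the conversion rates and sample sizes are made up for illustration. The function name `two_proportion_p_value` is hypothetical.

```python
import math


def two_proportion_p_value(successes_a, n_a, successes_b, n_b):
    """Two-sided p-value for a pooled two-proportion z-test."""
    rate_a = successes_a / n_a
    rate_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical conversion rates: 10.0% for the control group and 10.1% for
# the new model (a 1% relative lift), at two made-up sample sizes per group.
p_small = two_proportion_p_value(1_000, 10_000, 1_010, 10_000)
p_large = two_proportion_p_value(1_000_000, 10_000_000, 1_010_000, 10_000_000)

print(f"10 thousand users per group: p = {p_small:.3f}")
print(f"10 million users per group:  p = {p_large:.2e}")
```

Under these assumptions the same 1% relative lift is far from significant in the small sample and clearly significant in the large one, which is one way to read the point that the threshold becomes less informative as data grows.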

Original Description

I talk about the things that I wish I had known before I started down the data science career path.

1) Data wrangling is a bigger part of the job than many expect. Learn SQL and be efficient with manipulating data in Python or R.
2) You don't always get to work on projects that are exciting to you. In school you choose many of the data sets and projects that you do. Almost universally, at work, your day to day projects are dictated by your boss.
3) You can learn a tremendous amount from other people. Data Scientists are incredibly smart and come from a range of backgrounds.
4) Data science is constantly growing. With the field being broadly defined, there is almost a limitless amount of things to learn.
5) Office politics do play a role in data science. If you are interested in management after data science, it could be good to know who makes the strategic decisions.
6) Data science is very different at different companies depending on the size and growth phase of the organization.
7) Put some time into learning implementation frameworks (Docker & SageMaker).
8) Statistical significance plays a different role in the business setting. If the expected value is positive, you should at least try it.

#DataScience #KenJee

⭕ Subscribe: https://www.youtube.com/c/kenjee1?sub_confirmation=1
🎙 Listen to My Podcast: https://www.youtube.com/c/KensNearestNeighborsPodcast
🕸 Check out My Website - https://kennethjee.com/
✍️ Sign up for My Newsletter - https://www.kennethjee.com/newsletter
📚 Books and Products I use - https://www.amazon.com/shop/kenjee (affiliate link)

Partners & Affiliates
🌟 365 Data Science - Courses (57% Annual Discount): https://365datascience.pxf.io/P0jbBY
🌟 Interview Query - https://www.interviewquery.com/?ref=kenjee

MORE DATA SCIENCE CONTENT HERE:
🐤 My Twitter - https://twitter.com/KenJee_DS
👔 LinkedIn - https://www.linkedin.com/in/kenjee/
📈 Kaggle - https://www.kaggle.com/kenjee
📑 Medium Articles - https://medium.com/@kenneth.b.jee

Ken Jee shares his experiences and lessons learned in the data science field, emphasizing the importance of data wrangling, communication, and business acumen. He discusses the role of tools like Python, Docker, and SageMaker in productionizing code and highlights the need to prioritize business value over statistical significance.

Key Takeaways
  1. Learn data wrangling and manipulation skills
  2. Develop effective communication skills for non-technical stakeholders
  3. Understand business objectives and priorities
  4. Familiarize yourself with tools like Python, Docker, and SageMaker
  5. Prioritize business value over statistical significance
💡 In data science, business value and return on investment are often more important than statistical significance, and data scientists must be able to communicate effectively with non-technical stakeholders to drive business decisions.
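
The takeaway about prioritizing business value can be made concrete with a small expected-value calculation. This is a minimal sketch rather than anything shown in the video: the function names `expected_net_value` and `revenue_from_lift` are hypothetical, and the revenue, success probability, and cost figures are invented for illustration.

```python
def expected_net_value(p_success, gain_if_success, cost):
    """Expected gain from trying a change, minus the cost of trying it."""
    return p_success * gain_if_success - cost


def revenue_from_lift(baseline_revenue, relative_lift):
    """Extra revenue produced by a relative lift (0.01 means 1%)."""
    return baseline_revenue * relative_lift


# All figures below are made up for illustration.
baseline_revenue = 500_000_000  # hypothetical annual revenue in dollars
gain = revenue_from_lift(baseline_revenue, 0.01)  # a 1% lift
net = expected_net_value(p_success=0.4, gain_if_success=gain, cost=250_000)

print(f"Gain if the model works: ${gain:,.0f}")
print(f"Expected net value: ${net:,.0f}")
print("Worth trying" if net > 0 else "Not worth trying")
```

The design choice here is to compare expected return against cost directly, as the summary describes, instead of waiting for certainty that the change works.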
