Foundations for Data Science & ML - First steps for every beginner!
Key Takeaways
The video covers the foundations for data science and machine learning, including programming, data engineering, machine learning, deep learning, mathematics, and statistics, with a focus on Python programming and essential libraries such as NumPy and Pandas.
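As a rough illustration of the statistics foundation named above, the sketch below uses only Python's standard library to describe a small sample and then query a normal distribution fitted to it. The `scores` list is made-up data, and treating it as normally distributed is an assumption for illustration only; neither comes from the video.

```python
from statistics import NormalDist, mean, median, stdev

# Hypothetical sample, invented for this sketch (not from the video).
scores = [52, 61, 58, 70, 49, 66, 58]

# Descriptive statistics: summarize the centre and spread of the sample.
sample_mean = mean(scores)
sample_median = median(scores)
sample_std = stdev(scores)  # sample standard deviation (n - 1 denominator)

# Probability: assume (for illustration) the data follow a normal
# distribution with the sample's mean and standard deviation.
model = NormalDist(mu=sample_mean, sigma=sample_std)

# CDF: probability that a value drawn from the model is at most 60.
p_at_most_60 = model.cdf(60)

# PDF: density of the model at its own mean (the peak of the bell curve).
density_at_mean = model.pdf(sample_mean)

print("mean:", sample_mean)
print("median:", sample_median)
print("std:", sample_std)
print("P(X <= 60):", p_at_most_60)
print("density at mean:", density_at_mean)
```

The same ideas carry over to NumPy and pandas once the data outgrow plain lists; the standard library is used here only to keep the sketch dependency-free.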
Full Transcript
Hello everyone, I am back after a very long break of three months. What was I doing in these three months? This video is all about that. I was working on a very important course, and it has finally come to completion, or rather, I'm about to complete it in another 10 to 15 days. So what is it about, and why did I start it? Let's find out in this video.

At the beginning of this year I published a roadmap on data science learning, and it was widely accepted. The article was translated into many different languages, and there is a video on my channel as well, which has close to 10,000 views. People thanked me for publishing it and students were asking questions; everything was really good. But a few students pointed out that the article and the video were loaded with resources. There were resources on every topic (programming, data engineering, machine learning, deep learning, mathematics, statistics), but most of those resources were paid. Some of them were really good and free, but one flaw of that article and video is that I didn't talk about the foundations. People asked me which foundations one should work on when aspiring to learn data science or dive deep into machine learning.

So here I am going to talk about the three pillars of data science and machine learning that will give you a very solid foundation for your career as a data scientist, data analyst, machine learning engineer, practitioner, researcher, whatever you want to be.

Let's talk about the first pillar of data science and machine learning, which is programming. Always start off by learning programming first. I personally prefer Python over any other language because of its versatility as well as its ease of learning. You can develop end-to-end projects using Python, so my personal go-to language for beginners would be Python. Now what
all concepts should you master, and how much should you learn? Let's talk about that. If we head down to the curriculum here, you see that in the programming section you should first focus on how to set up your environment, how to work with Jupyter notebooks or Google Colaboratory notebooks, and how to do analysis in them. Then you should start with an introduction to programming: what variables are, what data types are, then strings, Python lists, control flow, loops, how to iterate over different data structures, dictionaries, iterating over a dictionary, list comprehensions, sets, tuples, and functions. Then you move on to object-oriented programming, learn about classes and objects, and further on move to Python scripts. By now you would be very comfortable with Jupyter notebooks, but there are also Python scripts, modules, and libraries that are written in VS Code or any other text editor. Learn how to work with external libraries, how to work with files, how to read files, how to write to files, and best practices. Lastly, you should be able to extract or collect data from different APIs or databases.

In the NumPy module you see two lectures as of now, but there are 15 to 18 more lectures that I'm working on, and they will be published by the end of this month for both the NumPy module and the pandas module. Here you should be able to handle multi-dimensional arrays, indexing, slicing, transposing, broadcasting, creating pseudo-random numbers, and performing vectorized operations for scientific computing.

Then for pandas, you should be able to manipulate data. You should know how to create Series and DataFrames, indexing in a DataFrame, comparisons, boolean indexing, merging DataFrames, mapping and applying functions, and then data cleaning and wrangling as well. By the end of the pandas module you would have a really good understanding of how to analyze and crunch data.

Lastly, data visualization. Now visualization
plays a crucial role as well. You should know the Matplotlib API hierarchy, you should know how to add styles, colors, and markers to a plot, and you should have a very good understanding of which kinds of plots are used in which scenarios: line plots, bar plots, scatter plots, histograms, box plots. You should be familiar with all of these concepts when it comes to programming for data science.

The second pillar of data science and machine learning is mathematics. This is again a very controversial topic. Some people do not want to learn mathematics; they say that they can do just fine without it. I personally do not agree. I think you should be familiar with the essential mathematics in order to understand how an algorithm works. If something very custom or very specific, a niche problem, comes your way, you won't be able to handle it if you do not understand how those algorithms work, and the back end of all of those algorithms is computational. You must have seen that job descriptions for data scientists, machine learning engineers, and analysts require you to come from a computational background. They want people who have done an MS or PhD in physics or mathematics. These people are really good at mathematics. If you think you can build on top of a very brief or superficial understanding of very simple mathematics, then I don't think you will have a long-lasting career, in this particular domain at least.

I'm not saying you have to be a gold medalist or anything. Don't go too deep into it. Just learn enough linear algebra, enough basic algebra, enough calculus, and some of the important functions that we use in common mathematics, some high school mathematics, and so on, and you can then build on top of that to understand what an algorithm does. After learning these topics you would have a really good base to understand all of those really heavy machine
learning algorithms or deep learning algorithms. You would understand how backpropagation works, how the chain rule supports backpropagation, and how partial derivatives help you compute those gradients. Those are very important topics that one must pay attention to.

Now comes the third and last pillar of data science and machine learning, specifically data science I would say, which is statistics. Every organization wants to be data-driven. They want people who can drive decision making, data scientists who can crunch numbers and help them make decisions that help the organization grow, and statistics plays a crucial role across every stage of this process. Data scientists are required to explain and describe data. They need to look at how the data is distributed, design experiments, quantify risk, quantify uncertainty, and understand how metrics work. So statistics, I would say, is a must. Every data science interview is going to grill you on statistics, so it is absolutely essential, and this course covers the essential topics that one must start with. You can always keep building on top of it.

There are two branches that I teach: one is descriptive statistics and the second is inferential statistics. Inferential statistics covers hypothesis testing, different measurements, significance testing, and other types of tests. The third branch of statistics that's very important is probability. Probability helps you quantify uncertainty and risk, and all organizations and businesses want to know how much risk is involved in a particular decision. That's what you are able to do once you understand the importance of these topics: conditional probability, probability distributions, PDF, CDF, PMF, probability mass
functions. All of these things are very important in order to have a long-lasting career and a really good foundation.

I feel that there is a need for a very compact course that covers the first steps for learning data science or machine learning and develops the foundation that is required. If you take the example of Google's machine learning course, here are the prerequisites and pre-work required before you start learning machine learning; this is what Google has recommended. Take the example of Andrew Ng's very famous course on machine learning, again a very good and free resource, now on YouTube. It requires mathematics as well: you should be familiar with partial derivatives, linear algebra, basic algebra, important functions, all of those things. But there aren't enough resources that give you a compact course that goes just deep enough to cover all of those topics and tells you how they relate to artificial intelligence or data science.

This is why I have started this academy called Wiplane. My aim here is to help you master data science or AI. It might take time, it is a slow process, I know, but you have to pick a domain, pick a field, and then dive deep into it after building a very solid foundation. When students reached out to me after going through my article on the data science learning roadmap, they asked me what the first steps would be, what foundational concepts they could start with. To be honest, I was unable to find any compact yet affordable course that covers all of these topics in the right amount of depth.

So here I present to you wiplane.com, which is basically WIP lane. That's the lane you want to toggle yourself into: work in progress. The Wiplane Academy is all about mastering data science
and ai and i would be publishing courses you know on a regular basis on this platform people can enroll there would be a community there would be discord channels coming up really soon now the important thing is i will not just keep publishing courses on this platform i will be updating these courses every month based on the inputs of the students and here i present you the first course which is foundations for data science and machine learning now these basically cover the essentials of programming mathematics and statistics all of these concepts all of these topics that i have just talked about is covered in this course and after completing this course you would be able to start doing projects on data analysis data science you will need to learn a little bit more about machine learning algorithms and basically i would say first do data analysis projects learn how to crunch data and then move on to machine learning side of things so data analysis always comes first but you would be in a very good position to actually understand these concepts really quickly now the course not only covers the essential programming or you know prerequisites or pre-work that is required for data science or machine learning you actually cover every topic computationally as well as programmatically so we learn how to program or code those concepts as well be it any topic from mathematics or statistics you would be coding a lot in this course you would be working on assignments you would be working on some projects some exercises to get comfortable with each of those topics now i am currently in the process of finalizing this entire course there are a few videos that are left for the numpy module and pandas module and matplotlib but mathematics calculus linear algebra descriptive statistics programming all of those are actually complete and i've marked some of the videos as free so you can actually preview all of those videos and find out whether you whether the course actually meets 
your expectations or not.

I'm releasing this course for pre-sale now, and it will be completed by the first week of September. Right now the course is priced at a very affordable 35 US dollars, or if you are Indian, 2,500 rupees, and after the 30th of August it will be 50. I personally feel that, for the amount of content that has been put into it as well as the cost incurred in setting up this whole platform, it's something that I have to charge. One more thing that I personally feel is that if you do not pay for something, you're not that sincere or serious about learning it, so the price does that job as well.

So there's a whole lot out there to help you build a solid foundation for data science and machine learning. Now it's on you whether you want it or not. I'll catch you guys in the next one.
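The transcript says that the chain rule supports backpropagation and that partial derivatives give you the gradients. A minimal sketch of that idea follows, assuming a hypothetical one-parameter model `prediction = w * x` with a squared-error loss; the model, the function names, and the numbers are illustrative assumptions, not taken from the course. The chain-rule gradient is checked against a numerical estimate.

```python
# Hypothetical one-parameter model (an assumption for this sketch):
#   prediction = w * x
#   loss       = (prediction - y) ** 2

def loss(w, x, y):
    """Squared error of the model's prediction for one data point."""
    prediction = w * x
    return (prediction - y) ** 2

def grad_chain_rule(w, x, y):
    """dloss/dw via the chain rule:
    dloss/dw = dloss/dprediction * dprediction/dw."""
    prediction = w * x
    dloss_dprediction = 2 * (prediction - y)  # derivative of the square
    dprediction_dw = x                        # derivative of w * x w.r.t. w
    return dloss_dprediction * dprediction_dw

def grad_numeric(w, x, y, eps=1e-6):
    """Central-difference estimate of dloss/dw, used only as a check."""
    return (loss(w + eps, x, y) - loss(w - eps, x, y)) / (2 * eps)

w, x, y = 0.5, 3.0, 6.0
analytic = grad_chain_rule(w, x, y)
numeric = grad_numeric(w, x, y)

print("chain-rule gradient:", analytic)
print("numerical gradient: ", numeric)
```

Backpropagation in a real network applies this same multiplication of local derivatives layer by layer; the single weight here just keeps the arithmetic visible.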
Original Description
Check out the course: https://www.wiplane.com/p/foundations-for-data-science-ml
You can follow me on:
Newsletter: https://dswharshit.substack.com/
LinkedIn: https://www.linkedin.com/in/tyagiharshit/
Medium: https://dswharshit.medium.com/