How Machine Learning uses Linear Algebra to solve data problems
Key Takeaways
The video explains why linear algebra matters in machine learning, covering data representation, vector embeddings, and dimensionality reduction, and notes that NumPy takes care of the linear algebra operations in code.
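To make the data representation and dimensionality reduction takeaways concrete, here is a minimal NumPy sketch. It stores a few feature vectors (height, weight, age, the three dimensions named in the transcript) as rows of a matrix and projects them onto the top eigenvectors of the covariance matrix, which is the core of principal component analysis. The numbers and the helper name `pca_project` are made up for illustration; this is a sketch under those assumptions, not code from the video or the course.

```python
import numpy as np

# Hypothetical data: each row is one feature vector (height cm, weight kg, age years).
X = np.array([
    [170.0, 65.0, 30.0],
    [160.0, 55.0, 25.0],
    [180.0, 80.0, 40.0],
    [175.0, 72.0, 35.0],
    [165.0, 60.0, 28.0],
])


def pca_project(data, n_components):
    """Project rows of `data` onto the top principal directions (illustrative helper)."""
    # Center each feature so the covariance describes spread around the mean.
    centered = data - data.mean(axis=0)
    # Covariance between features (columns); shape is (n_features, n_features).
    cov = np.cov(centered, rowvar=False)
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # Reorder so the direction with the largest variance comes first.
    order = np.argsort(eigenvalues)[::-1]
    components = eigenvectors[:, order[:n_components]]
    # Fraction of the total variance captured by each kept direction.
    explained = eigenvalues[order[:n_components]] / eigenvalues.sum()
    return centered @ components, components, explained


projected, components, explained = pca_project(X, n_components=2)
print(X.shape, projected.shape)  # (5, 3) (5, 2)
print(explained)                 # share of variance kept per component
```

Each person is now described by two numbers instead of three, and `explained` indicates how much of the original spread those two directions retain.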
Full Transcript
Hello everyone, welcome to the channel. As you all know, there are so many conjectures around learning mathematics for data science or machine learning, and this is one of those videos that will actually help you understand why we say mathematics is crucial to understanding how the algorithms work, how machine learning models train, and how data is represented in the whole machine learning space. In this lecture I am talking about linear algebra: how it helps us solve different sorts of problems in data representation, embeddings in deep learning, and, you know, the operations that are performed in neural networks. It's very important to understand why we say mathematics is important. We don't say that you need to be an expert mathematician or statistician to be a really good ML engineer, but it definitely helps. How much is enough, and what you should know, is covered in this lecture from the course that I have just created for the Wiplane Academy. You can definitely go check it out; more on the course towards the end of this video. So find out why linear algebra is important for machine learning and data science, and I'll catch you towards the end of this video.

Machines, or your computers, only understand numbers, and these numbers need to be represented and processed in a way that enables the machines to solve problems by learning from data instead of following predefined instructions, as in the case of programming. Linear algebra is the mathematical foundation that supports almost everything we do in machine learning and data science to solve these problems. In the data science context, a large part of the work has linear algebra running behind the scenes.

The main areas that are enabled by linear algebra are data representation, word embeddings (or, you can say, vector embeddings), and dimensionality reduction. For data representation, you can represent your data using vectors, matrices, and tensors. In word
embeddings, or vector embeddings, it is just about replacing a high-dimensional vector with a lower-dimensional one. For instance, in NLP we deal with a lot of textual data. We deal with so many words, and each word carries a different meaning, which might be similar to that of another word; vector embeddings allow us to represent these words more efficiently. And finally, dimensionality reduction: concepts like eigenvectors allow us to reduce the number of features, or dimensions, of the data while keeping the essence of all of them, using something called principal component analysis. All of it is driven by linear algebra, and we'll talk more about it in the course.

Now, linear algebra basically deals with vectors and operations on vectors. In NumPy a vector might be just a one-dimensional array of numbers, but geometrically it has both a magnitude and a direction. Our data can be represented using vectors. For example, here one row in this data is represented by this feature vector, which has three elements, or components, representing three different dimensions. A vector with n entries belongs to an n-dimensional vector space, and in this case we have three dimensions: height, weight, and age.

Linear algebra can be seen in action across all the major applications today, be it sentiment analysis on a LinkedIn post or a Twitter post, detecting a type of lung infection from an X-ray image, or any speech-to-text bot. All of these data types are represented by numbers and tensors, and we run vectorized operations to learn patterns from them using neural networks, which then output a processed tensor from which the final meaningful output can be deduced. Here you can see a lung infection X-ray image, which is first converted into a tensor and then fed to a neural network. After learning the patterns from these numbers, the neural network outputs a learned tensor, and we finally get the prediction from that output tensor. When
it comes to embeddings, you can think of an n-dimensional vector being replaced with another vector that belongs to a lower-dimensional space, one that is more meaningful and overcomes computational complexity. For example, here is a three-dimensional vector that is replaced by a two-dimensional vector. In a real-world scenario, you can think of a space with a very large number of dimensions being converted into a lower-dimensional space. Finally, you can think of embedding as a 2D plane being embedded into a 3D space, and that's where the term embedding comes from. You can think of the ground you're standing on as a 2D plane, which is embedded into the space in which we live, the 3D space.

Now, to give you a real-world use case that ties together this discussion on vector embeddings: all applications that give you personalized recommendations use vector embeddings in some form. For example, here is a graphic from Google's course on recommendation systems, where we are given data on different users and their preferred movies. Some users are kids and others are adults. Some movies are all-time classics, while others are more artistic. Some movies are targeted towards a younger audience, kids, while movies like Memento are preferred by adults. We not only need to represent this information in numbers but also need to find a lower-dimensional vector representation that captures all of these features.

A very quick way to understand how we can pull off this task is by understanding something called matrix factorization, which allows us to break a large matrix down into smaller matrices. Ignore the numbers and the colors for now, and just try to understand how we have broken down one big matrix into two smaller ones. For example, here we have a 4 cross 5 matrix, with four rows and five features, which is broken down into two matrices, one of shape 4 cross 2 and the other of shape 2 cross 5. We basically
have new, lower-dimensional vectors for users and for movies: the horizontal ones for the movies and the vertical ones for the users. This allows us to plot them in a 2D vector space, and here you will see that user 1 and the movie Harry Potter are closer, whereas user 3 and the movie Shrek are closer. You can relate this to the fact that Shrek is preferred more by kids; that's why they are closer. But the dot product of vectors tells us more about the similarity of two vectors, which we'll dive into in the course.

With dimensionality reduction, our goal is to narrow down our search and analysis to a lower number of dimensions and features. Our data points are often clustered along a line or a lower-dimensional space, and we are basically looking for the principal direction that explains most of our data. This is done using the concept of eigenvectors in linear algebra, and the technique is called principal component analysis, which is widely used in unsupervised learning.

Linear algebra drives a host of areas. To name a few, it is used in statistics, chemical physics, genomics, word embeddings, neural networks, deep learning, robotics, image processing, and quantum physics. You just name it.

Now the question is: how is all of this possible in programming, and how can we learn to program these concepts of linear algebra? The answer is that we don't have to reinvent the wheel. NumPy gives us access to all the underlying concepts of linear algebra. We just need to understand the basics, the fundamentals, and the programming part is taken care of by NumPy. It is fast, as it runs on compiled C code, and it has a large number of mathematical and scientific functions that we can use. So I think that's enough context and motivation for learning linear algebra. Let's get down to it.

So I hope you have gotten some motivation, some understanding of how linear algebra is used in machine learning and, you know, everything that
happens with model training, data representation, embeddings, or dimensionality reduction, for that matter. You can find different sorts of resources on the internet. I'm not saying my course is the only thing out there. If you are comfortable with mathematical notation, the Deep Learning book by Ian Goodfellow and Yoshua Bengio is something I would recommend. If you do not want to dive in too deep and you don't want something that is very notation-heavy, then you can definitely check out my course. I've covered it in very simple terms, so even if you think you are not very good at mathematics, you can easily follow the course. It's enough for you to get started with machine learning or data science. It covers statistics, math, calculus, linear algebra, and basic programming as well, so it's a whole package. If you want a specific part of it, do let me know.

You can always reach out to me at the email that's provided in the description. The course link is provided in the description as well. Feel free to ask any questions related to the course. I have provided the coupon code as well, which you can use to get five dollars or 5.5 or, I think, 10 percent off the launch price that I have set. I am waiting for you to enroll, and let me know in case you have any other queries apart from the course as well; I can help you sort out your questions. So until next time, this is Harshit Tyagi signing off.
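The matrix factorization step described in the transcript (a 4 cross 5 matrix broken into a 4 cross 2 and a 2 cross 5 matrix) can be sketched in NumPy as follows. The transcript does not say how the factors in the graphic were computed, so this sketch swaps in a truncated singular value decomposition as one simple way to get factors of those shapes. The feedback matrix, the helper name `factorize`, and the choice of SVD are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical 4 x 5 feedback matrix: rows are users, columns are movies,
# and 1.0 means the user watched or liked that movie. Values are made up.
feedback = np.array([
    [1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 1.0, 1.0],
    [0.0, 1.0, 1.0, 0.0, 1.0],
])


def factorize(matrix, k):
    """Split `matrix` into (rows x k) and (k x cols) factors via truncated SVD."""
    # Reduced SVD: matrix == U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    # Share the top-k singular values evenly between the two factors.
    root = np.sqrt(s[:k])
    user_vectors = U[:, :k] * root             # shape (n_users, k)
    movie_vectors = root[:, None] * Vt[:k, :]  # shape (k, n_movies)
    return user_vectors, movie_vectors


users, movies = factorize(feedback, k=2)
print(users.shape, movies.shape)  # (4, 2) (2, 5)

# The product of the two small factors approximates the original matrix.
approx = users @ movies

# The dot product of a user vector and a movie vector is that pair's score;
# a larger score suggests a stronger match between the user and the movie.
score = users[0] @ movies[:, 0]
print(score, approx[0, 0])
```

Every user and every movie now has a two-dimensional vector, so they can be plotted in the same 2D space, and the dot product between a user vector and a movie vector recovers the corresponding entry of the approximated matrix.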
Original Description
Use code STUDENT10 to get $10 off!
Course: https://www.wiplane.com/p/foundations-for-data-science-ml
In case of any queries, reach out at harshit@wiplane.com
You can connect with me via:
Newsletter: https://dswharshit.substack.com/
LinkedIn: https://www.linkedin.com/in/tyagiharshit/
Medium: https://dswharshit.medium.com/
Twitter: https://twitter.com/dswharshit