Features and Feature Engineering in Machine Learning - An Introduction

Imaad Mohamed Khan · Beginner · ML Fundamentals · 4y ago

Key Takeaways

This video introduces features and feature engineering in machine learning, explaining their roles in the development of a machine learning system and how they differ from data.

Full Transcript

Hi everyone, welcome back to my YouTube channel. My name is Imaad, and today we will be talking about features and feature engineering. This is an introductory, very basic video. If you're just getting started in the field of machine learning and you want to know what features are and what feature engineering means, this is the video for you. We will look at how you define features and what the process of feature engineering means in the context of building a machine learning model or a machine learning system. So let's get started.

To start off, I will present a few statements that you might have heard while reading blogs online or talking to people who are actively building models. Statements like "What features did you use while building your model?", or "The features you used did not have a lot of predictive power", or "These are very few features, you need to add more", or "These are a lot of features, you need to reduce them". There's a lot of talk about features, especially while building a machine learning model: people ask what kind of features you used, how effective they are, and how predictive they are. If you are wondering what any of this means, you've come to the right video to understand what a feature is and what feature engineering means in the context of building a machine learning system.

So what is a feature? Before we look at that, I think it's important to understand that the entire process of building a machine learning model depends, at its core, on something we call data. And data, at its core, is an observation of the real world around us. Whatever you see around you, whether people, machines, systems, or logs, is in a way what we call data. So data is how you observe the world around you and then store that in a format that can be reused and re-read.

A feature is essentially a numeric representation of the raw data that you observe around you. It's a way of representing these observations that is useful for performing further tasks, and that is it. Features are derived from data. You store your data, and features are a way of transforming or changing that data into something that is numeric and useful for further tasks.

I have an example here that will hopefully help you understand what features are in the context of building machine learning systems. You have a real-world activity: perhaps a doctor or a nurse is taking care of a patient and recording different vital information about them. That information is stored in a numeric format that can be used for further analysis or prediction. On the left of the slide is the real-world activity of a doctor monitoring a patient, and on the right you see different columns, or different features. When I say features, each of these individual columns, like patient gender, oxygen level, and pulse rate, can be considered an individual feature, and each row is a different patient. What's also interesting to note is that all these values are numeric. The numeric representation of raw data is called features, as we saw in the definition earlier. So we have a real-world activity, we store the data, and then we have features. That is all there is to features: a feature is essentially data stored in a numeric format.

What is feature engineering? As you have probably guessed, engineering features, or creating features, is what is called feature engineering. To put it into context, I would like to take you through a diagram that shows where feature engineering fits while building a machine learning system. This is a very popular diagram on the internet. You have different data sources: source one, source two, and so on up to source n. These data sources are in different formats, with different data types and different amounts of information. All of this gets stored as raw data when you're trying to model a certain process. That raw data is then transformed into features, these features are fed into your modeling process, and then you get your insights or your predictions. That is the entire end-to-end pipeline, and feature engineering sits right between your raw data and the features. So essentially, the process of converting your raw data into features is called feature engineering.

You do that in a multitude of ways. Here the diagram says "clean and transform", but there are a lot of different things you do to create features that are useful for the task you want to perform. When you're modeling, you're trying to perform a certain task; you're not modeling without a goal in mind. Modeling is done with respect to that task, so feature engineering also depends on the task at hand. It also depends on the model you are going to choose and on the data you have. These three things are what shape your feature engineering. So feature engineering is not an independent activity, as some people might think. It depends on the data you have, the modeling task you have, and the models you want to subsequently use. To restate, feature engineering is the process of converting raw data to features.

Now let's return to the example I was talking about. Earlier we saw this example, but I intentionally omitted one part of the process because I was trying to show you features. In the real world, you might not go directly from a real-world activity to features. More often than not, you store the real-world activity as data and then go through an entire feature engineering process to convert that data to features. In this case, the example is about something the doctor or the nurse is doing, and of course they will not write patient gender as one or zero, because that doesn't make sense to them. They would represent it as M or F, male or female. But for the purpose of our modeling, we need to convert all character and string data into numeric features, and that is where feature engineering comes into play. This is one very basic kind of feature engineering, where you encode the different categories you see in patient gender to a representative numeric value. For patient gender, M is represented as 1 and F is represented as 0. That is what you do in feature engineering: you convert your raw data into numeric values.

Just to summarize: the conversion of raw data into numeric values is feature engineering. Feature engineering depends on the data you use, the task you have at hand, and the models you eventually select for the task. So that is what I wanted to talk about: an introduction to what features are, the process of feature engineering and where it fits in the entire pipeline, how data can be converted from its raw form into a form that is usable by a machine learning model, and what feature engineering depends on.

If you liked this video, please don't forget to like, share, and subscribe to my channel, because I will be putting out more such videos in the upcoming few weeks and months. Until next time, see you around.
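The encoding step described in the transcript (M becomes 1, F becomes 0) can be illustrated with a minimal Python sketch. This is not code from the video. The record layout, the field names (`patient_gender`, `oxygen_level`, `pulse_rate`), the sample values, and the helper `to_features` are all hypothetical, and the sketch assumes the raw vitals were recorded as strings.

```python
# Hypothetical raw records, as a clinician might write them down.
# Field names and values are illustrative, not taken from the video.
raw_records = [
    {"patient_gender": "M", "oxygen_level": "98", "pulse_rate": "72"},
    {"patient_gender": "F", "oxygen_level": "95", "pulse_rate": "80"},
]

# Mapping stated in the transcript: M -> 1, F -> 0.
GENDER_CODES = {"M": 1, "F": 0}


def to_features(record):
    """Convert one raw record into a list of numeric features."""
    return [
        GENDER_CODES[record["patient_gender"]],  # category -> number
        float(record["oxygen_level"]),           # string -> number
        float(record["pulse_rate"]),             # string -> number
    ]


# One row per patient, one column per feature.
features = [to_features(r) for r in raw_records]
print(features)
```

Each row of `features` corresponds to one patient and each position to one column of the table described in the transcript. A plain dictionary lookup is used here only to keep the sketch dependency-free; which encoding is appropriate in practice depends on the data, the task, and the model, as the video notes.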

Original Description

In this video, I will take you through an introduction to what features mean and where Feature Engineering takes place in the process of the development of a Machine Learning system. I explain, with an example, how features differ from data and how Feature Engineering is dependent on multiple different factors. If you enjoyed watching this video, then please don't forget to like, share and subscribe to the channel!

Playlist

Uploads from Imaad Mohamed Khan · Imaad Mohamed Khan · 31 of 34

1 Does AI know Fashion? - Mitali Sodhi - Mantissa Data Science Meetups
2 Mantissa Data Science Webinar - 1 with Santhosh Shetty
3 Recommender Systems - Imaad Mohamed Khan - Mantissa Data Science Meetups
4 Data Science is more than just Data Scientist - Different Roles in the field of Data Science
5 What topics to prepare for Data Science Interviews in 2020?
6 Programming as a human activity
7 What are the languages or tools used by Data Scientists in their work?
8 Linear Regression From Scratch - Part 1
9 Linear Regression From Scratch - Part 2
10 Linear Regression From Scratch - Part 3
11 Journey into Data Science - Fireside chat with Adarsha and Karthikeyan
12 Off the ground - Python in 5 Steps
13 How LinkedIn uses Data Science to build your feed - LinkedIn Feed Algorithm Explained
14 Fireside chat with Eric Weber - Learnings in Data Science
15 Part 2 - How LinkedIn uses Data Science to build your feed | LinkedIn Feed Algorithm Explained
16 Using Streamlit's Share Feature to easily deploy (and share) videos using Github
17 Airbnb Experiences Ranking Algorithm Explained - Part I
18 Airbnb Experiences Ranking Algorithm Explained - Part II
19 Airbnb Experiences Ranking Algorithm Explained - Part III
20 Big Data, Hadoop and Machine Learning Explained using Dams
21 Fireside Chat with Hiromu Hota - Transitioning from Research to Industry
22 Introduction to Anomaly Detection and One Class Classification
23 Reading and manipulating Google Sheets (GSheets) using Python libraries
24 Writing to Google Sheets (GSheets) using Python libraries
25 Fireside Chat with Mirza Rahim Baig - Business Problem Solving and Data Science Career Tips
26 Six types of Data Analysis you will do as a Data Scientist
27 Automatic Speech Recognition (ASR) with Facebook AI's wav2vec 2.0 model using Huggingface
28 9 Anti-patterns to avoid MLOps mistakes
29 8 pitfalls to avoid while using Machine Learning Interpretation Techniques (SHAP, PDP, LIME, PFI)
30 Fireside Chat with Shadab Khan - AI in Healthcare and Data Science Career Tips
31 Features and Feature Engineering in Machine Learning - An Introduction
32 Building your own AI text generation tool with aitextgen using GPT-2/GPT-3
33 Organising Data Science projects using CRISP-DM
34 Introduction to Prompt Engineering

This video teaches the basics of features and feature engineering in machine learning, including how they fit into the machine learning development process and how they differ from data. It provides an introduction to the importance of feature engineering in building effective machine learning models.

Key Takeaways
  1. Define features and their role in machine learning
  2. Understand the difference between features and data
  3. Learn how feature engineering fits into the machine learning development process
  4. Identify factors that influence feature engineering
  5. Develop a basic understanding of feature extraction and data preprocessing
💡 Feature engineering is a critical step in the machine learning development process, and it depends on multiple factors, including the type of data and the goals of the project.
