Data frames in R - Transforming data PART I
Key Takeaways
The video covers data transformation in R using the dplyr package, specifically the filter, select, and mutate functions.
Full Transcript
Hello and welcome back to 365 Data Science. In the next few lessons we will dive deep into the Star Wars data and learn how to transform data sets in various creative and not-so-creative ways. Let's get to it.
This is the first real lesson in which we will use the dplyr package. For the distracted souls out there, dplyr is part of the tidyverse, and we got it when we installed the tidyverse ecosystem of packages. It specializes in data manipulation tools that deal with filtering, mutating, and summarizing data.
First things first: let's fire up the starwars data frame that comes with dplyr. This time I will save it as star. Notice that the data are saved as a tibble instead of a base R data frame. Let's keep it this way and use some of the tibble properties. Tibbles come in handy here because this is a relatively big data set, and we don't want to see the entire thing every time we do an operation and print to see our results. Tibbles limit the printing to just a few rows.
Okay. Although we've already looked at it before, if you want to see the data in all of its glory, run View(star). This will open the viewer, and you can scroll through the values to your heart's content.
Right, transforming data. The filter() function does what we think it does: it subsets data according to a set of criteria. It looks like this: we pass the data and then the expression according to which we want our data filtered. There can be more than one criterion, of course. For instance, I can select all the droids in the data frame, and now I can call only the ones from Tatooine. Right, yes, that makes sense: it was young Anakin Skywalker who rebuilt C-3PO while still on Tatooine. And R5-D4? Well, I'm not sure I know anything about that little R5 unit.
Okay. filter() also works with logical operators, so, for example, I can call every character that has red, orange, or yellow as an eye color. Okay, the majority of these aren't human. Hmm, I wonder if there are any more humans with weird eyes apart from Darth Vader and Palpatine. No. Yikes.
Alright, next we have the select() function. Now, our data set may not have hundreds of variables, but looking at the column names, it does feel like I genuinely don't need to know about some of these things. To narrow down the data to the information I want, I can use select(). This selects specific individual columns by name. If I want to select a column and then everything between two other columns, I can do this. Isn't this already a lot easier to do than with the base R functions we learned earlier? Hmm, it is. But check this out too: select() works nicely with a couple of nifty functions like starts_with() or ends_with(), which let us subset data in a super intuitive way. So if I wanted to get all the columns that have to do with coloration, I can run this.
Okay, new scenario. There are a bunch of interesting variables you want to look at, but you also don't want to ignore the rest of the data. What do you do? Well, you can use the everything() function with select() to move the variables you want to the beginning of the tibble and then show everything else, like this. Sweet, right?
Finally, let's look at the mutate() function. mutate() is dplyr's easy way of creating new variables from variables that already exist in the data set. For example, I can calculate the BMI for our characters, because the starwars data has recorded both height and mass information. Of course, this is largely uninformative because the BMI scale is extremely human-centered, but, you know, anything to get the point across.
Now, if mutate() is the function to use when you want to add a column to your data while also retaining all the other columns in your data frame, then transmute() is what you will opt for if you only want to keep the variable you create. Let me show you what I mean. See? Effectively, transmute() created my new variable and allowed me to extract it without tagging everything else along as well. Fantastic.
Okay, I will end this lesson here, because otherwise I'm at risk of going into way too much detail about side comments I make. So thanks for watching, everyone, and in the next lesson we will pick up right where we left off. See you there. For more videos like this one, please subscribe.
Original Description
How to filter, select, and mutate a data frame in R using the dplyr package.
The filter() function does what we think it does: it subsets a data frame according to a set of criteria. It works like this: we pass the data, and then the expression according to which we want our data filtered. There can be more than one criterion, of course. filter() also works with logical operators.
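As a minimal sketch of the filtering steps the lesson walks through, the snippet below uses the starwars tibble that ships with dplyr. The object names (star, droids, tatooine_droids, warm_eyes, warm_eyed_humans) are illustrative assumptions, and the exact calls typed in the video are not shown in the transcript, so treat these as one plausible way to write them.

```r
library(dplyr)

# starwars ships with dplyr and is already a tibble
star <- starwars

# One criterion: keep only the droids
droids <- filter(star, species == "Droid")

# Several criteria separated by commas are combined with AND
tatooine_droids <- filter(star, species == "Droid", homeworld == "Tatooine")

# Logical operators work too; %in% matches any of several values
warm_eyes <- filter(star, eye_color %in% c("red", "orange", "yellow"))

# Narrow the previous result down to humans only
warm_eyed_humans <- filter(warm_eyes, species == "Human")

print(tatooine_droids)
print(warm_eyed_humans)
```

Note that filter() drops rows where the condition evaluates to NA, so characters with a missing species or homeworld do not appear in these results.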
The select() function narrows down the data frame to the columns you specifically want and need to see. select() works nicely with a couple of nifty helper functions like starts_with() or ends_with(), which let us subset data in a super intuitive way.
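The column selections described in the lesson could look roughly like the sketch below. The particular columns chosen (name, height, mass, species, homeworld and so on) are assumptions picked for illustration; only the "color" example is implied directly by the narration.

```r
library(dplyr)

star <- starwars

# Pick individual columns by name
basics <- select(star, name, height, mass)

# Pick one column plus every column between two others (inclusive range)
name_and_range <- select(star, name, hair_color:eye_color)

# Helper functions match columns by their names
colours   <- select(star, ends_with("color"))
h_columns <- select(star, starts_with("h"))

# Move the columns of interest to the front, then keep everything else
reordered <- select(star, name, species, homeworld, everything())

print(names(colours))
print(names(reordered))
```

The everything() call does not duplicate columns that were already named, so the reordered tibble has the same number of columns as the original.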
The mutate() function is dplyr's easy way of creating new variables from variables that already exist in the data frame. For example, if you have height and mass information, you can create a BMI variable.
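A possible version of the BMI calculation is sketched below. It assumes the standard formula (mass in kilograms divided by height in metres squared) and relies on starwars recording height in centimetres and mass in kilograms; the column name bmi and the object name star_bmi are illustrative choices, not taken from the video.

```r
library(dplyr)

star <- starwars

# Add a bmi column while keeping every existing column.
# height is in centimetres, so divide by 100 to get metres (assumed formula).
star_bmi <- mutate(star, bmi = mass / (height / 100)^2)

# Show only a few columns so the new variable is easy to spot
print(select(star_bmi, name, height, mass, bmi))
```

Characters with a missing height or mass simply get NA for bmi, since arithmetic on NA returns NA.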
If mutate() is the function to use when you want to add a column to your data frame while also retaining all the other columns in your data frame, then transmute() is what you will opt for if you only want to keep the new variable you create.
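To make the contrast concrete, here is a small sketch that runs both functions on the same assumed BMI formula (mass in kilograms over height in metres squared). The object names with_bmi, only_bmi, and named_bmi are hypothetical.

```r
library(dplyr)

star <- starwars

# mutate() keeps all original columns and appends the new one
with_bmi <- mutate(star, bmi = mass / (height / 100)^2)

# transmute() keeps only the columns named in the call
only_bmi <- transmute(star, bmi = mass / (height / 100)^2)

# Existing columns can be carried along by naming them explicitly
named_bmi <- transmute(star, name, bmi = mass / (height / 100)^2)

print(names(with_bmi))
print(names(only_bmi))
print(names(named_bmi))
```

Recent dplyr releases mark transmute() as superseded in favour of mutate() with the .keep = "none" argument, though transmute() still works as described here.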
365 Data Science is an online educational career website that offers the incredible opportunity to find your way into the data science world no matter your previous knowledge and experience. We have prepared numerous courses that suit the needs of aspiring BI analysts, Data analysts and Data scientists.
We at 365 Data Science are committed educators who believe that curiosity should not be hindered by inability to access good learning resources. This is why we focus all our efforts on creating high-quali