Data Frames in R - Subsetting a data frame

365 Data Science · Beginner ·🔢 Mathematical Foundations ·8y ago

Key Takeaways

The video demonstrates how to subset a data frame in R using square brackets, the dollar sign, and other methods, with the Star Wars data set from the dplyr package as an example.

Full Transcript

So far we've talked about creating and naming data frames, importing them, and the tidyverse package. Now let's get up close and personal with them and learn about subsetting data frames. We've already talked about some of these methods when we were covering matrices and lists, so I'll try to be super quick when glossing over familiar stuff.

First, I will make use of the tidyverse and get a data set from the dplyr package that's called starwars. It starts as a tibble, which, as we know, is just another take on the data frame. There are benefits to using a tibble instead of a data frame, but that usually works best for the more advanced R user. All right, I will save the data as a data frame, and I will remove the last three columns, because all of them contain lists and that will make exploring our data difficult. Excellent.

Now, that's a pretty large data frame, and it isn't very efficient to print all of it when we do an operation. So from now on, whenever our data frame is hefty, I will use the head function. It is part of a set of functions that act like a sanity check and only show you the top six or the bottom six rows of your data, like this. This is a great way of keeping tabs on whatever you're doing without flooding your console with data.

All right, so, subsetting. In terms of notation, all the usual suspects are here. We can subset data frames like we would subset a matrix, or like we would index a list. To index a specific element, we use square brackets: row index, comma, column index. So if I want to know where R2-D2 is from, I need to code this. Or I could use the column name instead of its number, because it's easier. To ask about all the data on Princess Leia Organa, I must reference a row. This happens by passing her row name in the square brackets and leaving a space for the column indicator, remember. Alternatively, if I want to see only the character names but nothing else, I can call the first column, leaving a space for the rows. I will use the head function here, because 87 rows is too long.
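The matrix-style indexing described above can be sketched roughly as follows. The video's on-screen code is not part of the transcript, so the exact calls are assumptions; the sketch also uses a small hand-built data frame (`sw`, four rows, four columns) as a stand-in for the prepared starwars data, so row and column positions differ from the real data set.

```r
# Hypothetical stand-in for the prepared starwars data frame,
# built by hand so the example runs without dplyr installed.
sw <- data.frame(
  name      = c("Luke Skywalker", "C-3PO", "R2-D2", "Leia Organa"),
  height    = c(172, 167, 96, 150),
  homeworld = c("Tatooine", "Tatooine", "Naboo", "Alderaan"),
  species   = c("Human", "Droid", "Droid", "Human"),
  stringsAsFactors = FALSE
)

# Peek at the top or bottom rows instead of printing everything.
head(sw, 2)
tail(sw, 2)

# Single element: [row index, column index]. R2-D2 is row 3 in this stand-in.
r2_home_by_number <- sw[3, 3]
r2_home_by_name   <- sw[3, "homeworld"]

# Whole row: give the row, leave the column position empty.
leia <- sw[4, ]

# Whole column: leave the row position empty, give the column.
# On a base data frame a single column selected this way is simplified
# to a vector; a tibble would keep it as a one-column table.
first_column <- sw[, 1]
head(first_column)
```

Referring to a column by name, as in `sw[3, "homeworld"]`, keeps working if columns are later reordered, which is one reason it can be easier than counting positions.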
Or I can borrow from the list subsetting notation and use double brackets to call the name column. Notice that in the first instance R returned a data frame, and in the second a vector. Why does that happen? Well, remember how lists behaved when we were subsetting them in the previous section: double brackets drill down to the individual elements and extract them without keeping the larger structure. The same thing happens in data frames. Why? Because they are two-dimensional lists. Excellent.

Now, lists could also be subsetted using the dollar sign, and that stands true in this situation as well. And just like with lists, we don't need to use quotation marks around column names. Again, R returns a vector. One last way to reference a column but get a data frame in return is by using single brackets and the column name inside quotation marks, like this.

Okay, and how about if I want to extract multiple columns for a subset of characters? We can use the combine function to specify what exactly we're interested in. Easy, and that returns a data frame too. This is sweet.

All right, now we know how to subset a data frame to operate on whatever part of it we need. That is fantastic. But what if we got some new observations and wanted to add them to the structure? Or if we isolated a new variable and wanted to attach it to the data frame? You guessed it, we will cover that in the next lesson. Thanks, everyone. See you in the next video and, of course, may the Force be with you. For more videos like this one, please subscribe.
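The list-style notations from the transcript can be sketched as follows, again on a small hand-built stand-in (the data and the name `sw` are assumptions, not the video's code). The checks at the end show which forms return a vector and which return a data frame.

```r
# Hypothetical stand-in for the prepared starwars data frame.
sw <- data.frame(
  name      = c("Luke Skywalker", "C-3PO", "R2-D2", "Leia Organa"),
  height    = c(172, 167, 96, 150),
  homeworld = c("Tatooine", "Tatooine", "Naboo", "Alderaan"),
  stringsAsFactors = FALSE
)

# Double brackets drill down to the column itself: a vector.
names_double <- sw[["name"]]

# The dollar sign does the same, with no quotation marks needed.
names_dollar <- sw$name

# Single brackets with a quoted column name keep the data frame structure.
names_single <- sw["name"]

is.vector(names_double)      # TRUE
is.data.frame(names_single)  # TRUE

# Several columns for a subset of rows, both specified with c().
subset_df <- sw[c(1, 4), c("name", "homeworld")]
subset_df
```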

Original Description

How to subset a data frame using square brackets and the dollar sign. We can subset data frames like we would subset a matrix, or like we would index a list. To index a specific element in a data frame, we use square brackets: row index, comma, column index. To reference a row, pass the row number in the square brackets and leave a space for the column indicator. To reference a variable in a data frame, call the column number, leaving a space for the rows. Or use double brackets to call the column by name.

This video teaches how to subset a data frame in R using various methods, including square brackets and the dollar sign, and demonstrates how to extract specific data from a data frame.

Key Takeaways
  1. Load the dplyr package and import the Star Wars data set
  2. Convert the data set to a data frame
  3. Remove unnecessary columns
  4. Use the head function to view the top rows of the data frame
  5. Subset the data frame using square brackets and column names
  6. Use the dollar sign to extract a specific column
  7. Use double brackets to extract a specific column as a vector
  8. Use single brackets to extract a specific column as a data frame
  9. Extract multiple columns for a subset of characters
💡 Subsetting a data frame in R can be done using various methods, including square brackets, the dollar sign, and double brackets, each returning different types of data structures.
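
Steps 1 to 4 above might look roughly like this. The object name `star` is an assumption, and the sketch only runs the dplyr part when that package is installed. Dropping the last three columns follows the transcript, where those columns are said to contain lists.

```r
# Sketch of the preparation steps; `star` is a hypothetical object name.
if (requireNamespace("dplyr", quietly = TRUE)) {
  # starwars ships with dplyr as a tibble; convert it to a plain data frame.
  star <- as.data.frame(dplyr::starwars)

  # Drop the last three columns, which hold lists.
  star <- star[, 1:(ncol(star) - 3)]

  # Check the first six rows rather than printing all of them.
  print(head(star))
} else {
  message("dplyr is not installed, so starwars is unavailable.")
}
```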
