Python Tutorial: Descriptive Statistics
Want to learn more? Take the full course at https://learn.datacamp.com/courses/human-resources-analytics-predicting-employee-churn-in-python at your own pace. More than a video, you'll learn hands-on coding & quickly apply skills to your daily work.
---
So now our dataset is ready for developing a predictive algorithm. But before that, let's first get some quick descriptive insights.
The variable that tells us whether an employee has left the company is the column **churn**. If the value of this column is 1, the employee has churned; if it is 0, we have not observed turnover for that employee. To calculate the turnover rate, we count how many times this variable takes the value 1 and how many times it takes the value 0, and then divide each count by the total number of employees. Multiplying the results by 100 gives the percentage of employees who left and the percentage who stayed. This task is again accomplished in 3 steps:
- First, we get the total number of employees, which is simply the length of our data,
- Then, we count the 1s and 0s in the column churn,
- Finally, we divide the counted values by the number of employees and multiply by 100 to get percentages.
As you can see, around 76% of our employees stayed, while 24% have churned. Thus, we conclude that the turnover rate is 24%.
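The three steps above can be sketched as follows. This is a minimal illustration, not the course's own code: the DataFrame name `data` and its eight toy rows are assumptions made up for the example, so the percentages it prints differ from the 76% and 24% reported for the real dataset.

```python
import pandas as pd

# Hypothetical toy data: the real HR dataset is not reproduced here.
# Only the column name `churn` (1 = left, 0 = stayed) comes from the tutorial.
data = pd.DataFrame({"churn": [0, 0, 0, 1, 0, 1, 0, 0]})

# Step 1: total number of employees is the length of the data
n_employees = len(data)

# Step 2: count the 1s and 0s in the churn column
churn_counts = data["churn"].value_counts()

# Step 3: divide by the number of employees and multiply by 100
churn_pct = churn_counts / n_employees * 100

print(churn_pct)
```

As a shortcut, `data["churn"].value_counts(normalize=True) * 100` should give the same percentages in one call, since `normalize=True` returns relative frequencies instead of counts.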
Next, we want to learn which variables have a positive or negative linear relationship with our target. To see that, we first compute the correlation matrix using the `corr()` method provided by **pandas** and then visualize the matrix using the `heatmap()` function from seaborn, a statistical visualization library. As you can see, the target variable **churn** has its strongest negative correlation with satisfaction level. This shows that an increase in satisfaction level is associated with a decrease in the probability of turnover.
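A rough sketch of the correlation step might look like this. The toy values and the column names `satisfaction` and `evaluation` are hypothetical stand-ins, as is the helper `plot_corr_heatmap`; only `corr()` and `heatmap()` are named in the tutorial. The plotting imports sit inside the helper so that the correlation matrix can be computed even where seaborn is not installed.

```python
import pandas as pd

# Hypothetical toy data; column names other than `churn` are assumptions.
data = pd.DataFrame({
    "satisfaction": [0.90, 0.80, 0.85, 0.20, 0.70, 0.30, 0.75, 0.60],
    "evaluation":   [0.60, 0.70, 0.50, 0.80, 0.90, 0.55, 0.65, 0.75],
    "churn":        [0, 0, 0, 1, 0, 1, 0, 0],
})

# Pairwise Pearson correlations between all numeric columns
corr_matrix = data.corr()

# Correlation of every variable with the target
print(corr_matrix["churn"])


def plot_corr_heatmap(corr):
    """Draw the correlation matrix as an annotated heatmap (hypothetical helper)."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    # Fix the color scale to [-1, 1] so positive and negative values are comparable
    sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1)
    plt.show()
```

Calling `plot_corr_heatmap(corr_matrix)` displays the heatmap. In this toy data the churned employees were deliberately given low satisfaction scores, so the `churn` and `satisfaction` cell comes out negative, mirroring the direction described above.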
Now it's your turn to practice.
What You'll Learn
Calculate descriptive statistics using Python