Python Tutorial: Readability tests
Key Takeaways
The video shows how readability tests, specifically the Flesch Reading Ease and the Gunning fog index, estimate how readable a passage is, and how to compute them in Python with the textatistic library.
Full Transcript
In this lesson, we will look at a set of interesting features known as readability tests. These tests are used to determine the readability of a particular passage. In other words, they indicate what educational level a person needs to have reached in order to comprehend a particular piece of text. The scale usually ranges from primary school up to college graduate level and is in the context of the American education system. Readability tests are done using a mathematical formula that utilizes the word, syllable, and sentence counts of the passage. They are routinely used by organizations to determine how easy their publications are to understand. They have also found applications in domains such as fake news and opinion spam detection.

There are a variety of readability tests in use. Some of the common ones include the Flesch Reading Ease, the Gunning fog index, the Simple Measure of Gobbledygook (SMOG), and the Dale-Chall score. Note that these tests are used for texts in English. Tests for other languages also exist that take into consideration the nuances of that particular language. For the sake of brevity, we will cover only the first two scores in detail. However, once you understand them, you will be in a good position to understand and use the other scores too.

The Flesch Reading Ease is one of the oldest and most widely used readability tests. The score is based on two ideas. The first is that the greater the average sentence length, the harder the text is to read. Consider these two sentences. The first is easier to follow than the second. The second is that the greater the average number of syllables in a word, the harder the text is to read. Therefore, "I live in my home" is considered easier to read than "I reside in my domicile" on account of its use of fewer syllables per word. The higher the Flesch Reading Ease score, the greater the readability. Therefore, a higher score indicates that the text is easier to understand. This table shows how to interpret the Flesch Reading Ease scores. A score above 90 would imply that the text is comprehensible to a fifth grader, whereas a score below 30 would imply that the text can only be understood by college graduates.

The Gunning fog index was developed in 1954. Like Flesch, this score is also dependent on the average sentence length. However, it uses the percentage of complex words in place of the average number of syllables per word to compute its score. Here, complex words refer to all words that have three or more syllables. Unlike Flesch, the formula for the Gunning fog index is such that the higher the score, the more difficult the passage is to understand. The index can be interpreted using this table. A score of 6 would indicate sixth-grade reading difficulty, whereas a score of 17 would indicate college graduate level reading difficulty.

We can conduct these readability tests in Python using the textatistic library. We import the Textatistic class from textatistic. Next, we create a Textatistic object and pass in the passage or text we are evaluating. We then access the dictionary of readability scores from the Textatistic object using the scores attribute and store it in a variable named readability_scores. Finally, we access the various scores from the readability_scores dictionary using the corresponding keys. As shown in this example, the text that was passed is between the reading level of a college senior and that of a college graduate.

Let's now practice computing readability scores using the textatistic library.
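The textatistic workflow described at the end of the transcript can be sketched roughly as follows. This assumes the third-party textatistic package is installed. The helper name compute_readability, the sample passage, and the key names 'flesch_score' and 'gunningfog_score' are assumptions that the transcript does not state, so check them against the dictionary you actually get back.

```python
def compute_readability(text):
    """Return textatistic's dictionary of readability scores, or None if unavailable."""
    try:
        # Third-party package; assumed to be installed (pip install textatistic).
        from textatistic import Textatistic
    except ImportError:
        return None
    # Create a Textatistic object from the passage and read its scores attribute.
    return Textatistic(text).scores


# Hypothetical sample passage, made up for illustration.
sample_text = (
    "Readability tests estimate how hard a passage is to read. "
    "They rely on counts of words, syllables, and sentences."
)

readability_scores = compute_readability(sample_text)

if readability_scores is None:
    print("textatistic is not installed")
else:
    # Key names are assumed; inspect readability_scores.keys() to confirm them.
    print("Flesch reading ease:", readability_scores.get("flesch_score"))
    print("Gunning fog index:", readability_scores.get("gunningfog_score"))
```

The import sits inside the helper so that the sketch still runs, and reports the missing dependency, on a machine without the package.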
Original Description
Want to learn more? Take the full course at https://learn.datacamp.com/courses/feature-engineering-for-nlp-in-python at your own pace. More than a video, you'll learn hands-on coding & quickly apply skills to your daily work.
---
In this lesson, we will look at a set of interesting features known as readability tests. These tests are used to determine the readability of a particular passage. In other words, they indicate what educational level a person needs to have reached in order to comprehend a particular piece of text. The scale usually ranges from primary school up to college graduate level and is in the context of the American education system. Readability tests are done using a mathematical formula that utilizes the word, syllable, and sentence counts of the passage. They are routinely used by organizations to determine how easy their publications are to understand. They have also found applications in domains such as fake news and opinion spam detection.
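To make the "mathematical formula" concrete, here is a minimal sketch of the two scores covered in this lesson, built only from word, syllable, and sentence counts. The constants are those of the commonly published Flesch Reading Ease and Gunning fog formulas, which the video does not spell out. The syllable counter is a rough vowel-group heuristic and the function names are hypothetical. The usual Gunning fog refinements, such as excluding proper nouns from the complex-word count, are left out.

```python
import re


def count_syllables(word):
    """Estimate syllables by counting vowel groups (a rough heuristic)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Drop a likely silent trailing "e" ("home"), but keep "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_counts(text):
    """Return (sentences, words) found in the text using simple regex rules."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    return sentences, words


def flesch_reading_ease(text):
    """Flesch Reading Ease: a higher score means easier text."""
    sentences, words = text_counts(text)
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def gunning_fog(text):
    """Gunning fog index: a higher score means harder text."""
    sentences, words = text_counts(text)
    # "Complex" words are taken to be those with three or more syllables.
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)
    words_per_sentence = len(words) / len(sentences)
    percent_complex = 100 * complex_words / len(words)
    return 0.4 * (words_per_sentence + percent_complex)


easy = "I live in my home."
hard = "I reside in my domicile."
print(flesch_reading_ease(easy) > flesch_reading_ease(hard))
print(gunning_fog(easy) < gunning_fog(hard))
```

Both sentences have the same length, so the difference in their scores comes entirely from the syllable counts, which is the second idea behind the Flesch score.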
There are a variety of readability tests in use. Some of the common ones include the Flesch Reading Ease, the Gunning fog index, the Simple Measure of Gobbledygook (SMOG), and the Dale-Chall score. Note that these tests are used for texts in English. Tests for other languages also exist that take into consideration the nuances of that particular language. For the sake of brevity, we will cover only the first two scores in detail. However, once you understand them, you will be in a good position to understand and use the other scores too.
The Flesch Reading Ease is one of the oldest and most widely used readability tests. The score is based on two ideas: the first is that the greater the average sentence length, the harder the text is to read. Consider these two sentences. The first is easier to follow than the second. The second is that the greater the average number of syllables in a word, the harder the text is to read. Therefore, I live in my home is considered easier to read than I reside in