1. Readability tests
In this lesson,
we will look at a set of interesting features known as readability tests.
2. Overview of readability tests
These tests are used to determine the readability
of a particular passage. In other words, they indicate the educational level a person needs to have reached in order to comprehend a particular piece of text. The scale usually ranges from
primary school up to college graduate level and is set in the context of the American education system. Readability scores are computed using a mathematical formula
that uses the word, syllable and sentence counts of the passage. They are routinely used by organizations to determine how easy their publications are to understand. They have also found applications in domains such as fake news
and opinion spam detection.
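To make the three ingredients of these formulas concrete, here is a minimal sketch that counts words, sentences and syllables with regular expressions. The helper names count_syllables and text_counts are our own, and the syllable counter is only a rough vowel-group heuristic, not what any particular library does.

```python
import re

def count_syllables(word: str) -> int:
    """Estimate syllables by counting vowel groups (a rough heuristic)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent, except in consonant + "le" endings (e.g. "table").
    if word.endswith("e") and count > 1 and not re.search(r"[^aeiouy]le$", word):
        count -= 1
    return max(count, 1)

def text_counts(text: str) -> dict:
    """Return the word, sentence and syllable counts of a passage."""
    words = re.findall(r"[A-Za-z']+", text)
    # Assumption: sentences end with '.', '!' or '?'. Abbreviations will be miscounted.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "words": len(words),
        "sentences": len(sentences),
        "syllables": sum(count_syllables(w) for w in words),
    }

print(text_counts("The cat sat on the mat. It was happy."))
```

A heuristic like this will miscount some English words, which is one reason to rely on a tested library for real scoring.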
3. Readability test examples
There are a variety of readability tests in use. Some of the common ones include the Flesch reading ease,
the Gunning fog index,
the simple measure of gobbledygook
or SMOG and the Dale-Chall score.
Note that these tests are used for texts in English. Tests for other languages also exist that take into consideration the nuances of that particular language.
4. Readability test examples
For the sake of brevity, we will cover only the first two scores in detail. However, once you understand them, you will be in a good position to understand and use the other scores too.
5. Flesch reading ease
The Flesch Reading Ease is one of the oldest
and most widely used readability tests. The score is based on two ideas:
the first is that the greater the average sentence length,
the harder the text is to read. Consider
these
two sentences. The first is easier to follow than the second. The second is that the greater the average number
of syllables in a word, the harder the text is to read. Therefore,
"I live in my home" is considered easier to read than
"I reside in my domicile" because it uses fewer syllables per word. The higher the Flesch Reading Ease score,
the greater the readability. In other words, a higher score indicates that the text is easier to understand.
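The two ideas above correspond to the two penalty terms in the standard published Flesch Reading Ease formula. The sketch below applies it to hand-counted values for the two example sentences. The function name and keyword arguments are our own choices for illustration.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula, computed from raw counts."""
    avg_sentence_length = words / sentences      # longer sentences lower the score
    avg_syllables_per_word = syllables / words   # longer words lower the score
    return 206.835 - 1.015 * avg_sentence_length - 84.6 * avg_syllables_per_word

# Hand-counted values: both sentences have 5 words and 1 sentence.
easy = flesch_reading_ease(words=5, sentences=1, syllables=5)  # "I live in my home."
hard = flesch_reading_ease(words=5, sentences=1, syllables=8)  # "I reside in my domicile."
print(round(easy, 2), round(hard, 2))
```

The sentence lengths are identical, so the gap between the two scores comes entirely from the syllables-per-word term. Very short, simple inputs like these can score above 100, so the formula is better suited to full passages.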
6. Flesch reading ease score interpretation
This table shows how to interpret the Flesch Reading Ease scores. A score above 90 would imply that the text is comprehensible to a 5th grader whereas a score below 30 would imply the text can only be understood by college graduates.
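As a rough sketch, the interpretation can be written as a lookup. The two end points (above 90 and below 30) come from this lesson. The intermediate bands are an assumption based on the commonly cited version of the table, and flesch_grade_band is a hypothetical helper name.

```python
def flesch_grade_band(score: float) -> str:
    """Map a Flesch Reading Ease score to an approximate US school level."""
    # (lower bound, label) pairs, checked from the easiest band downwards.
    # Assumption: intermediate bands follow the commonly cited table.
    bands = [
        (90, "5th grade"),
        (80, "6th grade"),
        (70, "7th grade"),
        (60, "8th and 9th grade"),
        (50, "10th to 12th grade"),
        (30, "College"),
    ]
    for lower, label in bands:
        if score >= lower:
            return label
    return "College graduate"

print(flesch_grade_band(95))
print(flesch_grade_band(25))
```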
7. Gunning fog index
The Gunning fog index was
developed in 1952. Like Flesch, this score also depends
on the average sentence length. However, it uses
the percentage of complex words in place of the average syllables per word to compute its score. Here, complex words are words that have three or more syllables. Unlike Flesch, the Gunning fog formula is such that the higher the score,
the more difficult the passage is to understand.
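For reference, the standard published Gunning fog formula can be sketched from raw counts as follows. The function name and the counts in the example are made up for illustration.

```python
def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    """Standard Gunning fog formula, computed from raw counts."""
    avg_sentence_length = words / sentences
    percent_complex = 100 * complex_words / words  # words with three or more syllables
    return 0.4 * (avg_sentence_length + percent_complex)

# Hypothetical passage: 100 words, 5 sentences, 10 complex words.
score = gunning_fog(words=100, sentences=5, complex_words=10)
print(round(score, 1))
```

Both terms are added, so longer sentences and a higher share of complex words each push the index up. This is why a higher score means a harder passage.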
8. Gunning fog index interpretation
The index can be interpreted using this table. A score of 6 would indicate 6th grade reading difficulty whereas a score of 17 would indicate college graduate level reading difficulty.
9. The readability library
We can conduct these tests in Python using the py-readability-metrics library, which is imported as readability. In order to use this package, we first need to download the
punkt module from nltk. We then import the Readability class
from readability. Next, we create a Readability object and pass in the passage
or text we're evaluating. To compute a readability score, we call the method that computes the score we are interested in, for instance, gunning_fog().
We store the result in a variable named gf. Next, we access the score using gf.score. In this example, the text that was passed
is between the reading level of a college senior and that of a college graduate.
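The steps just described can be sketched as a small helper. This assumes the nltk and py-readability-metrics packages are installed. The function name gunning_fog_score is our own, and the third-party imports sit inside the function only so that the sketch can be loaded where those packages are missing.

```python
def gunning_fog_score(text: str) -> float:
    """Return the Gunning fog index of a passage using py-readability-metrics."""
    # Assumption: nltk and py-readability-metrics are installed.
    import nltk
    from readability import Readability

    # Download the punkt tokenizer data if it is not already present.
    # Newer NLTK releases may ask for the "punkt_tab" resource instead.
    nltk.download("punkt", quiet=True)

    readability = Readability(text)  # the passage we are evaluating
    gf = readability.gunning_fog()   # compute the score of interest
    return gf.score                  # the numeric Gunning fog index
```

Calling gunning_fog_score(passage) on a string returns the same value as gf.score in the lesson. To our knowledge, the library expects a passage of at least 100 words and raises an error for shorter texts.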
10. Let's practice!
Let's now practice computing readability scores using the readability library in the exercises.