Finding the most frequent words by Shakespeare
Explore advanced Bash techniques for mining Shakespearean plays and poems. Learn how to extract, combine, and sort text data to identify the most frequent words across multiple works by Shakespeare using shell scripting.
Given a text, what are the most frequent words?
Finding the most frequent words in a given text (e.g., Knight_of_the_Burning_Pestle) is easy: we can build a function, toptokens(), which is essentially the topcrimes() function developed in our previous project. Let’s watch the following video lecture first:

For example, to find the most frequent words in the play Romeo and Juliet, we can execute the following:
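A minimal sketch of what such a toptokens() function might look like is given here. The function body, the optional count argument, and the filename Romeo_and_Juliet.txt are assumptions made for illustration, not the course's exact code. It treats any run of letters as a word, so a contraction such as "let's" is counted as "let" and "s".

```bash
#!/usr/bin/env bash
# Hypothetical toptokens: print the N most frequent words in a text file.
# Usage: toptokens FILE [N]    (N defaults to 10; both names are assumptions)
toptokens() {
    local file="$1" n="${2:-10}"
    tr -cs '[:alpha:]' '\n' < "$file" |   # turn every non-letter run into a newline
        tr '[:upper:]' '[:lower:]' |      # fold case so "Romeo" and "romeo" match
        grep -v '^$' |                    # drop an empty first line, if any
        sort | uniq -c |                  # group identical words and count them
        sort -k1,1nr -k2 |                # highest count first, ties alphabetical
        head -n "$n"                      # keep only the top N rows
}
```

Assuming the play is saved as Romeo_and_Juliet.txt in the current directory, `toptokens Romeo_and_Juliet.txt 10` would print the ten most frequent words, each preceded by its count.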
Given an author, what are the most frequent words?
This is slightly more complicated because we again need to perform several steps:
- For the given author, trim out the names of the plays and poems,