Quiz: Tokens, N-Grams, tf-idf, and Stemming

Test your knowledge of document search strategies, tokenization, n-grams, stemming, and tf-idf.

Natural Language Tools


What will we get from the following command? findFreqTerms(DTmatrix, lowfreq = 400)


This creates a list of terms that appear fewer than 400 times.


This will create a list of the most frequent terms for each document.


This command creates a list of terms that appear a minimum of 400 times in DTmatrix.


findFreqTerms(DTmatrix, lowfreq = 400) trims DTmatrix to a list of 400 terms.
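
One way to reason about the options is to try the same idea on a toy example. The sketch below is a rough Python analogue, not the tm package's implementation: the helper `find_freq_terms`, the sample documents, and the threshold of 3 are all hypothetical and chosen only for illustration. It assumes that `lowfreq` acts as a lower bound on a term's total count across all documents, and that the result is a list of terms rather than a smaller matrix.

```python
from collections import Counter

def find_freq_terms(doc_term_counts, lowfreq=0, highfreq=float("inf")):
    """Return the terms whose total count across all documents
    lies between lowfreq and highfreq (both inclusive)."""
    totals = Counter()
    for counts in doc_term_counts:
        totals.update(counts)  # add this document's counts to the running totals
    return sorted(t for t, n in totals.items() if lowfreq <= n <= highfreq)

# Hypothetical toy corpus: one Counter per document stands in for a matrix row.
docs = ["apple banana apple", "banana cherry apple", "cherry apple"]
dt_matrix = [Counter(d.split()) for d in docs]

frequent = find_freq_terms(dt_matrix, lowfreq=3)
print(frequent)
```

Here "apple" occurs four times in total, while "banana" and "cherry" occur twice each, so comparing the printed list against each option shows which description matches a lower-bound threshold.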

