An Application of Markov Model

In this lesson, let's look at an application of the Markov model: randomized text generation.

We'll cover the following...

- Randomized Text Generation
- Generating Text from Shakespeare Corpus
- Implementation

In the previous lesson, we implemented the Markov process distribution: a distribution over state sequences in which the initial state and each subsequent state are random. Markov processes have many applications; in this lesson, we will look at randomized text generation.

Randomized Text Generation

Randomized text generation using a Markov process has a long history on the internet; one of our favorite examples is “Mark V. Shaney”.

Mark V. Shaney is a synthetic Usenet user whose postings in the net.singles newsgroup were generated by Markov chain techniques, based on text from other postings. The username is a play on the words “Markov chain”. Today we would call it a “bot”: it ran a Markov process built from an analysis of Usenet posts, sampled from the resulting distribution to produce a “Markov chain” sequence, and posted the generated text right back to Usenet.

In this lesson, we are going to replicate that. The basic technique is straightforward:

- Start with a corpus of texts.
- Break up the text into words.
- Group the words into sentences.
- Take the first word of each sentence and make a weighted distribution; the more times this word appears as a first word in a sentence, the higher it is weighted. This gives us our initial distribution.
- Take every word in every sentence. For each word, generate a distribution of the words which follow it in the corpus of sentences. For example, if we have “frog” in the corpus, then we make a distribution based on the words which follow: “prince” twice, “pond” ten times, and so on. This gives us

...
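
The steps listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the lesson's own implementation: the function names `build_model`, `weighted_choice`, and `generate_sentence` are hypothetical, and the code assumes that a sentence ends at `.`, `!`, or `?`, that words are separated by whitespace, and that case can be ignored. It counts how often each word starts a sentence, counts which words follow each word, and then samples from those weighted counts.

```python
import random
import re
from collections import Counter, defaultdict


def build_model(corpus):
    """Return (initial, transitions) word-count tables for a corpus string."""
    initial = Counter()                 # first word of a sentence -> count
    transitions = defaultdict(Counter)  # word -> counts of the words that follow it
    # Assumption: a sentence ends at '.', '!' or '?'.
    for sentence in re.split(r"[.!?]+", corpus):
        # Assumption: words are whitespace-separated and case is ignored.
        words = sentence.lower().split()
        if not words:
            continue
        initial[words[0]] += 1
        for current, following in zip(words, words[1:]):
            transitions[current][following] += 1
    return initial, transitions


def weighted_choice(counts, rng):
    """Pick a key with probability proportional to its count."""
    keys = list(counts)
    return rng.choices(keys, weights=[counts[k] for k in keys])[0]


def generate_sentence(initial, transitions, rng, max_words=20):
    """Sample a starting word, then repeatedly sample a following word."""
    word = weighted_choice(initial, rng)
    sentence = [word]
    # Stop at a word with no recorded follower, or at the length cap.
    while word in transitions and len(sentence) < max_words:
        word = weighted_choice(transitions[word], rng)
        sentence.append(word)
    return " ".join(sentence)


# A toy corpus standing in for a real collection of texts.
corpus = "The frog sat by the pond. The frog met a prince. A prince left."
initial, transitions = build_model(corpus)
print(generate_sentence(initial, transitions, random.Random(42)))
```

In this toy corpus, “the” starts two sentences and “a” starts one, so a generated sentence is twice as likely to begin with “the”. The `max_words` cap is a design choice of this sketch: it guards against cycles such as “the frog sat by the frog sat by ...” that a small corpus can easily produce.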
