Markovian text generation uses a Markov chain to generate texts based on training data.

The training data is processed to create a Markov chain. In a first-order Markov model, the states of the Markov process are the words of the training corpus. A transition is created from one word (state) to another whenever the second word directly follows the first in the corpus. So the corpus “I am” would have two states, “I” and “am”, with a single transition from “I” to “am”.
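The construction described above can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the function name `build_chain`, splitting the corpus on whitespace, and keeping repeated successors (so that frequent transitions are picked more often later) are all assumptions that the text does not prescribe.

```python
def build_chain(corpus):
    """Map every word (state) to the list of words seen directly after it."""
    words = corpus.split()  # assumption: whitespace tokenization
    # Every word is a state, even if nothing ever follows it.
    chain = {word: [] for word in words}
    for current, following in zip(words, words[1:]):
        # Duplicates are kept on purpose (assumption): a successor that
        # occurs more often will later be chosen more often.
        chain[current].append(following)
    return chain


print(build_chain("I am"))
```

For the corpus “I am” this yields two states and one transition, with “am” having no outgoing transitions.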

Text is generated from a Markov chain by first selecting a random state and then repeatedly selecting the next state from the transitions available in the current state. This continues until the desired text length is reached or no transition is possible.

It’s also possible to continue generating text after no transition exists by selecting a new random state.
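A rough sketch of this generation loop follows, including the optional restart from a new random state. The names `build_chain`, `generate`, `restart`, and `rng` are hypothetical, and measuring the length in words and choosing successors uniformly from the list of observed followers are assumptions made only for this example.

```python
import random


def build_chain(corpus):
    """Map every word (state) to the list of words seen directly after it."""
    words = corpus.split()
    chain = {word: [] for word in words}
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, length, restart=False, rng=random):
    """Produce up to `length` words by walking the chain."""
    if not chain or length <= 0:
        return ""
    state = rng.choice(list(chain))  # random starting state
    output = [state]
    while len(output) < length:
        successors = chain[state]
        if successors:
            state = rng.choice(successors)  # follow a transition
        elif restart:
            state = rng.choice(list(chain))  # dead end: pick a new random state
        else:
            break  # dead end: stop early
        output.append(state)
    return " ".join(output)


chain = build_chain("the cat sat on the mat")
print(generate(chain, 8))
print(generate(chain, 8, restart=True))
```

With `restart=False` the text can end up shorter than requested, because the walk stops at the first state that has no outgoing transition. With `restart=True` it always reaches the requested length.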

In higher-order Markov models, each state consists of several words rather than one: the state is the last N words of the text, where N is the order of the Markov model.
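As a hedged sketch of the higher-order case, each state can be represented as a tuple of N consecutive words, and a transition appends one word and drops the oldest. The names `build_chain`, `generate`, and `order`, the whitespace tokenization, and the choice to store only the next word for each state are assumptions for illustration.

```python
import random


def build_chain(corpus, order=2):
    """Map each tuple of `order` consecutive words to the words seen after it."""
    words = corpus.split()
    chain = {}
    for i in range(len(words) - order + 1):
        state = tuple(words[i:i + order])
        chain.setdefault(state, [])
        if i + order < len(words):
            chain[state].append(words[i + order])
    return chain


def generate(chain, length, rng=random):
    """Walk the chain; the state is always the last `order` words produced."""
    if not chain or length <= 0:
        return ""
    state = rng.choice(list(chain))  # random starting state (a tuple of words)
    output = list(state)
    while len(output) < length:
        successors = chain.get(state, [])
        if not successors:
            break  # no transition possible
        next_word = rng.choice(successors)
        output.append(next_word)
        state = state[1:] + (next_word,)  # slide the window by one word
    return " ".join(output[:length])


chain = build_chain("the cat sat on the cat mat", order=2)
print(chain[("the", "cat")])
print(generate(chain, 6))
```

Here the state (“the”, “cat”) has two possible successors, “sat” and “mat”, because both follow that pair somewhere in the example corpus.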
