
On Wed, 4 Jun 2008, John Melesky wrote:
So you use those occurrence statistics to pick a feasible next word (let's choose "system", since it has the highest probability here -- in practice you'd probably choose one randomly, weighted by likelihood). Then you look for all the word pairs which start with "system", and choose the next word in the same fashion. Repeat for as long as you want.
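A minimal sketch of that procedure in Python (the function names are my own, and I'm assuming whitespace-separated words and a corpus where every word has at least one recorded follower):

```python
import random
from collections import defaultdict

def build_pairs(text):
    """Count, for each word, how often each following word occurs."""
    stats = defaultdict(lambda: defaultdict(int))
    words = text.split()
    for a, b in zip(words, words[1:]):
        stats[a][b] += 1
    return stats

def next_word(stats, current):
    """Pick a follower of `current` at random, weighted by frequency.
    (A word with no recorded follower would need extra handling.)"""
    followers = stats[current]
    return random.choices(list(followers), weights=followers.values())[0]

def generate(stats, start, length):
    """Chain weighted choices together to produce `length` words."""
    out = [start]
    for _ in range(length - 1):
        out.append(next_word(stats, out[-1]))
    return " ".join(out)
```

So if "system" follows "the" three times and "network" once, `next_word(stats, "the")` returns "system" about three quarters of the time.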
"Markov chain" means that you have a sequence of random experiments, where the outcome of each experiment depends exclusively on a fixed number (the level, or order) of experiments immediately before the current one.
Those word-pair statistics, when you have them for all the words in your vocabulary, comprise the first-level Markov data for your corpus.
When you extend it to word triplets, it's second-level Markov data (and it will generate more reasonable fake text). You can build higher and higher Markov levels if you'd like.
If the level is too high, you will just reproduce the training text.
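To make the level explicit, here's a sketch generalized to arbitrary order, where the "state" is a tuple of the last N words (again, the names are my own invention; `random.choice` over the stored followers gives the same frequency weighting as counting, since repeats stay in the list):

```python
import random
from collections import defaultdict

def build_ngrams(words, order):
    """Map each `order`-word state to the list of words that follow it."""
    stats = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        stats[state].append(words[i + order])
    return stats

def generate(words, order, length):
    """Generate up to `length` words, stopping at a state never seen in training."""
    stats = build_ngrams(words, order)
    out = list(words[:order])          # seed with the corpus's opening state
    while len(out) < length:
        followers = stats.get(tuple(out[-order:]))
        if not followers:              # dead end: state has no recorded follower
            break
        out.append(random.choice(followers))
    return out
```

With a high order, nearly every state occurs only once in the training text, so it has exactly one follower and the generator can only replay the corpus -- which is the reproduction effect described above.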