Markov Text Generator & Randomness

Hi -cafe,
I'm coding a Markov text generator (of order 1). Basically, you have a source text, and knowing the frequencies of pairs of consecutive words, you generate a somewhat syntactically correct text from it.
Here are links to my code and to a source text you can use as an example.
test.txt http://lpaste.net/raw/4004174907431714816
code http://lpaste.net/4147715261379641344
The kicker is that this code generates sentences with consecutive words that never appear next to each other in the source text. For example, the code generated "They sat over at because old those the lighted.", but "over at" never occurs in the source text, so it shouldn't occur in a generated sentence.
What the makeDb function gives is correct, so my problem actually lies in generate and/or in draw. I think there's something about RVar that I messed up, but I don't see the problem.
Any ideas?
Cheers,
-- Charles-Pierre
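For readers without access to the linked paste, here is a minimal sketch of the kind of table an order-1 generator works from: for each word, how often each other word follows it. The names `Db`, `bigramDb` and `successors` are hypothetical and are not taken from the original code, whose `makeDb` is not shown in the thread.

```haskell
import qualified Data.Map.Strict as M

-- Hypothetical type: word -> (following word -> number of occurrences).
type Db = M.Map String (M.Map String Int)

-- Count how often each word is directly followed by each other word.
bigramDb :: [String] -> Db
bigramDb ws = M.fromListWith (M.unionWith (+))
  [ (w, M.singleton next 1) | (w, next) <- zip ws (drop 1 ws) ]

-- Successors of a word with their counts; empty if the word was never
-- seen, or only occurs as the very last word of the text.
successors :: Db -> String -> [(String, Int)]
successors db w = maybe [] M.toList (M.lookup w db)

main :: IO ()
main = do
  let db = bigramDb (words "the old man and the old dog and the sea")
  print (successors db "the")  -- [("old",2),("sea",1)]
  print (successors db "old")  -- [("dog",1),("man",1)]
```

A generated sentence is valid only if every step goes from a word to one of its `successors`.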

Charles-Pierre Astolfi
The kicker is that this code generates sentences with consecutive words that never appear next to each other in the source text. For example, the code generated "They sat over at because old those the lighted.", but "over at" never occurs in the source text, so it shouldn't occur in a generated sentence.
You mean like, "The old man looked from his glass across the square, then over at the waiters."
Otherwise my cursory look turned up no bugs.
Cheers,
- Ben

"The old man looked from his glass across the square, then over at the waiters."
You're embarrassingly right! But then, "those the" definitely never appears, although it does in my generated text.
Otherwise my cursory look turned up no bugs.
Unfortunately there is :(
-- Cp
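Checking by eye whether a pair such as "over at" or "those the" occurs in the source is easy to get wrong, as the exchange above shows. A small sketch of an automatic check follows; `pairsOf` and `unseenPairs` are hypothetical helper names, and splitting on whitespace only (so punctuation stays attached to words) is an assumption that may not match the original tokenizer.

```haskell
import qualified Data.Set as S

-- All adjacent word pairs of a text, split on whitespace only.
pairsOf :: String -> [(String, String)]
pairsOf txt = zip ws (drop 1 ws)
  where ws = words txt

-- Pairs of the generated text that never occur in the source text.
unseenPairs :: String -> String -> [(String, String)]
unseenPairs source generated = filter (`S.notMember` seen) (pairsOf generated)
  where seen = S.fromList (pairsOf source)

main :: IO ()
main = do
  let source = "The old man looked from his glass across the square, then over at the waiters."
  -- "then over", "over at" and "the waiters." are in the source; the rest are not.
  print (unseenPairs source "then over at those the waiters.")
  -- [("at","those"),("those","the")]
```

An empty result means every transition in the generated sentence was observed in the source.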

Just a note: in the first "where" clause in `generate`, you don't need
to pass around the `db` variable (it is visible to the "where"
clause).
I never used that `RVar` monad, but I guess that every time you run
`rword` you might get a different result. So in your `go` function,
once you have executed `word <- rword`, you should not pass `rword`
down to the `draw` function, but instead pass, say, `return word`.
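The difference between passing an action and passing its result can be sketched without `RVar` at all. In the example below, `nextWord` is a hypothetical stand-in for a random action: it uses a counter in `IO` so that the output is predictable. With a real random action the two words may sometimes coincide by chance, which makes the bug harder to spot.

```haskell
import Data.IORef (IORef, newIORef, atomicModifyIORef')

-- Stand-in for a random action (hypothetical; the real code uses RVar):
-- each run yields the next word of a fixed cycle.
nextWord :: IORef Int -> IO String
nextWord ref = atomicModifyIORef' ref step
  where
    vocab  = ["old", "man", "those"]
    step i = (i + 1, vocab !! (i `mod` length vocab))

-- Buggy shape: receives the action and runs it again, so it may see a
-- different word than the one the caller already drew.
drawFromAction :: IO String -> IO String
drawFromAction rword = do
  w <- rword
  return ("successor of " ++ w)

-- Fixed shape: receives the word that was already drawn.
drawFromWord :: String -> IO String
drawFromWord w = return ("successor of " ++ w)

main :: IO ()
main = do
  ref <- newIORef 0
  let rword = nextWord ref
  word  <- rword                 -- first run of the action
  buggy <- drawFromAction rword  -- second run: a different word
  fixed <- drawFromWord word     -- reuses the word drawn above
  putStrLn (word ++ " / " ++ buggy ++ " / " ++ fixed)
  -- old / successor of man / successor of old
```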
2014-07-24 14:24 GMT+02:00 Charles-Pierre Astolfi

Charles-Pierre Astolfi
"The old man looked from his glass across the square, then over at the waiters."
You're embarrassingly right! But then, "those the" definitely never appears, although it does in my generated text.
Otherwise my cursory look turned up no bugs.
Unfortunately there is :(
Ahh yes, looking a bit more closely now I have a few points:
1. In `draw`: The first argument is an action which will return a new word. Instead of passing `rword :: RVar Word`, you presumably want to pass `word :: Word`. This is likely the cause of your bug.
2. In `draw`: Instead of `weightedSample`, which produces a random shuffling of the entire list, you really just want to draw a single word. This is a categorical distribution; use `Data.Random.Categorical.fromList` to construct the distribution and `R.rvar` to draw a variate. Note that you may want to avoid doing the former more than once, as construction of the distribution requires sorting and normalizing.
3. `map (\(x,y)->(y,x))` is just `map swap` where `swap` is provided by `Data.Tuple`.
My quick rework of your code can be found here [1].
Cheers,
- Ben
[1] http://lpaste.net/108025
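To make points 2 and 3 concrete, here is a hand-rolled sketch of drawing a single item from weighted choices, using only `base`. It is not the random-fu API and not the rework from the paste: `pickWeighted` is a hypothetical helper, and the random roll is passed in as a plain `Int` so the behaviour is easy to follow. In real code the categorical distribution from random-fu would take care of this.

```haskell
import Data.Tuple (swap)

-- Pick one item from (weight, item) pairs, given a roll in
-- [0, total weight). Returns Nothing for an empty list or a roll
-- outside that range.
pickWeighted :: [(Int, a)] -> Int -> Maybe a
pickWeighted choices roll
  | roll < 0  = Nothing
  | otherwise = go roll choices
  where
    go _ [] = Nothing
    go r ((w, x) : rest)
      | r < w     = Just x            -- roll falls inside this item's slice
      | otherwise = go (r - w) rest   -- skip this slice and keep looking

main :: IO ()
main = do
  let counts  = [("old", 2), ("sea", 1)]  -- (word, count)
      choices = map swap counts           -- (count, word): weight first
  print [pickWeighted choices r | r <- [0 .. 3]]
  -- [Just "old",Just "old",Just "sea",Nothing]
```

Unlike shuffling the whole list, this walks the choices once and stops at the first slice that contains the roll.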

You're right Ben, changing the signature to Word instead of RVar Word
did the trick. Stupid mistake.
Thanks!
--
Cp
On Thu, Jul 24, 2014 at 9:55 AM, Ben Gamari
participants (3)
- Ben Gamari
- Charles-Pierre Astolfi
- Vo Minh Thu