Thanks for your replies.
I actually starting out returning a single element instead. But a
given lookup might return [], and the only way I could think of to
handle it in (State StdGen a) would be to fail in the monad. But
that's not really the effect I want - I'd rather have it ignore that
element. Another option was to wrap with Maybe, but then since I
really want a sequence of them anyway, I decided to just wrap in a
List instead. Is there a way Maybe would work out better?
I've seen PFP, but I don't see where that would help here. I'd still
end up with an enormous list of tuples. This could be generated
lazily, but sampling with replacement (yes I want this, not a shuffle)
would require forcing the whole list anyway, wouldn't it? Using my
approach, even asking ghci for the length of the list ran for 30+
minutes.
If there's a way to lazily sample with replacement from a list without
even requiring the length of the list to be known in advance, that
could lead to a solution.
Thanks,
Chad
On 8/15/07, Paul Johnson
Chad Scherrer wrote:
There's a problem I've been struggling with for a long time...
I need to build a function buildSample :: [A] -> State StdGen [(A,B,C)]
given lookup functions f :: A -> [B] g :: A -> [C]
The idea is to first draw randomly form the [A], then apply each lookup function and draw randomly from the result of each.
I don't understand why this returns a list of triples instead of a single triple. Your description below seems to imply the latter.
You should probably look at the "Gen" monad in Test.QuickCheck, which is basically a nice implementation of what you are doing with "State StdGen" below. Its "elements" function gets a single random element, and you can combine it with replicateM to get a list of defined length.
(BTW, are you sure want multiple random samples rather than a shuffle? A shuffle has each element exactly once whereas multiple random samples can pick any element an arbitrary number of times. I ask because shuffles are a more common requirement. For the code below I'll assume you meant what you said.)
Using Test.QuickCheck I think you want something like this (which I have not tested):
buildSample :: [A] -> Gen (A,B,C) buildSample xs = do x <- elements xs f1 <- elements $ f x g1 <- elements $ g x return
If you want n such samples then I would suggest
samples <- replicateM n $ buildSample xs
It's actually slightly more complicated than this, since for the real problem I start with type [[A]], and want to map buildSample over these, and sample from the results.
There seem to be so many ways to deal with random numbers in Haskell.
Indeed.
After some false starts, I ended up doing something like
sample :: [a] -> State StdGen [a] sample [] = return [] sample xs = do g <- get let (g', g'') = split g bds = (1, length xs) xArr = listArray bds xs put g'' return . map (xArr !) $ randomRs bds g'
Not bad, although you could instead have a sample function that returns a single element and then use replicateM to get a list.
buildSample xs = sample $ do x <- xs y <- f x z <- g x return (x,y,z)
This is really bad, since it builds a huge array of all the possibilities and then draws from that. Memory is way leaky right now. I'd like to be able to just have it apply the lookup functions as needed.
Also, I'm still using GHC 6.6, so I don't have Control.Monad.State.Strict. Not sure how much difference this makes, but I guess I could just copy the source for that module if I need to.
Strictness won't help. In fact you would be better with laziness if that were possible (which it isn't here). The entire array has to be constructed before you can look up any elements in it. That forces the entire computation. But compare your implementation of buildSample to mine.
Paul.
-- Chad Scherrer "Time flies like an arrow; fruit flies like a banana" -- Groucho Marx