monte carlo trouble

newer
Haskell for the Enterprise (topic...

Chad Scherrer

15 Aug 2007 15 Aug '07

5:33 p.m.

There's a problem I've been struggling with for a long time... I need to build a function buildSample :: [A] -> State StdGen [(A,B,C)] given lookup functions f :: A -> [B] g :: A -> [C] The idea is to first draw randomly form the [A], then apply each lookup function and draw randomly from the result of each. It's actually slightly more complicated than this, since for the real problem I start with type [[A]], and want to map buildSample over these, and sample from the results. There seem to be so many ways to deal with random numbers in Haskell. After some false starts, I ended up doing something like sample :: [a] -> State StdGen [a] sample [] = return [] sample xs = do g <- get let (g', g'') = split g bds = (1, length xs) xArr = listArray bds xs put g'' return . map (xArr !) $ randomRs bds g' buildSample xs = sample $ do x <- xs y <- f x z <- g x return (x,y,z) This is really bad, since it builds a huge array of all the possibilities and then draws from that. Memory is way leaky right now. I'd like to be able to just have it apply the lookup functions as needed. Also, I'm still using GHC 6.6, so I don't have Control.Monad.State.Strict. Not sure how much difference this makes, but I guess I could just copy the source for that module if I need to. Any help is greatly appreciated! -- Chad Scherrer "Time flies like an arrow; fruit flies like a banana" -- Groucho Marx

Show replies by date

Paul Johnson

15 Aug 15 Aug

6:38 p.m.

Chad Scherrer wrote:

...

There's a problem I've been struggling with for a long time...

I need to build a function buildSample :: [A] -> State StdGen [(A,B,C)]

given lookup functions f :: A -> [B] g :: A -> [C]

The idea is to first draw randomly form the [A], then apply each lookup function and draw randomly from the result of each.

I don't understand why this returns a list of triples instead of a single triple. Your description below seems to imply the latter. You should probably look at the "Gen" monad in Test.QuickCheck, which is basically a nice implementation of what you are doing with "State StdGen" below. Its "elements" function gets a single random element, and you can combine it with replicateM to get a list of defined length. (BTW, are you sure want multiple random samples rather than a shuffle? A shuffle has each element exactly once whereas multiple random samples can pick any element an arbitrary number of times. I ask because shuffles are a more common requirement. For the code below I'll assume you meant what you said.) Using Test.QuickCheck I think you want something like this (which I have not tested): buildSample :: [A] -> Gen (A,B,C) buildSample xs = do x <- elements xs f1 <- elements $ f x g1 <- elements $ g x return If you want n such samples then I would suggest samples <- replicateM n $ buildSample xs

...

It's actually slightly more complicated than this, since for the real problem I start with type [[A]], and want to map buildSample over these, and sample from the results.

There seem to be so many ways to deal with random numbers in Haskell.

Indeed.

...

After some false starts, I ended up doing something like

sample :: [a] -> State StdGen [a] sample [] = return [] sample xs = do g <- get let (g', g'') = split g bds = (1, length xs) xArr = listArray bds xs put g'' return . map (xArr !) $ randomRs bds g'

Not bad, although you could instead have a sample function that returns a single element and then use replicateM to get a list.

...

buildSample xs = sample $ do x <- xs y <- f x z <- g x return (x,y,z)

This is really bad, since it builds a huge array of all the possibilities and then draws from that. Memory is way leaky right now. I'd like to be able to just have it apply the lookup functions as needed.

Also, I'm still using GHC 6.6, so I don't have Control.Monad.State.Strict. Not sure how much difference this makes, but I guess I could just copy the source for that module if I need to.

Thomas Hartman

6:52 p.m.

have you looked at pfp, the haskell "probabilistic functional programming library "? http://web.engr.oregonstate.edu/~erwig/pfp/ the paper http://web.engr.oregonstate.edu/~erwig/papers/abstracts.html#JFP06a describes modeling various statisticy things this way, like tree growth and the monty hall problem, I think it's likely this is applicable to monte carlo processes as well. thomas. Paul Johnson Sent by: haskell-cafe-bounces@haskell.org 08/15/2007 02:38 PM To Chad Scherrer cc haskell-cafe@haskell.org Subject Re: [Haskell-cafe] monte carlo trouble Chad Scherrer wrote:

...

There's a problem I've been struggling with for a long time...

I need to build a function buildSample :: [A] -> State StdGen [(A,B,C)]

given lookup functions f :: A -> [B] g :: A -> [C]

The idea is to first draw randomly form the [A], then apply each lookup function and draw randomly from the result of each.

...

It's actually slightly more complicated than this, since for the real problem I start with type [[A]], and want to map buildSample over these, and sample from the results.

There seem to be so many ways to deal with random numbers in Haskell.

Indeed.

...

After some false starts, I ended up doing something like

sample :: [a] -> State StdGen [a] sample [] = return [] sample xs = do g <- get let (g', g'') = split g bds = (1, length xs) xArr = listArray bds xs put g'' return . map (xArr !) $ randomRs bds g'

Not bad, although you could instead have a sample function that returns a single element and then use replicateM to get a list.

...

buildSample xs = sample $ do x <- xs y <- f x z <- g x return (x,y,z)

This is really bad, since it builds a huge array of all the possibilities and then draws from that. Memory is way leaky right now. I'd like to be able to just have it apply the lookup functions as needed.

Also, I'm still using GHC 6.6, so I don't have Control.Monad.State.Strict. Not sure how much difference this makes, but I guess I could just copy the source for that module if I need to.

Strictness won't help. In fact you would be better with laziness if that were possible (which it isn't here). The entire array has to be constructed before you can look up any elements in it. That forces the entire computation. But compare your implementation of buildSample to mine. Paul. _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe --- This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Chad Scherrer

7:05 p.m.

Thanks for your replies. I actually starting out returning a single element instead. But a given lookup might return [], and the only way I could think of to handle it in (State StdGen a) would be to fail in the monad. But that's not really the effect I want - I'd rather have it ignore that element. Another option was to wrap with Maybe, but then since I really want a sequence of them anyway, I decided to just wrap in a List instead. Is there a way Maybe would work out better? I've seen PFP, but I don't see where that would help here. I'd still end up with an enormous list of tuples. This could be generated lazily, but sampling with replacement (yes I want this, not a shuffle) would require forcing the whole list anyway, wouldn't it? Using my approach, even asking ghci for the length of the list ran for 30+ minutes. If there's a way to lazily sample with replacement from a list without even requiring the length of the list to be known in advance, that could lead to a solution. Thanks, Chad On 8/15/07, Paul Johnson wrote:

...

Chad Scherrer wrote:

...
There's a problem I've been struggling with for a long time...

I need to build a function buildSample :: [A] -> State StdGen [(A,B,C)]

given lookup functions f :: A -> [B] g :: A -> [C]

The idea is to first draw randomly form the [A], then apply each lookup function and draw randomly from the result of each.

I don't understand why this returns a list of triples instead of a single triple. Your description below seems to imply the latter.

You should probably look at the "Gen" monad in Test.QuickCheck, which is basically a nice implementation of what you are doing with "State StdGen" below. Its "elements" function gets a single random element, and you can combine it with replicateM to get a list of defined length.

(BTW, are you sure want multiple random samples rather than a shuffle? A shuffle has each element exactly once whereas multiple random samples can pick any element an arbitrary number of times. I ask because shuffles are a more common requirement. For the code below I'll assume you meant what you said.)

Using Test.QuickCheck I think you want something like this (which I have not tested):

buildSample :: [A] -> Gen (A,B,C) buildSample xs = do x <- elements xs f1 <- elements $ f x g1 <- elements $ g x return

If you want n such samples then I would suggest

samples <- replicateM n $ buildSample xs

...
It's actually slightly more complicated than this, since for the real problem I start with type [[A]], and want to map buildSample over these, and sample from the results.

There seem to be so many ways to deal with random numbers in Haskell.

Indeed.

...
After some false starts, I ended up doing something like

sample :: [a] -> State StdGen [a] sample [] = return [] sample xs = do g <- get let (g', g'') = split g bds = (1, length xs) xArr = listArray bds xs put g'' return . map (xArr !) $ randomRs bds g'

Not bad, although you could instead have a sample function that returns a single element and then use replicateM to get a list.

...
buildSample xs = sample $ do x <- xs y <- f x z <- g x return (x,y,z)

This is really bad, since it builds a huge array of all the possibilities and then draws from that. Memory is way leaky right now. I'd like to be able to just have it apply the lookup functions as needed.

Also, I'm still using GHC 6.6, so I don't have Control.Monad.State.Strict. Not sure how much difference this makes, but I guess I could just copy the source for that module if I need to.

Strictness won't help. In fact you would be better with laziness if that were possible (which it isn't here). The entire array has to be constructed before you can look up any elements in it. That forces the entire computation. But compare your implementation of buildSample to mine.

Paul.

-- Chad Scherrer "Time flies like an arrow; fruit flies like a banana" -- Groucho Marx

Thomas Hartman

7:13 p.m.

...

...
I've seen PFP, but I don't see where that would help here. I'd still end up with an enormous list of tuples.

I'm not sure I understand what you need, but did you read the bits about replacing a "pure" state expansion (all the possibile states) with an approximation using random/io ? the approximation of course used much less resources (orders of orders of magnitude :) ) , the more time the random process had to evolve the better the approximation matched the "pure" calculation. very wonderful. thomas. "Chad Scherrer" 08/15/2007 03:05 PM To "Paul Johnson" , Thomas Hartman/ext/dbcom@DBAmericas cc haskell-cafe@haskell.org Subject Re: [Haskell-cafe] monte carlo trouble Thanks for your replies. I actually starting out returning a single element instead. But a given lookup might return [], and the only way I could think of to handle it in (State StdGen a) would be to fail in the monad. But that's not really the effect I want - I'd rather have it ignore that element. Another option was to wrap with Maybe, but then since I really want a sequence of them anyway, I decided to just wrap in a List instead. Is there a way Maybe would work out better? I've seen PFP, but I don't see where that would help here. I'd still end up with an enormous list of tuples. This could be generated lazily, but sampling with replacement (yes I want this, not a shuffle) would require forcing the whole list anyway, wouldn't it? Using my approach, even asking ghci for the length of the list ran for 30+ minutes. If there's a way to lazily sample with replacement from a list without even requiring the length of the list to be known in advance, that could lead to a solution. Thanks, Chad On 8/15/07, Paul Johnson wrote:

...

Chad Scherrer wrote:

...
There's a problem I've been struggling with for a long time...

I need to build a function buildSample :: [A] -> State StdGen [(A,B,C)]

given lookup functions f :: A -> [B] g :: A -> [C]

The idea is to first draw randomly form the [A], then apply each lookup function and draw randomly from the result of each.

I don't understand why this returns a list of triples instead of a single triple. Your description below seems to imply the latter.

You should probably look at the "Gen" monad in Test.QuickCheck, which is basically a nice implementation of what you are doing with "State StdGen" below. Its "elements" function gets a single random element, and you can combine it with replicateM to get a list of defined length.

(BTW, are you sure want multiple random samples rather than a shuffle? A shuffle has each element exactly once whereas multiple random samples can pick any element an arbitrary number of times. I ask because shuffles are a more common requirement. For the code below I'll assume you meant what you said.)

Using Test.QuickCheck I think you want something like this (which I have not tested):

buildSample :: [A] -> Gen (A,B,C) buildSample xs = do x <- elements xs f1 <- elements $ f x g1 <- elements $ g x return

If you want n such samples then I would suggest

samples <- replicateM n $ buildSample xs

...
It's actually slightly more complicated than this, since for the real problem I start with type [[A]], and want to map buildSample over these, and sample from the results.

There seem to be so many ways to deal with random numbers in Haskell.

Indeed.

...
After some false starts, I ended up doing something like

sample :: [a] -> State StdGen [a] sample [] = return [] sample xs = do g <- get let (g', g'') = split g bds = (1, length xs) xArr = listArray bds xs put g'' return . map (xArr !) $ randomRs bds g'

Not bad, although you could instead have a sample function that returns a single element and then use replicateM to get a list.

...
buildSample xs = sample $ do x <- xs y <- f x z <- g x return (x,y,z)

This is really bad, since it builds a huge array of all the possibilities and then draws from that. Memory is way leaky right now. I'd like to be able to just have it apply the lookup functions as needed.

Also, I'm still using GHC 6.6, so I don't have Control.Monad.State.Strict. Not sure how much difference this makes, but I guess I could just copy the source for that module if I need to.

Strictness won't help. In fact you would be better with laziness if that were possible (which it isn't here). The entire array has to be constructed before you can look up any elements in it. That forces the entire computation. But compare your implementation of buildSample to mine.

Paul.

-- Chad Scherrer "Time flies like an arrow; fruit flies like a banana" -- Groucho Marx --- This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Chaddaï Fouché

10:05 p.m.

2007/8/15, Chad Scherrer :

...

If there's a way to lazily sample with replacement from a list without even requiring the length of the list to be known in advance, that could lead to a solution.

I'm not sure what you mean by "with replacement" but I think it don't change the fundamental problem. There is in fact a way to get a random sample from a list without getting it's length first, but it still require to look at the whole list so it shouldn't change anything anyway... except if you're dealing with data on disk (which the time for length() suggests... are you swapping and did you try to push your data out of RAM ?) where reading is expensive. Here it is : getRandomElt :: (RandomGen t) => [a] -> t -> a getRandomElt (x:xs) g = aux 2 g xs x where aux _ _ [] y = y aux i g (x:xs) y = let (r,ng) = randomR (1, i) g in if r == 1 then aux (i+1) ng xs x else aux (i+1) ng xs y There is an equiprobability that each element of the list is chosen. -- Jedaï

Steve Schafer

16 Aug 16 Aug

1:31 a.m.

On Thu, 16 Aug 2007 00:05:14 +0200, you wrote:

...

I'm not sure what you mean by "with replacement"

"With replacement" means that you select a value from the source, but you don't actually remove it. That way, it's still available to be selected again later. Steve Schafer Fenestra Technologies Corp. http://www.fenestra.com/

6533

Age (days ago)

6534

Last active (days ago)

List overview

Download

6 comments

5 participants

participants (5)

Chad Scherrer
Chaddaï Fouché
Paul Johnson
Steve Schafer
Thomas Hartman