Implementing Las Vegas algorithms in Haskell

newer
[Haskell] ANNOUNCE: ChristmasTree...

older
Optimizing Compiler for the ICFP...

Matthias Görgens

7 Jul 2009 7 Jul '09

1:08 a.m.

A Las Vegas algorithm, like randomized quicksort, uses a source of randomness to make certain decisions. However its output is unaffected by the randomness. So a function

...

f :: RandomGen g => g -> a -> b

implementing a Las-Vegas-Algorithm 'looks' like a pure function, ignoring its first argument and depending solely on its second argument. What is an idiomatic way to implement such a function? I believe, Monads are too linear and strict.

Show replies by date

Luke Palmer

7 Jul 7 Jul

2:04 a.m.

2009/7/6 Matthias Görgens

...

A Las Vegas algorithm, like randomized quicksort, uses a source of randomness to make certain decisions. However its output is unaffected by the randomness. So a function

...
f :: RandomGen g => g -> a -> b

implementing a Las-Vegas-Algorithm 'looks' like a pure function, ignoring its first argument and depending solely on its second argument. What is an idiomatic way to implement such a function? I believe, Monads are too linear and strict.

Interesting question! Well, you could make your own random generator for the lifetime of the function, with a fixed seed. I'd say this is the most "honest" way to do it; however, might a malicious user discover your seed, he could design an input that would make your algorithm perform poorly. I'm wary of saying you could use unsafePerformIO . randomRIO to get a seed. But I think some sort of unsafe something has to be involved, since you are representing a very advanced proof obligation (the algorithm is independent of the randomness). Keep us (me) posted on developments on this idea. Luke

Jason Dagit

4:03 a.m.

2009/7/6 Matthias Görgens

...

A Las Vegas algorithm, like randomized quicksort, uses a source of randomness to make certain decisions. However its output is unaffected by the randomness. So a function

...
f :: RandomGen g => g -> a -> b

implementing a Las-Vegas-Algorithm 'looks' like a pure function, ignoring its first argument and depending solely on its second argument. What is an idiomatic way to implement such a function? I believe, Monads are too linear and strict.

I believe this would be a good place to apply "implicit configurations". http://okmij.org/ftp/Haskell/types.html#Prepose Let me know if it solves your problem. What I recall of the paper is that it should work nicely for your situation. Jason

Antoine Latter

6:10 a.m.

2009/7/6 Matthias Görgens :

...

A Las Vegas algorithm, like randomized quicksort, uses a source of randomness to make certain decisions. However its output is unaffected by the randomness. So a function

...
f :: RandomGen g => g -> a -> b

implementing a Las-Vegas-Algorithm 'looks' like a pure function, ignoring its first argument and depending solely on its second argument. What is an idiomatic way to implement such a function? I believe, Monads are too linear and strict.

If I were using the function in an executable I were writing, I would probably do something like unsafePerformIO . randomIO. Or thread in the random Gen from main if it were convenient. If I were writing it as a library function, I would leave the function as you described and let the caller make the choice. Calling into randomIO in a library function is extremely dubious, as a second library could be getting and setting the random seed used by randomIO (see setStdGen). So I'm okay taking on that risk in an application I write, but I'm not okay shipping that risk in a re-usable library, with the risk hidden behind a type signature. Maybe I'm just paranoid. Antoine

Max Rabkin

9:06 a.m.

2009/7/7 Antoine Latter :

...

If I were writing it as a library function, I would leave the function as you described and let the caller make the choice. Calling into randomIO in a library function is extremely dubious, as a second library could be getting and setting the random seed used by randomIO (see setStdGen).

So I'm okay taking on that risk in an application I write, but I'm not okay shipping that risk in a re-usable library, with the risk hidden behind a type signature.

Maybe I'm just paranoid.

You're not paranoid if they're really out to get you. I have been in a similar situation (on the other side: I discovered the flaw, and demonstrated an attack). I was in a Python class where our final project was a fairly fully-featured battleships game (3d graphics, network play and a computer player). Part of the evaluation (and a very fun part of the project) was an AI tournament. The TA gave us the code for the tournament server beforehand to that we could test our programs, and I observed that each player was loaded as a module into the same Python process. Python has both a global RNG and encapsulated RNG objects, but since it's an imperative language it's natural to use the global one and most people did. So one could seed the RNG at the start of each game to ones advantage: I tried this afterwards and beat all my opponents convincingly (in the real contest, where I didn't cheat, I was roughly equal to the two other top players and came second).

...

Antoine

--Max

Hector Guilarte

6:51 a.m.

A few days ago I had to randomly choose and element of a list and continue execution, so here's what I did: I made a infinite list of Random numbers [Int] (Not IO [Int]) and I passed it around all the time in a Tuple and whenever I returned I also returned the list, so I would always have it available whenever I needed to use it. Whenever I used it I took it's head off and returned the tail, along with whatever else I was returning in a Tuple. The seed for the Random Generator was the CPUTime. Here's the code: (The function that returns the infinite list is infinito, the one on the bottom) import Random import CPUTime import System.IO.Unsafe {-| La funcion @rand@ Retorna una lista de numeros Random para ser utilizados en la seleccion de las guardias en los if y los do -} rand :: (RandomGen g, Random a) => (a,a) -> g -> [a] rand range gen = as where (a,b) = split gen -- create two separate generators as = randomRs range a -- one infinite list of randoms {-| @seed@ Retorna la semilla con la que la funcion Random va a iniciar la generacion de los numeros Randoms -} seed :: Int seed = fromInteger (unsafePerformIO getCPUTime) {-| @mygen@ Retorna el generador estandar con la semilla dada por la funcion @seed@ -} mygen = mkStdGen seed {-| @infinito@ Retorna la lista infinita de donde se van a sacar los numeros para elegir la guardia a ser ejecutada en los if y los do -} infinito:: (Num t,Random t) => [t] infinito = [ x | x <- rand (1,1000000) mygen] Hope it works for you! Hector Guilarte

Matthias Görgens

7:46 a.m.

Dear Hector, Yes, I thought of a similar scheme. Say we want to implemented randomized quicksort. Passing a list of random numbers would destroy laziness and linearise the algorithm --- because the right recursion branch would need to know at least how many random numbers where consumed by the left branch. So for the example of quicksort I thought of passing an infinite binary tree of random numbers.

...

data RandomTree v = Node (RandomTree v) v (RandomTree v) splitOnMedian :: Ord a => SomeRandomType -> [a] -> ([a],a,[a])

...

quicksort :: RandomTree (SomeRandomType) -> [a] -> [a] quicksort _ [] = [] quicksort _ [a] = [a] quicksort (Node left here right) s = let (l,median,r) = splitOnMedian here s in quicksort left l ++ median ++ quicksort right r

Of course one would need a special data structure for each recursion scheme with this approach. For a number of algorithms something like the rose trees of Data.Tree should work, though. What I wondered was, if one could hid the random plumbing in some data structure, like the state monad, but less linear. Matthias.

Ketil Malde

8:04 a.m.

Matthias Görgens writes:

...

Yes, I thought of a similar scheme. Say we want to implemented randomized quicksort. Passing a list of random numbers would destroy laziness and linearise the algorithm --- because the right recursion branch would need to know at least how many random numbers where consumed by the left branch.

Well, you could implement a function 'split' as split (x:xs) = (evens (x:xs), evens xs) where evens (y:_:ys) = y:evens ys This would divide your supply of random numbers in two - this is lazy, but forcing any of the sublists would force the spine of the original list, so not optimal. So the obvious followup is why not pass a randomGen around instead, which has a split operation already defined, and which causes no laziness headaches?

...

What I wondered was, if one could hid the random plumbing in some data structure, like the state monad, but less linear.

This problem cries for a State monad solution - but you don't need to do it yourself, there's already a Random monad defined for you. -k -- If I haven't seen further, it is by standing in the footprints of giants

Matthias Görgens

8:31 a.m.

...

...
What I wondered was, if one could hid the random plumbing in some data structure, like the state monad, but less linear.

This problem cries for a State monad solution - but you don't need to do it yourself, there's already a Random monad defined for you.

Yes, but I only need the random values inside splitOnMedia. The rest is just non-linear plumbing. We do not know beforehand how many random values a branch quicksort will consume --- neither do we ware about the state of the random generator at the end. Do you consider the RandomMonad the best fit? Matthias.

Luke Palmer

8:59 a.m.

2009/7/7 Matthias Görgens

...

...
...
What I wondered was, if one could hid the random plumbing in some data structure, like the state monad, but less linear.

This problem cries for a State monad solution - but you don't need to do it yourself, there's already a Random monad defined for you.

Yes, but I only need the random values inside splitOnMedia. The rest is just non-linear plumbing. We do not know beforehand how many random values a branch quicksort will consume --- neither do we ware about the state of the random generator at the end. Do you consider the RandomMonad the best fit?

Random monad is a very natural choice for "random cloud" computations. Don't think of it as a state monad -- that's an implementation detail. You can think of a value of type Random a as a "probability distribution of a's"; and there are natural definitions for the monad operators on this semantics. I blogged about this a while ago: http://lukepalmer.wordpress.com/2009/01/17/use-monadrandom/ Luke

Ketil Malde

9:47 a.m.

Luke Palmer writes:

...

Random monad is a very natural choice for "random cloud" computations. Don't think of it as a state monad -- that's an implementation detail. You can think of a value of type Random a as a "probability distribution of a's"; and there are natural definitions for the monad operators on this semantics.

I wonder if pure values and values in the Random monad as belonging to different complexity classes? It's been to long since theory of computation, but I think the relationship between BPP and the other classes is a bit unclear. (Of course, Random isn't limited to polynomial.)

...

I blogged about this a while ago: http://lukepalmer.wordpress.com/2009/01/17/use-monadrandom/

Nice. I've been busy writing a simulator, and thus needed to model some of the statistical distributions. The way I did this, was to say data Distribution = Uniform low high | Normal mu sigma | StudentT ... and sample :: Distribution -> Random Double sample (Normal mu sigma) = ... one reason to go beyond simply: normal :: Double -> Double -> Random Double -- normal mu sigma = sample (Normal mu sigma) is that you might want to do other things with a probability distribution than sampling it, like querying p-values for a given sample. I haven't gotten around to it yet, but I think I can get away by defining functions for the cumulative distribution and its inverse, and just write a generic function for 'sample' (which would need to pull a random probability (0<=p<=1). (But it might turn out it is useful to specialize it for simple distributions anyway?) -k -- If I haven't seen further, it is by standing in the footprints of giants

Ketil Malde

9:52 a.m.

Ketil Malde writes:

...

data Distribution = Uniform low high | Normal mu sigma | StudentT ...

Of course, now that it occurs to me to check this, I notice Data.Random.Distribution does the same thing, only more generally, supporting more distributions, and no doubt with more robust implementations. Used wheel for sale cheap, slightly less round than the competition. Oh well. -k -- If I haven't seen further, it is by standing in the footprints of giants

Ketil Malde

10:47 a.m.

Ketil Malde writes:

...

sample :: Distribution -> Random Double

Sorry, that's not entirely accurate. Rather: sample :: RandomGen g => Distribution -> Rand g Double -k -- If I haven't seen further, it is by standing in the footprints of giants

Heinrich Apfelmus

3:50 p.m.

Luke Palmer wrote:

...

...
...
...
What I wondered was, if one could hid the random plumbing in some data structure, like the state monad, but less linear. This problem cries for a State monad solution - but you don't need to do it yourself, there's already a Random monad defined for you.

Yes, but I only need the random values inside splitOnMedia. The rest is just non-linear plumbing. We do not know beforehand how many random values a branch quicksort will consume --- neither do we ware about the state of the random generator at the end. Do you consider the RandomMonad the best fit?

Random monad is a very natural choice for "random cloud" computations. Don't think of it as a state monad -- that's an implementation detail. You can think of a value of type Random a as a "probability distribution of a's"; and there are natural definitions for the monad operators on this semantics.

I blogged about this a while ago: http://lukepalmer.wordpress.com/2009/01/17/use-monadrandom/

I concur with Luke that interpreting Random a as a probability distribution of values of type a is the best mental model. The fact that it's a monad is secondary, if crucial for doing anything with these values. For a concrete example, http://apfelmus.nfshost.com/random-permutations.html might help. Regards, apfelmus -- http://apfelmus.nfshost.com

5839

Age (days ago)

5839

Last active (days ago)

List overview

Download

13 comments

8 participants

participants (8)

Antoine Latter
Hector Guilarte
Heinrich Apfelmus
Jason Dagit
Ketil Malde
Luke Palmer
Matthias Görgens
Max Rabkin