Re: [Haskell-cafe] random shuffle and random list partition

17 Mar 2009

      Yitzchak Gale ha scritto:
...
[...]
While I think Oleg's tree method is beautiful, in practice
it may be re-inventing the wheel. I haven't tested it, but
I doubt that this implementation is much better than
using the classical shuffle algorithm on an IntMap.
Do you have a working implementation?
...
It's essentially the same tree inside. That's what I
usually use for this, and it works fine.
Oleg implementation is rather efficient, but it requires a lot of memory 
for huge lists.

Here, as an example, two programs, one in Python and one in Haskell.
The default Python generator in Python use the Mersenne Twister, but 
returning floats number in the range [0, 1].

# Python version
from random import shuffle

n = 10000000
m = 10
l = range(1, n + 1)

shuffle(l)
print l[:m]

-- Haskell version
module Main where

import Random.Shuffle
import System.Random.Mersenne.Pure64 (newPureMT)

n = 10000000
m = 10
l = [1 .. n]

main = do
   gen <- newPureMT
   print $ take m $ shuffle' l n gen

The Python version performances are:

real	0m16.812s
user	0m16.469s
sys	0m0.280s

150 MB memory usage

The Haskell version performances are:

real	0m8.757s
user	0m7.920s
sys	0m0.792s

800 MB memory usage
...
...
In future I can add an implementation of the random
shuffle algorithm on mutable arrays in the ST monad.
I've tried that in the past. Surprisingly, it wasn't faster
than using trees. Perhaps I did something wrong. Or
perhaps the difference only becomes apparent for
huge lists.
Can you try it on the list I have posted above?
...
As you point out, your partition algorithm is not fair.
Using your Random.Shuffle and a well-know trick
from combinatorics, you can easily get a fair
partitions function:
http://hpaste.org/fastcgi/hpaste.fcgi/view?id=2485#a2495
Thanks, this is very nice.
I have to run some benchmarks to see if it is efficient.

Regards  Manlio

Re: [Haskell-cafe] random shuffle and random list partition

Manlio Perillo