Bug in random shuffle algorithm

https://wiki.haskell.org/Random_shuffle says: Here's a variation using the MonadRandom package: import Control.Monad import Control.Monad.ST import Control.Monad.Random import System.Random import Data.Array.ST import GHC.Arr shuffle :: RandomGen g => [a] -> Rand g [a] shuffle xs = do let l = length xs rands <- take l `fmap` getRandomRs (0, l-1) let ar = runSTArray $ do ar <- thawSTArray $ listArray (0, l-1) xs forM_ (zip [0..(l-1)] rands) $ \(i, j) -> do vi <- readSTArray ar i vj <- readSTArray ar j writeSTArray ar j vi writeSTArray ar i vj return ar return (elems ar) But this doesn't yield a uniformly-chosen random permutation of its input, because at the i'th step the first `i` elements are not fixed as they are in the other algorithms on that page. Cheers, David

On May 8, 2017, at 12:02 PM, David Turner
wrote: But this doesn't yield a uniformly-chosen random permutation of its input, because at the i'th step the first `i` elements are not fixed as they are in the other algorithms on that page.
Channeling Samuel Dukhovni, who provided a patch for the same bug in older versions of Postfix a few years back, his suggested patch is: shuffle :: RandomGen g => [a] -> Rand g [a] shuffle xs = do let l = length xs - rands <- take l `fmap` getRandomRs (0, l-1) + rands <- forM [0..(l-1)] (\i -> getRandomR (i, l-1)) let ar = runSTArray $ do ar <- thawSTArray $ listArray (0, l-1) xs forM_ (zip [0..(l-1)] rands) $ \(i, j) -> do vi <- readSTArray ar i vj <- readSTArray ar j writeSTArray ar j vi writeSTArray ar i vj return ar return (elems ar) Perhaps someone can forward this along to the package maintainer. Also as an optimization, the `rands` list should be made one element shorter, and likewise the `zip [0..(l-1)]` should become `zip [0..(l-2)]`. The reason is of course that by the time we only have one final element left there's nothing left to permute. -- Viktor. P.S. The corresponding new Postfix code can be found at https://github.com/vdukhovni/postfix/blob/master/postfix/src/dns/dns_rr.c#L3... /* * Shuffle resource records. Every element has an equal chance of landing * in slot 0. After that every remaining element has an equal chance of * landing in slot 1, ... This is exactly n! states for n! permutations. */ for (i = 0; i < len - 1; i++) { r = i + (myrand() % (len - i)); /* Victor&Son */ rr = rr_array[i]; rr_array[i] = rr_array[r]; rr_array[r] = rr; }

Hello, Thank you David for reporting this and thank you Viktor for the fix. I edited the wiki. Li-yao On 05/09/2017 01:46 AM, Viktor Dukhovni wrote:
On May 8, 2017, at 12:02 PM, David Turner
wrote: But this doesn't yield a uniformly-chosen random permutation of its input, because at the i'th step the first `i` elements are not fixed as they are in the other algorithms on that page. Channeling Samuel Dukhovni, who provided a patch for the same bug in older versions of Postfix a few years back, his suggested patch is:
shuffle :: RandomGen g => [a] -> Rand g [a] shuffle xs = do let l = length xs - rands <- take l `fmap` getRandomRs (0, l-1) + rands <- forM [0..(l-1)] (\i -> getRandomR (i, l-1)) let ar = runSTArray $ do ar <- thawSTArray $ listArray (0, l-1) xs forM_ (zip [0..(l-1)] rands) $ \(i, j) -> do vi <- readSTArray ar i vj <- readSTArray ar j writeSTArray ar j vi writeSTArray ar i vj return ar return (elems ar)
Perhaps someone can forward this along to the package maintainer.
Also as an optimization, the `rands` list should be made one element shorter, and likewise the `zip [0..(l-1)]` should become `zip [0..(l-2)]`. The reason is of course that by the time we only have one final element left there's nothing left to permute.
participants (3)
-
David Turner
-
Li-yao Xia
-
Viktor Dukhovni