Strange random choice algorithm

I'm not sure where I got this PICK function from, and don't understand why it's written as it is, so I wanted to test it for randomness. It seems random enough. But if I understand the algorithm correctly, instead of selecting one of the elements from the list, it eliminates all the elements but one and that's the value it returns. Seems like a roundabout way of doing it. Comments? Also, is there a more direct way of printing an array? Output below. Michael ================= import System.Random import Data.Array.IO pick :: [a] -> IO a pick [] = undefined pick [x] = do return x pick (x:xs) = pick' x xs (2 :: Int) pick' :: (Num p, Random p) => t -> [t] -> p -> IO t pick' curr [] _ = do return curr pick' curr (next:rest) prob = do r <- getStdRandom (randomR (1,prob)) let curr' = if r == 1 then next else curr pick' curr' rest (prob+1) main = do arr <- newArray (1,9) 0 :: IO (IOArray Int Int) doLoop arr [1,2,3,4,5,6,7,8,9] 0 doLoop arr z k = do p <- pick z a <- readArray arr p writeArray arr p (a+1) if k > 10000 then do v <- readArray arr 1 print v v <- readArray arr 2 print v v <- readArray arr 3 print v v <- readArray arr 4 print v v <- readArray arr 5 print v v <- readArray arr 6 print v v <- readArray arr 7 print v v <- readArray arr 8 print v v <- readArray arr 9 print v else do doLoop arr z (k+1) =============== [michael@localhost ~]$ runhaskell array1.hs 1110 1117 1080 1169 1112 1119 1137 1084 1074 [michael@localhost ~]$

Am Samstag 30 Januar 2010 20:59:08 schrieb michael rice:
I'm not sure where I got this PICK function from, and don't understand why it's written as it is, so I wanted to test it for randomness. It seems random enough. But if I understand the algorithm correctly, instead of selecting one of the elements from the list, it eliminates all the elements but one and that's the value it returns.
Yep.
Seems like a roundabout way of doing it. Comments?
Indeed.
Also, is there a more direct way of printing an array?
Sure, printing immutable arrays: print arr ~> array (lo,hi) [(lo,arr!lo), ... , (hi,arr!hi)] print (assocs arr) ~> [(lo,arr!lo), ... , (hi,arr!hi)] print (elems arr) ~> [(arr!lo), ... , (arr!hi)] printing IO[U]Arrays: do immArr <- freeze arr print (immArr :: [U]Array ix el) do ass <- getAssocs arr print ass (getAssocs arr >>= print) do els <- getElems arr print els (getElems arr >>= print) or, to get output like below: getElems arr >>= mapM_ print Printing ST[U]Arrays would need an unsafeIOToST, but reasonably, you wouldn't want to print them before you've left ST.
Output below.
Michael
=================
import System.Random import Data.Array.IO
pick :: [a] -> IO a pick [] = undefined pick [x] = do return x pick (x:xs) = pick' x xs (2 :: Int)
pick' :: (Num p, Random p) => t -> [t] -> p -> IO t pick' curr [] _ = do return curr pick' curr (next:rest) prob = do r <- getStdRandom (randomR (1,prob)) let curr' = if r == 1 then next else curr pick' curr' rest (prob+1)
main = do arr <- newArray (1,9) 0 :: IO (IOArray Int Int) doLoop arr [1,2,3,4,5,6,7,8,9] 0 doLoop arr z k = do p <- pick z a <- readArray arr p writeArray arr p (a+1) if k > 10000 then do v <- readArray arr 1 print v v <- readArray arr 2 print v v <- readArray arr 3 print v v <- readArray arr 4 print v v <- readArray arr 5 print v v <- readArray arr 6 print v v <- readArray arr 7 print v v <- readArray arr 8 print v v <- readArray arr 9 print v else do doLoop arr z (k+1)
===============
[michael@localhost ~]$ runhaskell array1.hs 1110 1117 1080 1169 1112 1119 1137 1084 1074 [michael@localhost ~]$

On Sat, Jan 30, 2010 at 9:38 PM, Daniel Fischer
Also, is there a more direct way of printing an array?
Sure,
printing immutable arrays:
print arr ~> array (lo,hi) [(lo,arr!lo), ... , (hi,arr!hi)] print (assocs arr) ~> [(lo,arr!lo), ... , (hi,arr!hi)] print (elems arr) ~> [(arr!lo), ... , (arr!hi)]
Those are all fine.
printing IO[U]Arrays:
do immArr <- freeze arr print (immArr :: [U]Array ix el)
do ass <- getAssocs arr print ass
(getAssocs arr >>= print)
do els <- getElems arr print els
On the other hand, all those suggestions have a severe efficiency problem due to the IO Monad strictness, for instance (getAssocs arr
= print) will start by creating the whole list of associations before printing any of it.
More efficient and still better than the initial code would be : mapM_ (readArray arr >=> print) [1..9] -- Jedaï

On 30 Jan 2010, at 20:59, michael rice wrote:
I'm not sure where I got this PICK function from, and don't understand why it's written as it is, so I wanted to test it for randomness. It seems random enough. But if I understand the algorithm correctly, instead of selecting one of the elements from the list, it eliminates all the elements but one and that's the value it returns. Seems like a roundabout way of doing it. Comments?
Below is a function draw() that shuffles an interval, and prints it out. Hans ---- import Random getRandomIndex :: [a] -> IO(Int) getRandomIndex ls = getStdRandom(randomR(0, (length ls) - 1)) remove :: Int -> [a] -> [a] remove 0 (x:xs) = xs remove n (x:xs) | n>0 = x: remove (n-1) xs remove _ (_:_) = error "remove: negative argument" remove _ [] = error "remove: too large argument" shuffle :: [a] -> IO [a] shuffle [] = return [] shuffle ls = do i <- getRandomIndex ls do l <- shuffle (remove i ls) return ((ls !! i) : l) draw ls = do k <- shuffle ls putStr (show k) ----

On Jan 30, 2010, at 8:59 PM, michael rice wrote:
I'm not sure where I got this PICK function from, and don't understand why it's written as it is, so I wanted to test it for randomness. It seems random enough.
We can convince ourselves using reason instead of tests. An element is selected by the algorithm if it is picked and no later element is picked afterwards. It doesn't matter which elements are picked before. The n-th element of the given list replaces the current selection with probability (1/n). Hence, the probability that the n-th element is selected in the end is (1/n)*(1-1/(n+1))*...*(1-1/m) if there are m elements. For example, if there are 9 elements, the probability of selecting the 7th is 1/7 * 7/8 * 8/9 which is 1/9 because the 7 and 8 are canceled out. This happens for all elements and, thus, every element is selected with probability 1/9. Anyway, pick xs = do n <- randomRIO (1,length xs) return (xs!!(n-1)) would have been clearer.. It queries the random number generator only once but walks through the list twice. Sebastian -- Underestimating the novelty of the future is a time-honored tradition. (D.G.)

Hi all,
Very nice analysis, Sebastian.
Thanks,
Michael
--- On Sat, 1/30/10, Sebastian Fischer
I'm not sure where I got this PICK function from, and don't understand why it's written as it is, so I wanted to test it for randomness. It seems random enough.
We can convince ourselves using reason instead of tests. An element is selected by the algorithm if it is picked and no later element is picked afterwards. It doesn't matter which elements are picked before. The n-th element of the given list replaces the current selection with probability (1/n). Hence, the probability that the n-th element is selected in the end is (1/n)*(1-1/(n+1))*...*(1-1/m) if there are m elements. For example, if there are 9 elements, the probability of selecting the 7th is 1/7 * 7/8 * 8/9 which is 1/9 because the 7 and 8 are canceled out. This happens for all elements and, thus, every element is selected with probability 1/9. Anyway, pick xs = do n <- randomRIO (1,length xs) return (xs!!(n-1)) would have been clearer.. It queries the random number generator only once but walks through the list twice. Sebastian --Underestimating the novelty of the future is a time-honored tradition. (D.G.) _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

On Jan 31, 2010, at 12:06 AM, Sebastian Fischer wrote:
pick xs = do n <- randomRIO (1,length xs) return (xs!!(n-1))
would have been clearer.. It queries the random number generator only once but walks through the list twice.
Walking through the list twice leads to linear memory requirements because the list cannot be garbage collected during the evaluation of length. If you intend to use this function with very long lists, the original algorithm is preferable. Here is a rewriting: import System.Random (randomRIO) import Control.Monad (foldM) pick (x:xs) = foldM swap x $ zip xs [2..] where swap y (z,n) = do m <- randomRIO (1::Integer,n) if m==1 then return z else return y This version of `pick` runs in constant space. Sebastian -- Underestimating the novelty of the future is a time-honored tradition. (D.G.)
participants (5)
-
Chaddaï Fouché
-
Daniel Fischer
-
Hans Aberg
-
michael rice
-
Sebastian Fischer