Re: [Haskell-cafe] Generate random UArray in constant memory space.

Am Tuesday 09 February 2010 19:19:18 schrieben Sie:
Daniel,
I've just run venum2 program locally and here is my results:
$ ./venum2 10000000 +RTS -s ./venum2 10000000 +RTS -s 50000005000000 22,736 bytes allocated in the heap 688 bytes copied during GC 17,184 bytes maximum residency (1 sample(s)) 19,680 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation)
$ ./venum2 1000000000 +RTS -s ./venum2 1000000000 +RTS -s 500000000500000000 24,152 bytes allocated in the heap 688 bytes copied during GC 17,184 bytes maximum residency (1 sample(s)) 19,680 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation)
So my PC shows that there is constant memory allocation. Maybe I'm doing something wrong ?
Unlikely. However, I notice that you seem to have a 64-bit system. I don't know the details, but usually those have more registers than a 32- bit system, so you can get more complicated loops to run completely in the registers. I think that's what happens here, enumFromToU' is just complicated enough to not run in the registers on my 32-bit system, but still runs in the registers on your 64-bit system. The PRNG code is too complicated (too many temporary variables) to run in the registers on either system (BTW, have you tried it with the specialised PRNG code in the source file? I'm not sure whether that might be just small enough to run in the registers on a 64-bit system.).
2010/2/9 Daniel Fischer
: Am Tuesday 09 February 2010 15:43:13 schrieben Sie:
Update:
I've implemented `enumFromToU` through `unfoldU`:
enumFromToU' from to = unfoldU (to - from) f from where f i = let i' = i + 1 in JustS (i' :*: i')
This code behaves similarly to `enumFromToU` (i.e. constantly uses ~25 kb of memory, runs in the same time as above).
Wait,
$ cat venum2.hs module Main (main) where
import Text.Printf import Control.Applicative import System.Environment import Data.Array.Vector
main = do [size] <- map read <$> getArgs let ints = enumFromToU' 0 size :: UArr Int printf "%d\n" (sumU ints)
enumFromToU' from to = unfoldU (to - from) f from where f i = let i' = i + 1 in JustS (i' :*: i')
$ ghc -O2 --make venum2 [1 of 1] Compiling Main ( venum2.hs, venum2.o ) Linking venum2 ... $ ./venum2 +RTS -sstderr -RTS 1000000 ./venum2 1000000 +RTS -sstderr 1784293664 48,256,384 bytes allocated in the heap 6,256 bytes copied during GC 26,804 bytes maximum residency (1 sample(s)) 25,524 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation)
$ ./venum2 +RTS -sstderr -RTS 10000000 ./venum2 10000000 +RTS -sstderr -2004260032 481,937,552 bytes allocated in the heap 19,516 bytes copied during GC 26,804 bytes maximum residency (1 sample(s)) 25,512 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation)
So we have constant memory, but linear allocation, just like with the random numbers.
With enumFromToU, also the allocation is constant, so unfoldU allocates, enumFromToU not.
So the question is why random number list generator eats O(n) memory ?
It doesn't, not here, at least. Using System.Random, the allocation figures are about ten times as high, but the residency is about the same. Putting the PRNG code in the same file and tweaking things a bit (eliminating all intermediate Integers, e.g.), I get
$ cat ran2Vec.hs {-# LANGUAGE BangPatterns #-} module Main (main) where
import Text.Printf import Control.Applicative import System.Environment import Data.Array.Vector
randomListU :: (Int, Int) -> StdGen -> Int -> (UArr Int) randomListU b@(l,h) g size = unfoldU size gen g where !k = h-l+1 !b' = 2147483561 `rem` k gen !g = let (!x, !g') = stdNext g in JustS ((l+ (x+b') `rem` k) :*: g')
main = do [size] <- map read <$> getArgs let ints = randomListU (-10, 10) (mkStdGen 1) size printf "%d\n" (sumU ints)
data StdGen = StdGen {-# UNPACK #-} !Int {-# UNPACK #-} !Int
mkStdGen :: Int -> StdGen mkStdGen s | s < 0 = mkStdGen (-s) | otherwise = StdGen (s1+1) (s2+1) where (q, s1) = s `divMod` 2147483562 s2 = q `mod` 2147483398
{-# INLINE stdNext #-} stdNext :: StdGen -> (Int, StdGen) -- Returns values in the range stdRange stdNext (StdGen s1 s2) = z' `seq` g' `seq` (z', g') where !g' = StdGen s1'' s2'' !z' = if z < 1 then z + 2147483562 else z !z = s1'' - s2''
!k = s1 `quot` 53668 !s1' = 40014 * (s1 - k * 53668) - k * 12211 !s1'' = if s1' < 0 then s1' + 2147483563 else s1'
!k' = s2 `quot` 52774 !s2' = 40692 * (s2 - k' * 52774) - k' * 3791 !s2'' = if s2' < 0 then s2' + 2147483399 else s2' $ ghc -O2 -funbox-strict-fields -funfolding-use-threshold=16 --make ran2Vec -o ran2AVec5 Linking ran2AVec5 ... $ ./ran2AVec5 +RTS -sstderr -RTS 10000000 ./ran2AVec5 10000000 +RTS -sstderr 5130 521,828,888 bytes allocated in the heap 8,664 bytes copied during GC 26,788 bytes maximum residency (1 sample(s)) 25,636 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation)
nearly the same allocation figures as for the *much* simpler enumFromToU', and it's about ten times as fast as System.Random.

Daniel,
Yes, I have 64 bit system.
Maybe you're right. The PRNG code with the same vector size allocates
two times more memory at my PC. (~ 1 Gb)
Thank you,
Vasyl
2010/2/9 Daniel Fischer
Am Tuesday 09 February 2010 19:19:18 schrieben Sie:
Daniel,
I've just run venum2 program locally and here is my results:
$ ./venum2 10000000 +RTS -s ./venum2 10000000 +RTS -s 50000005000000 22,736 bytes allocated in the heap 688 bytes copied during GC 17,184 bytes maximum residency (1 sample(s)) 19,680 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation)
$ ./venum2 1000000000 +RTS -s ./venum2 1000000000 +RTS -s 500000000500000000 24,152 bytes allocated in the heap 688 bytes copied during GC 17,184 bytes maximum residency (1 sample(s)) 19,680 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation)
So my PC shows that there is constant memory allocation. Maybe I'm doing something wrong ?
Unlikely. However, I notice that you seem to have a 64-bit system. I don't know the details, but usually those have more registers than a 32- bit system, so you can get more complicated loops to run completely in the registers. I think that's what happens here, enumFromToU' is just complicated enough to not run in the registers on my 32-bit system, but still runs in the registers on your 64-bit system.
The PRNG code is too complicated (too many temporary variables) to run in the registers on either system (BTW, have you tried it with the specialised PRNG code in the source file? I'm not sure whether that might be just small enough to run in the registers on a 64-bit system.).
2010/2/9 Daniel Fischer
: Am Tuesday 09 February 2010 15:43:13 schrieben Sie:
Update:
I've implemented `enumFromToU` through `unfoldU`:
enumFromToU' from to = unfoldU (to - from) f from where f i = let i' = i + 1 in JustS (i' :*: i')
This code behaves similarly to `enumFromToU` (i.e. constantly uses ~25 kb of memory, runs in the same time as above).
Wait,
$ cat venum2.hs module Main (main) where
import Text.Printf import Control.Applicative import System.Environment import Data.Array.Vector
main = do [size] <- map read <$> getArgs let ints = enumFromToU' 0 size :: UArr Int printf "%d\n" (sumU ints)
enumFromToU' from to = unfoldU (to - from) f from where f i = let i' = i + 1 in JustS (i' :*: i')
$ ghc -O2 --make venum2 [1 of 1] Compiling Main ( venum2.hs, venum2.o ) Linking venum2 ... $ ./venum2 +RTS -sstderr -RTS 1000000 ./venum2 1000000 +RTS -sstderr 1784293664 48,256,384 bytes allocated in the heap 6,256 bytes copied during GC 26,804 bytes maximum residency (1 sample(s)) 25,524 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation)
$ ./venum2 +RTS -sstderr -RTS 10000000 ./venum2 10000000 +RTS -sstderr -2004260032 481,937,552 bytes allocated in the heap 19,516 bytes copied during GC 26,804 bytes maximum residency (1 sample(s)) 25,512 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation)
So we have constant memory, but linear allocation, just like with the random numbers.
With enumFromToU, also the allocation is constant, so unfoldU allocates, enumFromToU not.
So the question is why random number list generator eats O(n) memory ?
It doesn't, not here, at least. Using System.Random, the allocation figures are about ten times as high, but the residency is about the same. Putting the PRNG code in the same file and tweaking things a bit (eliminating all intermediate Integers, e.g.), I get
$ cat ran2Vec.hs {-# LANGUAGE BangPatterns #-} module Main (main) where
import Text.Printf import Control.Applicative import System.Environment import Data.Array.Vector
randomListU :: (Int, Int) -> StdGen -> Int -> (UArr Int) randomListU b@(l,h) g size = unfoldU size gen g where !k = h-l+1 !b' = 2147483561 `rem` k gen !g = let (!x, !g') = stdNext g in JustS ((l+ (x+b') `rem` k) :*: g')
main = do [size] <- map read <$> getArgs let ints = randomListU (-10, 10) (mkStdGen 1) size printf "%d\n" (sumU ints)
data StdGen = StdGen {-# UNPACK #-} !Int {-# UNPACK #-} !Int
mkStdGen :: Int -> StdGen mkStdGen s | s < 0 = mkStdGen (-s) | otherwise = StdGen (s1+1) (s2+1) where (q, s1) = s `divMod` 2147483562 s2 = q `mod` 2147483398
{-# INLINE stdNext #-} stdNext :: StdGen -> (Int, StdGen) -- Returns values in the range stdRange stdNext (StdGen s1 s2) = z' `seq` g' `seq` (z', g') where !g' = StdGen s1'' s2'' !z' = if z < 1 then z + 2147483562 else z !z = s1'' - s2''
!k = s1 `quot` 53668 !s1' = 40014 * (s1 - k * 53668) - k * 12211 !s1'' = if s1' < 0 then s1' + 2147483563 else s1'
!k' = s2 `quot` 52774 !s2' = 40692 * (s2 - k' * 52774) - k' * 3791 !s2'' = if s2' < 0 then s2' + 2147483399 else s2' $ ghc -O2 -funbox-strict-fields -funfolding-use-threshold=16 --make ran2Vec -o ran2AVec5 Linking ran2AVec5 ... $ ./ran2AVec5 +RTS -sstderr -RTS 10000000 ./ran2AVec5 10000000 +RTS -sstderr 5130 521,828,888 bytes allocated in the heap 8,664 bytes copied during GC 26,788 bytes maximum residency (1 sample(s)) 25,636 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation)
nearly the same allocation figures as for the *much* simpler enumFromToU', and it's about ten times as fast as System.Random.
participants (2)
-
Daniel Fischer
-
Vasyl Pasternak