
Am Mittwoch, 4. März 2009 01:44 schrieb Tobias Olausson:
Hello all. I am currently implementing an emulation of a CPU, in which the CPU's RAM is part of the internal state that is passed around in the program using a state monad. However, the program performs unexpectingly bad, and some profiling information makes us believe that the problem is the high memory usage of the program.
The program below is similar to our main program used when testing a sorting algorithm in this CPU:
module Main where
import Control.Monad.State.Lazy
Not good, use Control.Monad.State.Strict
import Data.Word import Data.Array.Diff import Control.Concurrent (threadDelay)
data LoopState = LoopState { intVal :: Integer , diff :: DiffUArray Word8 Word8
Diff(U)Arrays tend to be slow, use them with care.
}
initState :: LoopState initState = LoopState 0 (array (0x00,0xFF) [(idx,0)|idx<-[0x00..0xFF]])
main :: IO () main = do execStateT looper initState >>= putStrLn . show . intVal
looper :: StateT LoopState IO () looper = do st <- get let res = intVal st + 1 idx = fromIntegral res put $ st { intVal = res, diff = (diff st) // [(idx,idx)] } if res == 13000000 then return () else looper
You're being too lazy, building a huge thunk that only gets evaluated at the end of the loop. You have to force evaluation earlier.
Of course our program does more than updating a counter ;-) Compiling and running this program yields the following result:
[~]:[olaussot] >> ghc --make -O2 -o array ArrayPlay.hs [~]:[olaussot] >> ./array +RTS -sstderr ./array +RTS -sstderr 13000000 313,219,740 bytes allocated in the heap 1,009,986,984 bytes copied during GC 200,014,828 bytes maximum residency (8 sample(s)) 4,946,648 bytes maximum slop 393 MB total memory in use (3 MB lost due to fragmentation)
Generation 0: 590 collections, 0 parallel, 3.06s, 3.09s elapsed Generation 1: 8 collections, 0 parallel, 3.56s, 4.21s elapsed
INIT time 0.00s ( 0.00s elapsed) MUT time 0.27s ( 0.27s elapsed) GC time 6.62s ( 7.30s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 6.89s ( 7.57s elapsed)
%GC time 96.1% (96.4% elapsed)
Alloc rate 1,155,958,754 bytes per MUT second
Productivity 3.9% of total user, 3.6% of total elapsed
Why does the program spend 96.1% of its total running time collecting garbage? Any tips to make this program perform better are appreciated. Please do tell if anything is unclear.
Nothing gets evaluated until the end, so nothing can be discarded earlier. ---------------------------------------------------------------------- {-# LANGUAGE BangPatterns #-} module Main where import Control.Monad.State.Strict import Data.Word import Data.Array.Unboxed import Data.Array.ST import Data.Array.MArray update :: UArray Word8 Word8 -> Word8 -> Word8 -> UArray Word8 Word8 update arr i v = runSTUArray $ do sar <- unsafeThaw arr writeArray sar i v return sar data LoopState = LoopState { intVal :: !Integer , diff :: !(UArray Word8 Word8) } initState :: LoopState initState = LoopState 0 (array (0x00,0xFF) [(idx,0)|idx<-[0x00 .. 0xFF]]) main :: IO () main = do execStateT looper initState >>= putStrLn . show . intVal looper :: StateT LoopState IO () looper = do LoopState i df <- get let res = i + 1 idx = fromIntegral res !ndf = update df idx idx put (LoopState res ndf) if res == 13000000 then return () else looper ---------------------------------------------------------------------- Is much better behaved. I didn't investigate if every strictness annotation is necessary.
-- Tobias Olausson tobsan@gmail.com
Cheers, Daniel