
Alexey Khudyakov wrote:
Hello!
I'm tryig to write efficient code for creating histograms. I have following requirements for it:
1. O(1) element insertion 2. No reallocations. Thus in place updates are needed.
accumArray won't go because I need to fill a lot of histograms (hundrends) from vely long list of data (possibly millions of elements) and it will traverse input data for each histogram.
Sorry, Duncan is right. I misread here. My first idea would still be to use accumArray though, or rather, accum, processing the input data in chunks of an appropriate size (which depends on the histogram sizes.) But actually, the ST code isn't bad (I hope this isn't homework): ------------------------------------------------------------------------ import Control.Monad.ST import Control.Monad import Data.Array.ST import Data.Array.Unboxed stuArray :: ST s (STUArray s i e) -> ST s (STUArray s i e) stuArray = id -- Create histograms. -- -- Each histogram is described by a pair (f, (l, u)), where 'f' maps -- a data entry to an Int index, and l and u are lower and upper bounds -- of the indices, respectively. -- mkHistograms :: [(a -> Int, (Int, Int))] -> [a] -> [UArray Int Int] mkHistograms hs ds = runST collect where -- Why is the type signature on 'collect' required here? collect :: ST s [UArray Int Int] collect = do -- create histogram arrays of appropriate sizes histograms <- forM hs $ \(_, range) -> do stuArray $ newArray range 0 -- iterate through the data forM_ ds $ \d -> do -- iterate through the histograms forM_ (zip hs histograms) $ \((f, _), h) -> do -- update appropriate entry writeArray h (f d) . succ =<< readArray h (f d) -- finally, freeze the histograms and return them -- (using unsafeFreeze is ok because we're done modifying the -- arrays) mapM unsafeFreeze histograms test = mkHistograms [((`mod` 3), (0,2)), ((`mod` 5), (0,4))] [0..10] -- test returns -- [array (0,2) [(0,4),(1,4),(2,3)], -- array (0,4) [(0,3),(1,2),(2,2),(3,2),(4,2)]] ------------------------------------------------------------------------ Bertram P.S. Ryan is right, too - I'm not sure where I got confused there. runST $ foo didn't work in ghc 6.6; I knew that it works in ghc 6.8.3, but I thought this was changed again.