Re: [GHC] #7561: Unnecessary Heap Allocations - Slow Performance

#7561: Unnecessary Heap Allocations - Slow Performance -------------------------------+-------------------------------------------- Reporter: wurmli | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.8.1 Component: Compiler | Version: 7.6.1 Keywords: | Os: Linux Architecture: x86_64 (amd64) | Failure: Runtime performance bug Difficulty: Unknown | Testcase: Blockedby: | Blocking: Related: | -------------------------------+-------------------------------------------- Comment(by wurmli): Let me add a similar case, exemplifying better why I think that not the vector library, but the way the optimiser deals with reductions seems to be the culprit. (That's why I had put "unnecessary heap allocations in the title; #1498 might be a related ticket.) {{{ import System.Environment import Control.Applicative import Control.Monad import Control.Monad.ST import qualified Data.Vector.Storable.Mutable as VSM import qualified Data.Vector.Storable as G ------------------------------------------------------------------- -- Increment counter by 1 acc1 v = do{ k<-VSM.read v 0; VSM.write v 0 (k+1) } -- Increment counter 4 times by 1 until limit is reached -- i.e. each loop adds 4 whileModify1 :: G.MVector s Int -> ST s () whileModify1 v = do go <- do{ k<-VSM.read v 0; n<-VSM.read v 1; return (compare k n)} case go of LT -> do {acc1 v; acc1 v; acc1 v; acc1 v; whileModify1 v} EQ -> return () GT -> return () ------------------------------------------------------------------- -- Increment counter by 2 acc2 v = do{ k<-VSM.read v 0; VSM.write v 0 (k+1); k<-VSM.read v 0; VSM.write v 0 (k+1); } -- Increment counter twice by 2 until limit is reached -- i.e. each loop adds 4 whileModify2 :: G.MVector s Int -> ST s () whileModify2 v = do go <- do{ k<-VSM.read v 0; n<-VSM.read v 1; return (compare k n)} case go of LT -> do {acc2 v; acc2 v; whileModify2 v} EQ -> return () GT -> return () -------------------------------------------------------------------- -- Case 1 is fast with low use of heap space -- Case 2 is slow with high use of heap (about 400 times slower, 300 times more heap) testAcc :: Int -> Int -> G.Vector Int testAcc a n = runST $ do v <- (G.thaw $ G.fromList [0,n]) :: ST s (G.MVector s Int) case a of 1 -> whileModify1 v 2 -> whileModify2 v G.freeze v main = do let k = 50000000 n <- read.head <$> getArgs putStrLn $ show $ testAcc n k }}} The llvm backend improves timing about 4 fold {{{ ghc -threaded -rtsopts -O2 -fllvm heapAndSpeed.hs }}} Even though the two ways of increasing the counter are almost the same, the resulting code behaves vastly different, not only in speed, but also in heap usage (and garbage collector activity). {{{ ./heapAndSpeed 1 +RTS -s fromList [50000000,50000000] 74,088 bytes allocated in the heap 6,296 bytes copied during GC 47,184 bytes maximum residency (1 sample(s)) 18,352 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation) ... INIT time 0.00s ( 0.00s elapsed) MUT time 0.02s ( 0.02s elapsed) GC time 0.00s ( 0.00s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 0.02s ( 0.02s elapsed) }}} {{{ ./heapAndSpeed 2 +RTS -s fromList [50000000,50000000] 25,000,074,088 bytes allocated in the heap 7,048,952 bytes copied during GC 47,184 bytes maximum residency (2 sample(s)) 22,448 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation) ... INIT time 0.00s ( 0.00s elapsed) MUT time 8.06s ( 8.07s elapsed) GC time 0.17s ( 0.17s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 8.23s ( 8.24s elapsed) }}} -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7561#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC