
Thank you ! I'm now in an equal time with the C code ;)
On Sun, Jun 19, 2016 at 3:12 PM, Bertram Felgenhauer via Haskell-Cafe
Guillaume Bouchard wrote:
- PRef (Unboxed bytearray of size 1, the closest thing to an unboxed stack allocation) : 86ms - URef (Unboxed vector of size 1) : 130ms - SRef (Storable vector of size 1) : 43ms (This is an improvement !) - BRef (Boxed ref) : 137ms - STRef : 54ms (this was my baseline)
You really shouldn't use any mutable variable at all for this, but pass the values around as function arguments instead:
Nice ! This is so obvious that I did not thought about it. Thank you. (Thank you for the followup-mail, it is clear that it may be easier for GHC to unbox / put in register "normal variable" than more complex type such as the one involved in PRef).
Besides, the code spends most of its time on parsing the input. The following more low-level code does the job far more quickly:
Impressive. On the profiling the parsing was accounting only for 10% of the running time. Perhaps the profiler introduces more overhead on the vector part than on the parsing.
import Data.ByteString.Char8 as B
parse' :: IO [Int32] parse' = do content <- B.getContents return $ map fromIntegral $ unfoldr (B.readInt . B.dropWhile (=='\n')) $ content
This is in part with the C code. ~16ms
It's possible to improve this slightly by implementing the code from scratch:
parse'' :: IO [Int32] parse'' = do content <- B.getContents return $ go content where go b = case B.uncons b of Nothing -> [] Just ('\n',b) -> go b Just ('-',b) -> go'' 0 b Just (d,b) -> go' (fromIntegral (ord d - 48)) b go' v b = case B.uncons b of Nothing -> [v] Just ('\n',b) -> v : go b Just (d,b) -> go' (v*10 + fromIntegral (ord d - 48)) b go'' v b = case B.uncons b of Nothing -> [v] Just ('\n',b) -> v : go b Just (d,b) -> go' (v*10 - fromIntegral (ord d - 48)) b
Taken together these changes improve the runtime from 79ms to 21ms here.
This is better than the C code ;) Thank you. I learned a few thing today. -- Guillaume