Oh, I forgot the technique of inlining the lazy bytestring chunks, and processing each chunk seperately.

$ time ./fast
4166680
./fast  1.25s user 0.07s system 99% cpu 1.325 total

Essentially inline Lazy.foldlChunks and specializes is (the inliner should really get that).
And now we have a nice unboxed inner loop, which llvm might spot:

$ ghc -O2 -funbox-strict-fields fast.hs  --make -fllvm 
$ time ./fast                                                          
4166680
./fast  1.07s user 0.06s system 98% cpu 1.146 total

So about 8x faster. Waiting for some non-lazy bytestring benchmarks... :)

{-# LANGUAGE BangPatterns #-}

import Data.ByteString.Internal
import Data.ByteString.Unsafe
import qualified Data.ByteString.Char8      as S
import qualified Data.ByteString.Lazy.Char8 as L
import qualified Data.ByteString.Lazy.Internal as L
import System.IO.Posix.MMap.Lazy

main = do
    f <- unsafeMMapFile "test.txt"
    print . new 0 $ L.toChunks f

new :: Int -> [ByteString] -> Int
new i [] = i
new i (x:xs) = new (add i x) xs

-- jump into the fast path
{-# INLINE add #-}
add :: Int -> ByteString -> Int
add !i !s | S.null s   = i
          | isSpace' x = add (i+1) xs
          | otherwise  = add i     xs
  where T x xs = uncons s

data T = T !Char ByteString

uncons s = T (w2c (unsafeHead s)) (unsafeTail s)

isSpace' c = c == '\n'    || c == ' '
{-# INLINE isSpace' #-}




On Tue, Mar 19, 2013 at 7:36 PM, Don Stewart <dons00@gmail.com> wrote:
Just for fun. Here's some improvements. about 6x faster. 
I'd be interested to see what io-streams could do on this.

Using a 250M test file.

-- strict state monad and bang patterns on the uncons and accumulator argument:

$ time ./A
4166680
./A  8.42s user 0.57s system 99% cpu 9.037 total

-- just write a loop

$ time ./A 
4166680
./A  3.84s user 0.26s system 99% cpu 4.121 total

-- switch to Int

$ time ./A 
4166680
./A  1.89s user 0.23s system 99% cpu 2.134 total

-- custom isSpace function

$ time ./A
4166680
./A  1.56s user 0.24s system 99% cpu 1.808 total

-- mmap IO

$ time ./A
4166680
./A  1.54s user 0.09s system 99% cpu 1.636 total

Here's the final program:


{-# LANGUAGE BangPatterns #-}

import qualified Data.ByteString        as S
import qualified Data.ByteString.Lazy.Char8 as L
import System.IO.Posix.MMap.Lazy

main = do
    f <- unsafeMMapFile "test.txt"
    print $ go 0 f
  where
    go :: Int -> L.ByteString -> Int
    go !a !s = case L.uncons s of
        Nothing     -> a
        Just (x,xs) | isSpaceChar8 x -> go (a+1) xs
                    | otherwise      -> go a     xs

isSpaceChar8 c = c == '\n'    || c == ' '
{-# INLINE isSpaceChar8 #-}


On Mon, Mar 18, 2013 at 8:53 AM, Konstantin Litvinenko <to.darkangel@gmail.com> wrote:
Hi All!

I tune my toy project for performance and hit the wall on simple, in imperative world, task. Here is the code that model what I'm trying to achieve

import qualified Data.ByteString.Lazy as L
import Data.Word8(isSpace)
import Data.Word
import Control.Monad.State

type Stream = State L.ByteString

get_byte :: Stream (Maybe Word8)
get_byte = do
    s <- get
    case L.uncons s of
        Nothing -> return Nothing
        Just (x, xs) -> put xs >> return (Just x)

main = do
    f <- L.readFile "test.txt"
    let r = evalState count_spaces f
    print r
  where
    count_spaces = go 0
      where
        go a = do
            x <- get_byte
            case x of
                Just x' ->  if isSpace x' then go (a + 1) else go a
                Nothing -> return a

It takes the file and count spaces, in imperative way, consuming bytes one by one. The problem is: How to rewrite this to get rid of constant allocation of state but still working with stream of bytes? I can rewrite this as one-liner L.foldl, but that doesn't help me in any way to optimize my toy project where all algorithms build upon consuming stream of bytes.

PS. My main lang is C++ over 10 years and I only learn Haskell :)


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe