
In part to help solidify my own understanding and usage, I wrote up the following comparison of processing an input file. Andrew Coppin originally posed the issue concerning strictness imposed by using the ST monad for processing an input file. This literate example compares processing a file using:

1. the ST monad
2. the ST monad with Luke Palmer's suggested laziness
3. the State monad
4. a direct Iteratee (from John Millikin's Enumerator package)
5. the same Iteratee in Monad form
6. another slight variation of the Iteratee in Monad form

First, let's get the basics taken care of:

> import System.IO
> import System.Environment
> import Data.Word
> import Data.Bits
> import qualified Data.ByteString as B
> import Control.Applicative ( (<$>) )
> import Control.Monad.Trans.Class (lift)
> import Control.Monad.IO.Class
> import Control.Monad.ST.Lazy
> import Data.STRef.Lazy
> import Control.Monad.Trans.State.Lazy
> import qualified Data.Enumerator as E
> import Data.Enumerator ( ($$) )
> import qualified Data.Enumerator.Binary as EB
> import qualified Data.Enumerator.List as EL

This example is intended to show the effects of lazy or strict processing of a file, so an input file is needed.

> inp = "input.example"

This input file can contain whatever you'd like, but for my testing I simply created a 5MB file of zeros via:

  $ dd if=/dev/zero of=input.example count=10000
  $ ls -sh input.example
  4.9M input.example

The output file will use the following base name, with the number of the processing mode appended.

> oup = "output.example"

The stats output of ghc will be used to compare the different processing modes, so only one processing mode is performed each time the application is run. The desired mode is given as a command-line parameter, defaulting to the first mode.

> main = do
>   tna <- getArgs
>   let tn = read $ head $ tna ++ ["1"]
>   case tn of
>     6 -> testE 6 transform6
>     5 -> testE 5 transform5
>     4 -> testE 4 transform4
>     3 -> testT 3 transform3
>     2 -> testT 2 transform2
>     _ -> testT 1 transform1

To build and run this example (assuming this literate source is saved as fproc.lhs):

  $ ghc -o fproc --make fproc.lhs && for N in $(seq 1 6) ; do time ./fproc $N +RTS -t -RTS ; done

That's all the basic setup out of the way. The actual processing of the file is irrelevant other than needing to remember previous input to process the current input. In my example each byte is usually combined with the previous byte to determine the output byte. In the ST and State monad forms, the previous byte value is stored in the state portion of the monad. The ST form is my interpretation of Andrew's original intent.

> transform1 xs = runST (newSTRef 0 >>= work xs)
>   where work [] _ = return []
>         work (e:es) s = do
>           n <- readSTRef s
>           writeSTRef s $ shiftR e 4
>           let r = if e < 32 then e else n+e
>           (r :) <$> work es s

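For reference, the byte rule that all six variants implement can also be written as a plain, monad-free recursion. This is only an illustrative sketch of what the threaded state carries; the name transformPure is mine and isn't used by any of the benchmarked modes:

> -- Illustrative sketch only; not one of the benchmarked modes.
> transformPure :: [Word8] -> [Word8]
> transformPure = go 0
>   where go _    []     = []
>         go prev (e:es) = (if e < 32 then e else prev + e) : go (shiftR e 4) es
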
To run this with standardized file processing, ByteString -> Word8 conversion, and output, main uses the testT wrapper. Hopefully all the pack and unpack operations are fusing and I haven't skewed the results by introducing strictness at this level.

> testT n t = let oun = oup ++ show n
>                 op  = B.pack . t . B.unpack
>             in print n >> op <$> B.readFile inp >>= B.writeFile oun

My output from this is:
./fproc 1 +RTS -t
1
<

> transform2 xs = runST (newSTRef 0 >>= work xs)
>   where work [] _ = return []
>         work (e:es) s = do
>           n <- readSTRef s
>           -- note the strict ($!) application when updating the STRef
>           writeSTRef s $! shiftR e 4
>           let r = if e < 32 then e else n+e
>           fmap (r :) $ work es s

This yields nearly identical results (actually slightly worse, but that may be within the measuring variance):
./fproc 2 +RTS -t
2
<
The State monad version keeps the previous byte in the monad's state rather than in an STRef:

> transform3 xs = evalState (work xs) 0
>   where work [] = return []
>         work (e:es) = do
>           n <- get
>           put $ shiftR e 4
>           let r = if e < 32 then e else n+e
>           (r :) <$> work es

./fproc 3 +RTS -t
3
<

> transform4 :: MonadIO m => Handle -> E.Iteratee B.ByteString m ()
> transform4 h = E.continue $ work 0
>   where work n E.EOF = E.yield () E.EOF
>         work n (E.Chunks []) = E.continue $ work n
>         work n (E.Chunks (e:es)) =
>           let op a b = (shiftR b 4, if b < 32 then b else a+b)
>               (m, r) = B.mapAccumL op n e
>           in do
>             liftIO $ B.hPut h r
>             work m $ E.Chunks es

The state element is now simply the first argument to the recursive inner work function. Each chunk arrives as a ByteString, so I use mapAccumL to process each ByteString as it's provided (a small worked example of this follows testE below). The testE wrapper is the enumerator equivalent of the testT wrapper used with the ST and State monads.

> testE n t = do
>   print n
>   h <- openFile (oup ++ show n) WriteMode
>   E.run_ (EB.enumFile inp $$ t h)

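To make the accumulator threading concrete, here is a tiny worked example of mapAccumL using the same op; the name mapAccumLExample and the sample bytes are only for this write-up and are not part of the benchmark:

> -- Illustrative only; not part of the benchmark.
> mapAccumLExample :: (Word8, B.ByteString)
> mapAccumLExample = B.mapAccumL op 0 (B.pack [40, 50, 60])
>   where op a b = (shiftR b 4, if b < 32 then b else a + b)

This evaluates to (3, B.pack [40, 52, 63]): 40 passes through with the initial 0 added, 50 picks up 2 (shiftR 40 4), 60 picks up 3 (shiftR 50 4), and the final accumulator (3) is what would carry over to the next chunk.
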
The results of this Iteratee approach:
./fproc 4 +RTS -t
4
<

> transform5 :: MonadIO m => Handle -> E.Iteratee B.ByteString m ()
> transform5 h = work 0
>   where work n = do
>           e <- EL.head
>           case e of
>             Nothing -> return ()
>             Just e' -> next n e'
>         next n e = do
>           (m,r) <- return $ B.mapAccumL op n e
>           liftIO $ B.hPut h r
>           work m
>         op a b = (shiftR b 4, if b < 32 then b else a+b)

As would be expected, this has nearly identical performance to the non-monadic version:
./fproc 5 +RTS -t
5
<

> transform6 :: MonadIO m => Handle -> E.Iteratee B.ByteString m ()
> transform6 h = work 0
>   where work n = EL.head >>= maybe (return ()) (next n)
>         next n e = do
>           let (m,r) = B.mapAccumL op n e
>           liftIO $ B.hPut h r
>           work m
>         op a b = (shiftR b 4, if b < 32 then b else a+b)

That's not so bad! The learning curve of Iteratees is non-trivial, but the results are pretty readable, IMHO.

And here's verification that the output is reasonable:

  $ ls -1sh *.example*
  4.9M input.example
  4.9M output.example1
  4.9M output.example2
  4.9M output.example3
  4.9M output.example4
  4.9M output.example5
  4.9M output.example6

Hopefully this has been a useful comparison of Iteratee techniques relative to more conventional monads, and the performance results are good support for the usefulness of Iteratees. As always, the greatest benefit was probably to myself in actually implementing and writing this up, but if you read this far I hope you found it readable and useful.

-Kevin Quick

-- 
-KQ