
wchogg:
On Thu, Sep 18, 2008 at 1:29 PM, Don Stewart
wrote: wchogg:
Hey Haskell, So for a fairly inane reason, I ended up taking a couple of minutes and writing a program that would spit out, to the console, the number of lines in a file. Off the top of my head, I came up with this which worked fine with files that had 100k lines:
main = do path <- liftM head $ getArgs h <- openFile path ReadMode n <- execStateT (countLines h) 0 print n
untilM :: Monad m => (a -> m Bool) -> (a -> m ()) -> a -> m () untilM cond action val = do truthy <- cond val if truthy then return () else action val >> (untilM cond action val)
countLines :: Handle -> StateT Int IO () countLines = untilM (\h -> lift $ hIsEOF h) (\h -> do lift $ hGetLine h modify (+1))
If this makes anyone cringe or cry "you're doing it wrong", I'd actually like to hear it. I never really share my projects, so I don't know how idiosyncratic my style is.
This makes me cry.
import System.Environment import qualified Data.ByteString.Lazy.Char8 as B
main = do [f] <- getArgs s <- B.readFile f print (B.count '\n' s)
Compile it.
$ ghc -O2 --make A.hs
$ time ./A /usr/share/dict/words 52848 ./A /usr/share/dict/words 0.00s user 0.00s system 93% cpu 0.007 total
Against standard tools:
$ time wc -l /usr/share/dict/words 52848 /usr/share/dict/words wc -l /usr/share/dict/words 0.01s user 0.00s system 88% cpu 0.008 total
So both you & Bryan do essentially the same thing and of course both versions are far better than mine. So the purpose of using the Lazy version of ByteString was so that the file is only incrementally loaded by readFile as count is processing?
Yep, that's right The streaming nature is implicit in the lazy bytestring. It's kind of the dual of explicit chunkwise control -- chunk processing reified into the data structure. -- Don