Re: [Haskell-cafe] A round of golf

19 Sep 2008

On Thu, Sep 18, 2008 at 1:55 PM, Don Stewart wrote:
wchogg:
On Thu, Sep 18, 2008 at 1:29 PM, Don Stewart wrote:
wchogg:
Hey Haskell,
So for a fairly inane reason, I ended up taking a couple of minutes
and writing a program that would spit out, to the console, the number
of lines in a file.  Off the top of my head, I came up with this, which
worked fine with files that had 100k lines:
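import Control.Monad (liftM)
import Control.Monad.State
import System.Environment (getArgs)
import System.IO

main = do
 path <- liftM head $ getArgs
 h <- openFile path ReadMode
 n <- execStateT (countLines h) 0
 print n

-- Run the action on val repeatedly until cond val returns True.
untilM :: Monad m => (a -> m Bool) -> (a -> m ()) -> a -> m ()
untilM cond action val = do
 truthy <- cond val
 if truthy then return () else action val >> (untilM cond action val)

-- Read one line at a time until EOF, bumping the counter in the state.
countLines :: Handle -> StateT Int IO ()
countLines = untilM (\h -> lift $ hIsEOF h) (\h -> do
                                                lift $ hGetLine h
                                                modify (+1))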
If this makes anyone cringe or cry "you're doing it wrong", I'd
actually like to hear it.  I never really share my projects, so I
don't know how idiosyncratic my style is.
This makes me cry.
import System.Environment
import qualified Data.ByteString.Lazy.Char8 as B

main = do
    [f] <- getArgs
    s   <- B.readFile f
    print (B.count '\n' s)
Compile it.
$ ghc -O2 --make A.hs
$ time ./A /usr/share/dict/words
   52848
   ./A /usr/share/dict/words 0.00s user 0.00s system 93% cpu 0.007 total
Against standard tools:
$ time wc -l /usr/share/dict/words
   52848 /usr/share/dict/words
   wc -l /usr/share/dict/words 0.01s user 0.00s system 88% cpu 0.008 total
So both you & Bryan do essentially the same thing and of course both
versions are far better than mine.  So the purpose of using the Lazy
version of ByteString was so that the file is only incrementally
loaded by readFile as count is processing?
Yep, that's right.
The streaming nature is implicit in the lazy bytestring. It's kind of
the dual of explicit chunkwise control -- chunk processing reified into
the data structure.
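
For contrast, the explicit chunkwise control mentioned above might look
something like the sketch below. The name countLinesChunked and the
64 KiB chunk size are assumptions made for illustration; neither appears
in the thread.

```haskell
import qualified Data.ByteString.Char8 as S
import System.IO (Handle, IOMode (ReadMode), withFile)

-- Explicit chunkwise control: read fixed-size strict chunks by hand and
-- count the newlines in each one. hGet returns an empty chunk at EOF.
countLinesChunked :: Handle -> IO Int
countLinesChunked h = go 0
  where
    chunkSize = 64 * 1024  -- arbitrary size, chosen only for this sketch
    go n = do
      chunk <- S.hGet h chunkSize
      if S.null chunk
        then return n
        else let n' = n + S.count '\n' chunk
             in n' `seq` go n'  -- force the running total before looping
```

It could be run as withFile path ReadMode countLinesChunked. The lazy
version hides this loop inside the data structure.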
Hi Don,
I have a bit more of a follow-up, actually.  You make use of the
built-in bytestring consumer count, which itself is built upon the
foldlChunks function, which is only exported from the
ByteString.Lazy.Internal module.  If I want to make my own efficient
bytestring consumer, is that what I need to use in order to preserve
the inherent laziness of the data structure?
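
One possible approach, offered as a sketch rather than as the library's
own method: a consumer can be written against the public API by folding
strictly over the chunk list that toChunks returns, without importing
the Internal module. The name countNewlines is hypothetical.

```haskell
import Data.List (foldl')
import qualified Data.ByteString.Char8 as S
import qualified Data.ByteString.Lazy.Char8 as L

-- Count newlines one strict chunk at a time. toChunks yields the chunks
-- lazily and foldl' forces the running total at every step, so chunks
-- that have already been counted can be garbage collected.
countNewlines :: L.ByteString -> Int
countNewlines = foldl' step 0 . L.toChunks
  where
    step acc chunk = acc + S.count '\n' chunk
```

Applied to the result of L.readFile, this should read the file
incrementally, in the same way the built-in count does.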

Also, I feel a little at a loss for how to make a good bytestring
producer for efficiently _writing_ large swaths of data via writeFile.
Would it be possible to whip up a small example?
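
A minimal sketch of one way such a producer could look, assuming the
data can be generated as a lazy list of strict chunks and assembled
with fromChunks. The name numberedLines is hypothetical, and one chunk
per line is used only to keep the example short; batching several lines
into each chunk would likely be more efficient.

```haskell
import qualified Data.ByteString.Char8 as S
import qualified Data.ByteString.Lazy.Char8 as L

-- Produce the lines "1\n", "2\n", ..., "n\n" as a lazy ByteString.
-- The list comprehension is lazy, so each chunk is only built when the
-- consumer (for example L.writeFile) asks for it.
numberedLines :: Int -> L.ByteString
numberedLines n = L.fromChunks [S.pack (show i ++ "\n") | i <- [1 .. n]]
```

Something like L.writeFile "out.txt" (numberedLines 1000000) would then
write the chunks out as they are produced, rather than building the
whole output in memory first.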

Oh, and lastly, I apologize to both you & Bryan for making you cry.  I
hope you can forgive my cruelty.

Thanks,
Creighton

Creighton Hogg