Re: [Haskell-cafe] Lazy Lists and IO

11 Jul 2007

      On Wed, Jul 11, 2007 at 12:58:34AM -0400, Ronald Guida wrote:
...
Hi Everyone,
A few weeks ago, I started learning Haskell (and functional
programming) on my own from the wealth if information on the internet.
I recently read the paper "Why Functional Programming Matters" [1] and
it led me to wonder how to input a lazy list.
Suppose I have a function "f" that reads a lazy list, such that "f"
only consumes as much of the list as it needs to.  Laziness allows me
to apply "f" to an infinite list without creating an infinite loop.
Now I want to connect the console to "f", such that the list of inputs
to "f" comes from the console, one item at a time.
To create a specific example, lets suppose I want to read and
accumulate the contents of a list, and I want to stop when the sum
reaches or exceeds a specified cutoff.
I can write this as a function that reads elements from an infinite list:
...
accumUntilCutoff :: Integer -> [Integer] -> (Integer, [Integer])
accumUntilCutoff cutoff xs = accumHelper cutoff 0 id xs
    where
      accumHelper :: {- cutoff          -} Integer ->
                     {- sum so far      -} Integer ->
                     {- elements so far -} ([Integer] -> [Integer]) ->
                     {- remaining list  -} [Integer] ->
                     {- outputs         -} (Integer, [Integer])
      accumHelper cutoff sum elems [] = (sum, elems [])
      accumHelper cutoff sum elems (x:xs) =
          if sum < cutoff
            then accumHelper cutoff (sum + x) (elems . (x:)) xs
            else (sum, elems [])
Unrelatedly, a more exerienced Haskeller would probably write this using
composition interally:

accumUntilCutoff :: (Ord a, Num a) => a -> [a] -> (a, [a])
accumUntilCutoff cutoff xs =
    findAcceptLast ((>= cutoff) . fst) (zip (scanl (+) 0 xs) (inits xs))

findAcceptLast :: (a -> Bool) -> [a] -> a
findAcceptLast pred lst = fromMaybe (last lst) (find pred lst)
...
Example:
  GHCi> accumUntilCutoff 20 [1..]
        ==> (21,[1,2,3,4,5,6])
  GHCi> accumUntilCutoff 20 ([1..6] ++ repeat undefined)
        ==> (21,[1,2,3,4,5,6])
    [This demonstrates that accumUntilCutoff is lazy)
Alternatively, I can write a function that (1) prompts the user to
enter numbers until the running sum reaches the cutoff, and then (2)
returns the results in the IO monad.
...
readUntilCutoff :: Integer -> IO (Integer, [Integer])
readUntilCutoff cutoff = readHelper cutoff 0 id
    where
      readHelper :: {- cutoff          -} Integer ->
                    {- sum so far      -} Integer ->
                    {- elements so far -} ([Integer] -> [Integer]) ->
                    {- outputs         -} IO (Integer, [Integer])
      readHelper cutoff sum elems = do
        if sum < cutoff
          then do
            putStr "Enter an integer: "
            ln <- getLine
            let x = read ln
            readHelper cutoff (sum + x) (elems . (x:))
          else return $ (sum, elems [])
Example:
  GHCi> readUntilCutoff 20
  Enter an integer: 1
  Enter an integer: 2
  Enter an integer: 3
  Enter an integer: 4
  Enter an integer: 5
  Enter an integer: 6
        ==> (21,[1,2,3,4,5,6])
Here's my puzzle:
I am dis-satisfied with the fact that I have to embed IO code in the
middle of accumulation code.  Is there some way to separate
"readUntilCutoff" into two parts (e.g. functions), such that one part
would look extremely similar to "accumUntilCutoff", while the other
part would handle the user interaction associated with getting the
next number?
Not very nicely.

Option 1. Ignore purity

This is what Haskell typically does.  There is a magic function
getContents that returns stdin as a list of characters, lazily, which
you can process using ordinary list functions.  There is an even more
magic function (tecnically not portable, but in the defacto standard
libraries) unsafeInterleaveIO, which you can use to make your own magic
functions.

integersFromStdin :: IO [Integer]
integersFromStdin = unsafeInterleaveIO $ do
    putStr "Enter an integer: "
    hFlush stdout
    -- NB: GHCi turns off output buffering in an attempt to be
    -- friendlier.  you need the explicit flush in batch-compiled code.
    num <- readLn -- handy standard function!
    rest <- integersFromStdin
    -- this is magic - unsafeInterleaveIO means that it won't be
    -- executed until the value is needed
    return (num : rest)

Unfortunately, ignoring purity is fraught with peril.  One notable
example recently is in HAppS, a Haskell web framework.  Alex Jacobson (a
haskeller of significant note, not some "clueless newbie") accidentally
wrote { length xs > 0 } instead of { not (null xs) } in some parsing
code.  Which would just be inefficient, normally, but it demanded the
whole thing, which as it happened was a lazy stream coming of a socket.
Bad data dependencies, bang, deadlock.

Option 2. Ignore lists

It's possible to describe lists threaded with something else.

data ListT m a = m (ListT' m a)
data ListT' m a = NilT | ConsT a (ListT m a)

You get your safety back ... and lose the standard list functions, list
syntax, list comprehensions, list instances, strings-as-lists, et
cetera.

Stefan

Re: [Haskell-cafe] Lazy Lists and IO

Stefan O'Rear