Streaming a Conduit into a lazy list

4 Nov 2015

      Conduits seem like a popular and useful way to compose operations
which consume and produce streams at various speeds. I hadn't used
them before, so apologies for any obvious mistakes below.

Conduits have lots of power in terms of how you can interleave monadic
effects with the streaming; but I found that I had a stream with no
effects at all and I wanted to convert it back into haskells more
simplistic representation, lazy lists.

This is probably a well-known trick, but I couldn't find how to do this
from googling, so here is the solution I found in case it helps
someone else:
...
{-# LANGUAGE DeriveDataTypeable, FlexibleContexts  #-}
module Main where
import Data.Conduit
import Control.Monad.Catch
import Control.Monad.Trans.Resource
import Control.Monad.Writer
import Control.Monad.Identity
import qualified Data.Conduit.List as CL
import Data.Typeable
source123Error :: Monad m => Source m Int
source123Error = do
  yield 1
  yield 2
  yield 3
  error "error"
A source which produces some data and then hangs, to test lazy production. 

One solution which doesn't work is CL.consume (as its own docs say) - and instead this happens:

*Main> :t runIdentity (source123Error $$ CL.consume)
runIdentity (source123Error $$ CL.consume) :: [Int]
*Main> runIdentity (source123Error $$ CL.consume)
*** Exception: error

But what we can do instead is push the data out via the Writer monad:
...
tellEverything :: MonadWriter [a] m => Sink a m ()
tellEverything = awaitForever (\x -> tell [x])
*Main> :t source123Error $$ tellEverything
source123Error $$ tellEverything :: MonadWriter [Int] m => m ()
*Main> snd $ runWriter (source123Error $$ tellEverything)
[1,2,3*** Exception: error

Success! A lazily produced list.

If your output is of any size - and depending on the compositions
pattern in your code - it may be much faster to use this version:
...
tellEveryEndo :: MonadWriter (Endo [a]) m => Sink a m ()
tellEveryEndo = awaitForever (\x -> tell (Endo (x:)))
*Main> ($[]) . appEndo . snd . runWriter $ (source123Error $$ tellOne)
[1,2,3*** Exception: error

I found a further case where the Conduit I was trying to run was pure
but had a 'MonadThrow' constraint. You can use the same approach here,
using the ExceptionT transformer to satisfy the MonadThrow constraint.
...
data MyError = MyError deriving (Show,Typeable)
instance Exception MyError
source123Throw :: MonadThrow m => Source m Int
source123Throw = do
 yield 1
 yield 2
 yield 3
 throwM MyError
*Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter . runExceptionT $ (source123Error $$ tellEverything)
[1,2,3*** Exception: error
*Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter . runExceptionT $ (source123Throw $$ tellEverything)
[1,2,3]

An aside: I tried to measure the speed difference between the
list-mappend and Endo-mappend approaches with the following
code. count2N uses a binary-tree shaped recursion so it should be
fairly bad for list-mappend with lots of long lists on the left.
...
count2N :: Monad m => Int -> Source m Int
count2N 0 = yield 0
count2N n = count2N (n-1) >> count2N (n-1)
...
speedTest1 = print . length . snd . runWriter $ (count2N 24 $$ tellEverything)
speedTest2 = print . length . ($[]) . appEndo . snd . runWriter $ (count2N 24 $$ tellEveryEndo)
With -O or -O2 I measure no speed difference between these, they both
take a bit over 2 seconds. With no optimisation flag, speedTest2 is
slower, 17 seconds vs 11 seconds. 

I'm quite surprised the Endo version isn't faster, it seems like
something is rewriting those list appends?

Cheers,

Jules

Jules Bean

tags

participants (1)