
Conduits seem like a popular and useful way to compose operations which consume and produce streams at various speeds. I hadn't used them before, so apologies for any obvious mistakes below. Conduits have lots of power in terms of how you can interleave monadic effects with the streaming; but I found that I had a stream with no effects at all and I wanted to convert it back into haskells more simplistic representation, lazy lists. This is probably a well-known trick, but I couldn't find how to do this from googling, so here is the solution I found in case it helps someone else:
{-# LANGUAGE DeriveDataTypeable, FlexibleContexts #-} module Main where
import Data.Conduit import Control.Monad.Catch import Control.Monad.Trans.Resource import Control.Monad.Writer import Control.Monad.Identity import qualified Data.Conduit.List as CL import Data.Typeable
source123Error :: Monad m => Source m Int source123Error = do yield 1 yield 2 yield 3 error "error"
A source which produces some data and then hangs, to test lazy production. One solution which doesn't work is CL.consume (as its own docs say) - and instead this happens: *Main> :t runIdentity (source123Error $$ CL.consume) runIdentity (source123Error $$ CL.consume) :: [Int] *Main> runIdentity (source123Error $$ CL.consume) *** Exception: error But what we can do instead is push the data out via the Writer monad:
tellEverything :: MonadWriter [a] m => Sink a m () tellEverything = awaitForever (\x -> tell [x])
*Main> :t source123Error $$ tellEverything source123Error $$ tellEverything :: MonadWriter [Int] m => m () *Main> snd $ runWriter (source123Error $$ tellEverything) [1,2,3*** Exception: error Success! A lazily produced list. If your output is of any size - and depending on the compositions pattern in your code - it may be much faster to use this version:
tellEveryEndo :: MonadWriter (Endo [a]) m => Sink a m () tellEveryEndo = awaitForever (\x -> tell (Endo (x:)))
*Main> ($[]) . appEndo . snd . runWriter $ (source123Error $$ tellOne) [1,2,3*** Exception: error I found a further case where the Conduit I was trying to run was pure but had a 'MonadThrow' constraint. You can use the same approach here, using the ExceptionT transformer to satisfy the MonadThrow constraint.
data MyError = MyError deriving (Show,Typeable) instance Exception MyError
source123Throw :: MonadThrow m => Source m Int source123Throw = do yield 1 yield 2 yield 3 throwM MyError
*Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter . runExceptionT $ (source123Error $$ tellEverything) [1,2,3*** Exception: error *Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter . runExceptionT $ (source123Throw $$ tellEverything) [1,2,3] An aside: I tried to measure the speed difference between the list-mappend and Endo-mappend approaches with the following code. count2N uses a binary-tree shaped recursion so it should be fairly bad for list-mappend with lots of long lists on the left.
count2N :: Monad m => Int -> Source m Int count2N 0 = yield 0 count2N n = count2N (n-1) >> count2N (n-1)
speedTest1 = print . length . snd . runWriter $ (count2N 24 $$ tellEverything) speedTest2 = print . length . ($[]) . appEndo . snd . runWriter $ (count2N 24 $$ tellEveryEndo)
With -O or -O2 I measure no speed difference between these, they both take a bit over 2 seconds. With no optimisation flag, speedTest2 is slower, 17 seconds vs 11 seconds. I'm quite surprised the Endo version isn't faster, it seems like something is rewriting those list appends? Cheers, Jules

conduit isn't designed to be used in this way, though in theory such a lazy function would be possible. To get the same effect, you can (ab)use the Data.Conduit.Lazy module, which provides a lazy I/O escape hatch. In this case:
lazyConsume source123Error >>= print
On Wed, Nov 4, 2015 at 12:59 AM, Jules Bean
Conduits seem like a popular and useful way to compose operations which consume and produce streams at various speeds. I hadn't used them before, so apologies for any obvious mistakes below.
Conduits have lots of power in terms of how you can interleave monadic effects with the streaming; but I found that I had a stream with no effects at all and I wanted to convert it back into haskells more simplistic representation, lazy lists.
This is probably a well-known trick, but I couldn't find how to do this from googling, so here is the solution I found in case it helps someone else:
{-# LANGUAGE DeriveDataTypeable, FlexibleContexts #-} module Main where
import Data.Conduit import Control.Monad.Catch import Control.Monad.Trans.Resource import Control.Monad.Writer import Control.Monad.Identity import qualified Data.Conduit.List as CL import Data.Typeable
source123Error :: Monad m => Source m Int source123Error = do yield 1 yield 2 yield 3 error "error"
A source which produces some data and then hangs, to test lazy production.
One solution which doesn't work is CL.consume (as its own docs say) - and instead this happens:
*Main> :t runIdentity (source123Error $$ CL.consume) runIdentity (source123Error $$ CL.consume) :: [Int] *Main> runIdentity (source123Error $$ CL.consume) *** Exception: error
But what we can do instead is push the data out via the Writer monad:
tellEverything :: MonadWriter [a] m => Sink a m () tellEverything = awaitForever (\x -> tell [x])
*Main> :t source123Error $$ tellEverything source123Error $$ tellEverything :: MonadWriter [Int] m => m () *Main> snd $ runWriter (source123Error $$ tellEverything) [1,2,3*** Exception: error
Success! A lazily produced list.
If your output is of any size - and depending on the compositions pattern in your code - it may be much faster to use this version:
tellEveryEndo :: MonadWriter (Endo [a]) m => Sink a m () tellEveryEndo = awaitForever (\x -> tell (Endo (x:)))
*Main> ($[]) . appEndo . snd . runWriter $ (source123Error $$ tellOne) [1,2,3*** Exception: error
I found a further case where the Conduit I was trying to run was pure but had a 'MonadThrow' constraint. You can use the same approach here, using the ExceptionT transformer to satisfy the MonadThrow constraint.
data MyError = MyError deriving (Show,Typeable) instance Exception MyError
source123Throw :: MonadThrow m => Source m Int source123Throw = do yield 1 yield 2 yield 3 throwM MyError
*Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter . runExceptionT $ (source123Error $$ tellEverything) [1,2,3*** Exception: error *Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter . runExceptionT $ (source123Throw $$ tellEverything) [1,2,3]
An aside: I tried to measure the speed difference between the list-mappend and Endo-mappend approaches with the following code. count2N uses a binary-tree shaped recursion so it should be fairly bad for list-mappend with lots of long lists on the left.
count2N :: Monad m => Int -> Source m Int count2N 0 = yield 0 count2N n = count2N (n-1) >> count2N (n-1)
speedTest1 = print . length . snd . runWriter $ (count2N 24 $$ tellEverything) speedTest2 = print . length . ($[]) . appEndo . snd . runWriter $ (count2N 24 $$ tellEveryEndo)
With -O or -O2 I measure no speed difference between these, they both take a bit over 2 seconds. With no optimisation flag, speedTest2 is slower, 17 seconds vs 11 seconds.
I'm quite surprised the Endo version isn't faster, it seems like something is rewriting those list appends?
Cheers,
Jules
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

This got me curious, so I just added a `sourceToList` function to master:
https://github.com/snoyberg/conduit/commit/289f671cb7669c4aec78d8e77f01f2ace...
On Wed, Nov 4, 2015 at 4:43 AM, Michael Snoyman
conduit isn't designed to be used in this way, though in theory such a lazy function would be possible. To get the same effect, you can (ab)use the Data.Conduit.Lazy module, which provides a lazy I/O escape hatch. In this case:
lazyConsume source123Error >>= print
On Wed, Nov 4, 2015 at 12:59 AM, Jules Bean
wrote: Conduits seem like a popular and useful way to compose operations which consume and produce streams at various speeds. I hadn't used them before, so apologies for any obvious mistakes below.
Conduits have lots of power in terms of how you can interleave monadic effects with the streaming; but I found that I had a stream with no effects at all and I wanted to convert it back into haskells more simplistic representation, lazy lists.
This is probably a well-known trick, but I couldn't find how to do this from googling, so here is the solution I found in case it helps someone else:
{-# LANGUAGE DeriveDataTypeable, FlexibleContexts #-} module Main where
import Data.Conduit import Control.Monad.Catch import Control.Monad.Trans.Resource import Control.Monad.Writer import Control.Monad.Identity import qualified Data.Conduit.List as CL import Data.Typeable
source123Error :: Monad m => Source m Int source123Error = do yield 1 yield 2 yield 3 error "error"
A source which produces some data and then hangs, to test lazy production.
One solution which doesn't work is CL.consume (as its own docs say) - and instead this happens:
*Main> :t runIdentity (source123Error $$ CL.consume) runIdentity (source123Error $$ CL.consume) :: [Int] *Main> runIdentity (source123Error $$ CL.consume) *** Exception: error
But what we can do instead is push the data out via the Writer monad:
tellEverything :: MonadWriter [a] m => Sink a m () tellEverything = awaitForever (\x -> tell [x])
*Main> :t source123Error $$ tellEverything source123Error $$ tellEverything :: MonadWriter [Int] m => m () *Main> snd $ runWriter (source123Error $$ tellEverything) [1,2,3*** Exception: error
Success! A lazily produced list.
If your output is of any size - and depending on the compositions pattern in your code - it may be much faster to use this version:
tellEveryEndo :: MonadWriter (Endo [a]) m => Sink a m () tellEveryEndo = awaitForever (\x -> tell (Endo (x:)))
*Main> ($[]) . appEndo . snd . runWriter $ (source123Error $$ tellOne) [1,2,3*** Exception: error
I found a further case where the Conduit I was trying to run was pure but had a 'MonadThrow' constraint. You can use the same approach here, using the ExceptionT transformer to satisfy the MonadThrow constraint.
data MyError = MyError deriving (Show,Typeable) instance Exception MyError
source123Throw :: MonadThrow m => Source m Int source123Throw = do yield 1 yield 2 yield 3 throwM MyError
*Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter . runExceptionT $ (source123Error $$ tellEverything) [1,2,3*** Exception: error *Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter . runExceptionT $ (source123Throw $$ tellEverything) [1,2,3]
An aside: I tried to measure the speed difference between the list-mappend and Endo-mappend approaches with the following code. count2N uses a binary-tree shaped recursion so it should be fairly bad for list-mappend with lots of long lists on the left.
count2N :: Monad m => Int -> Source m Int count2N 0 = yield 0 count2N n = count2N (n-1) >> count2N (n-1)
speedTest1 = print . length . snd . runWriter $ (count2N 24 $$ tellEverything) speedTest2 = print . length . ($[]) . appEndo . snd . runWriter $ (count2N 24 $$ tellEveryEndo)
With -O or -O2 I measure no speed difference between these, they both take a bit over 2 seconds. With no optimisation flag, speedTest2 is slower, 17 seconds vs 11 seconds.
I'm quite surprised the Endo version isn't faster, it seems like something is rewriting those list appends?
Cheers,
Jules
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

Yay! That’s great.
The point of course is that you shouldn’t need to do a lazyIO trick when there is no IO going on.
My actual application of this was in Text.XML.Stream.Parse which certainly can produce output lazily without doing IO.
Jules
On 4 Nov 2015, at 13:04, Michael Snoyman
This got me curious, so I just added a `sourceToList` function to master:
https://github.com/snoyberg/conduit/commit/289f671cb7669c4aec78d8e77f01f2ace...
On Wed, Nov 4, 2015 at 4:43 AM, Michael Snoyman
wrote: conduit isn't designed to be used in this way, though in theory such a lazy function would be possible. To get the same effect, you can (ab)use the Data.Conduit.Lazy module, which provides a lazy I/O escape hatch. In this case: lazyConsume source123Error >>= print
On Wed, Nov 4, 2015 at 12:59 AM, Jules Bean
wrote: Conduits seem like a popular and useful way to compose operations which consume and produce streams at various speeds. I hadn't used them before, so apologies for any obvious mistakes below. Conduits have lots of power in terms of how you can interleave monadic effects with the streaming; but I found that I had a stream with no effects at all and I wanted to convert it back into haskells more simplistic representation, lazy lists.
This is probably a well-known trick, but I couldn't find how to do this from googling, so here is the solution I found in case it helps someone else:
{-# LANGUAGE DeriveDataTypeable, FlexibleContexts #-} module Main where
import Data.Conduit import Control.Monad.Catch import Control.Monad.Trans.Resource import Control.Monad.Writer import Control.Monad.Identity import qualified Data.Conduit.List as CL import Data.Typeable
source123Error :: Monad m => Source m Int source123Error = do yield 1 yield 2 yield 3 error "error"
A source which produces some data and then hangs, to test lazy production.
One solution which doesn't work is CL.consume (as its own docs say) - and instead this happens:
*Main> :t runIdentity (source123Error $$ CL.consume) runIdentity (source123Error $$ CL.consume) :: [Int] *Main> runIdentity (source123Error $$ CL.consume) *** Exception: error
But what we can do instead is push the data out via the Writer monad:
tellEverything :: MonadWriter [a] m => Sink a m () tellEverything = awaitForever (\x -> tell [x])
*Main> :t source123Error $$ tellEverything source123Error $$ tellEverything :: MonadWriter [Int] m => m () *Main> snd $ runWriter (source123Error $$ tellEverything) [1,2,3*** Exception: error
Success! A lazily produced list.
If your output is of any size - and depending on the compositions pattern in your code - it may be much faster to use this version:
tellEveryEndo :: MonadWriter (Endo [a]) m => Sink a m () tellEveryEndo = awaitForever (\x -> tell (Endo (x:)))
*Main> ($[]) . appEndo . snd . runWriter $ (source123Error $$ tellOne) [1,2,3*** Exception: error
I found a further case where the Conduit I was trying to run was pure but had a 'MonadThrow' constraint. You can use the same approach here, using the ExceptionT transformer to satisfy the MonadThrow constraint.
data MyError = MyError deriving (Show,Typeable) instance Exception MyError
source123Throw :: MonadThrow m => Source m Int source123Throw = do yield 1 yield 2 yield 3 throwM MyError
*Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter . runExceptionT $ (source123Error $$ tellEverything) [1,2,3*** Exception: error *Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter . runExceptionT $ (source123Throw $$ tellEverything) [1,2,3]
An aside: I tried to measure the speed difference between the list-mappend and Endo-mappend approaches with the following code. count2N uses a binary-tree shaped recursion so it should be fairly bad for list-mappend with lots of long lists on the left.
count2N :: Monad m => Int -> Source m Int count2N 0 = yield 0 count2N n = count2N (n-1) >> count2N (n-1)
speedTest1 = print . length . snd . runWriter $ (count2N 24 $$ tellEverything) speedTest2 = print . length . ($[]) . appEndo . snd . runWriter $ (count2N 24 $$ tellEveryEndo)
With -O or -O2 I measure no speed difference between these, they both take a bit over 2 seconds. With no optimisation flag, speedTest2 is slower, 17 seconds vs 11 seconds.
I'm quite surprised the Endo version isn't faster, it seems like something is rewriting those list appends?
Cheers,
Jules
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

On 4 Nov 2015, at 13:04, Michael Snoyman
This got me curious, so I just added a `sourceToList` function to master:
https://github.com/snoyberg/conduit/commit/289f671cb7669c4aec78d8e77f01f2ace...
Having spent a day thinking this over… Nothing with the type `Source m a -> m [a]` can work for my second example - the one where I use runExceptionT to discharge the MonadThrow constraint. This is because once you use runExceptionT you have pushed yourself into the situation where in case of error there is no ‘return value’. It’s not that I care about that per se - if there is an error then the return value is no use to me - but unfortunately that has knock-on implications on laziness. The Writer monad solution pushes out the return value incrementally by a ‘side-channel’ rather than using the return value and it’s that property which lets it work even in the presence of runExceptionT. Another approach which would work though is to provide a newtyped Identity monad which handles MonadThrow by _|_, which would allow you to regain laziness (?) Jules

On Wed, Nov 4, 2015 at 9:28 AM, Jules Bean
On 4 Nov 2015, at 13:04, Michael Snoyman
wrote: This got me curious, so I just added a `sourceToList` function to master:
https://github.com/snoyberg/conduit/commit/289f671cb7669c4aec78d8e77f01f2ace...
Having spent a day thinking this over…
Nothing with the type `Source m a -> m [a]` can work for my second example - the one where I use runExceptionT to discharge the MonadThrow constraint. This is because once you use runExceptionT you have pushed yourself into the situation where in case of error there is no ‘return value’.
It’s not that I care about that per se - if there is an error then the return value is no use to me - but unfortunately that has knock-on implications on laziness.
The Writer monad solution pushes out the return value incrementally by a ‘side-channel’ rather than using the return value and it’s that property which lets it work even in the presence of runExceptionT.
Another approach which would work though is to provide a newtyped Identity monad which handles MonadThrow by _|_, which would allow you to regain laziness (?)
Jules
I don't think that the Writer example above is demonstrating what you're saying, since you're using imprecise exceptions (`error`) instead of MonadThrow. You could do the same thing with sourceToList and get that result. You're also correct that some kind of a Identity monad with a MonadThrow instance based on `throw` would allow this to work. Michael

(Apologies for coming back after 2 weeks)
On 4 Nov 2015, at 19:41, Michael Snoyman
On Wed, Nov 4, 2015 at 9:28 AM, Jules Bean
wrote: On 4 Nov 2015, at 13:04, Michael Snoyman
wrote: This got me curious, so I just added a `sourceToList` function to master:
https://github.com/snoyberg/conduit/commit/289f671cb7669c4aec78d8e77f01f2ace...
Having spent a day thinking this over…
Nothing with the type `Source m a -> m [a]` can work for my second example - the one where I use runExceptionT to discharge the MonadThrow constraint. This is because once you use runExceptionT you have pushed yourself into the situation where in case of error there is no ‘return value’.
I don't think that the Writer example above is demonstrating what you're saying, since you're using imprecise exceptions (`error`) instead of MonadThrow. You could do the same thing with sourceToList and get that result.
There was another example where I don’t use imprecise exceptions, but I use ExceptionT:
source123Throw :: MonadThrow m => Source m Int source123Throw = do yield 1 yield 2 yield 3 throwM MyError
*Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter . runExceptionT $ (source123Throw $$ tellEverything) [1,2,3] The point here is the order of nesting runWriter(T) and runException(T). If you do it this way around the type is `([a],Either Exception ())` and you can just inspect the first element of the tuple to check the list without checking for an error - which allows it to be streaming. `Source m a -> m [a]` can only take this second form. Jules

(Apologies for coming back after 2 weeks)
On 4 Nov 2015, at 19:41, Michael Snoyman
On Wed, Nov 4, 2015 at 9:28 AM, Jules Bean
wrote: On 4 Nov 2015, at 13:04, Michael Snoyman
wrote: This got me curious, so I just added a `sourceToList` function to master:
https://github.com/snoyberg/conduit/commit/289f671cb7669c4aec78d8e77f01f2ace...
Having spent a day thinking this over…
Nothing with the type `Source m a -> m [a]` can work for my second example - the one where I use runExceptionT to discharge the MonadThrow constraint. This is because once you use runExceptionT you have pushed yourself into the situation where in case of error there is no ‘return value’.
I don't think that the Writer example above is demonstrating what you're saying, since you're using imprecise exceptions (`error`) instead of MonadThrow. You could do the same thing with sourceToList and get that result.
There was another example where I don’t use imprecise exceptions, but I use ExceptionT:
source123Throw :: MonadThrow m => Source m Int source123Throw = do yield 1 yield 2 yield 3 throwM MyError
*Main Control.Monad.Except Control.Monad.Trans.Resource> snd . runWriter . runExceptionT $ (source123Throw $$ tellEverything) [1,2,3] The point here is the order of nesting runWriter(T) and runException(T). If you do it this way around the type is `([a],Either Exception ())` and you can just inspect the first element of the tuple to check the list without checking for an error - which allows it to be streaming. `Source m a -> m [a]` can only take this second form. Jules
participants (2)
-
Jules Bean
-
Michael Snoyman