
I've just recently started learning how to use the enumerator library. I'm designing a utility that parses a file where each line is a JSON object. I've designed it to look like:

source enumerator (enumHandle pretty much)
-> chunk by lines (enumeratee)
-> parse a line into an Object (enumeratee)
-> filter objects based on a criterion (enumeratee)
-> limit some keys from each object (enumeratee)
-> encode the object into a lazy bytestring (enumeratee)
-> output the file to stdout (iteratee)

I'm having difficulties with the types, particularly composing the enumeratees in the middle. Someone in #haskell said that's a good case for >=> from Control.Monad, but that seems to not like it if an enumeratee changes the type between the input and output.

Here are the relevant types:

    type Object = Map Text Value

    pipeline :: MonadIO m => (Enumerator ByteString IO ()) -> [Text] -> [Filter] -> Iteratee a m ()
    pipeline s rfs fs = s $$ splitLines >=> parseLine >=> (filterObjects fs) >=> (restrictFields rfs) >=> encoder $$ output

    splitLines :: Monad m => Enumeratee ByteString ByteString m b
    parseLine :: Monad m => Enumeratee ByteString Object m b
    filterObjects :: Monad m => [Filter] -> Enumeratee Object Object m b
    restrictObjects :: Monad m => [Text] -> Enumeratee Object Object m b
    encoder :: Monad m => Enumeratee Object LBS.ByteString m b
    output :: MonadIO m => Iteratee LBS.ByteString m ()

I'm using >=> because I'd prefer to compose them left-right for readability.

Here's the error I get:

    Couldn't match expected type `ByteString' with actual type `M.Map Text Value'
    Expected type: Data.Enumerator.Step ByteString m b1 -> Iteratee ByteString m b2
    Actual type: Enumeratee ByteString Object m0 b0
    In the first argument of `(>=>)', namely `parseLine'
    In the second argument of `(>=>)', namely
      `parseLine >=> (filterObjects fs) >=> (restrictFields rfs) >=> encoder'

Any ideas?

--
Michael Xavier
http://www.michaelxavier.net
LinkedIn http://www.linkedin.com/pub/michael-xavier/13/b02/a26
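A note on why >=> rejects this chain. In the enumerator library, Enumeratee ao ai m b is a synonym for Step ai m b -> Iteratee ao m (Step ai m b). Kleisli composition requires the result type of each stage to be the argument type of the next, so splitLines >=> parseLine would need Step ByteString m b (what splitLines returns) to equal Step Object m b (what parseLine accepts). That is the ByteString versus Map Text Value mismatch in the error. The sketch below shows the same constraint with plain Maybe functions; parseInt, halveEven and describe are hypothetical names that have nothing to do with the library.

```haskell
import Control.Monad ((>=>))
import Text.Read (readMaybe)

-- (>=>) :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)
-- The 'b' produced by the left stage must be the 'b' consumed by the right stage.

-- Hypothetical stage: String -> Maybe Int
parseInt :: String -> Maybe Int
parseInt = readMaybe

-- Hypothetical stage: Int -> Maybe Int (fails on odd numbers)
halveEven :: Int -> Maybe Int
halveEven n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

-- Hypothetical stage: Int -> Maybe String
describe :: Int -> Maybe String
describe n = Just ("half is " ++ show n)

-- Types line up: String -> Int, Int -> Int, Int -> String.
pipeline :: String -> Maybe String
pipeline = parseInt >=> halveEven >=> describe

-- Rejected by the type checker if uncommented: describe yields a String,
-- but halveEven expects an Int.
-- bad = parseInt >=> describe >=> halveEven

main :: IO ()
main = do
  print (pipeline "42")  -- Just "half is 21"
  print (pipeline "7")   -- Nothing
  print (pipeline "x")   -- Nothing
```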

I wanted to solve this but the only way I could get it to compile was:
run_ ((((((enumHandle undefined undefined $= splitLines) $= parseLine)
$= filterObjects undefined) $= restrictObjects undefined) $= encoder)
$$ output)
This looks terrible, and I'm sure there is a better way, but I can't
find it. If you remove any of the parentheses it complains. I would
love to hear the "right" way to do this.
On Wed, Aug 31, 2011 at 3:40 PM, Michael Xavier wrote:
> I'm having difficulties with the types, particularly composing the enumeratees in the middle.
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners

What about

    pipeline :: MonadIO m => Enumerator ByteString m () -> [Text] -> [Filter] -> Iteratee ByteString m ()
    pipeline s rfs fs = s $$ splitLines =$ parseLine =$ filterObjects fs =$ restrictObjects rfs =$ encoder =$ output

Cheers,

--
Felipe.
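One way to see why the =$ chain type-checks where >=> did not: each enumeratee wraps the consumer to its right and exposes a consumer of a different input type, and the chain nests to the right. The sketch below is a toy, list-based model of that idea using only base. It is not the enumerator library's real types or implementation; Consumer, Adapter, mapA, filterA, collect and readInt are hypothetical names, and the local =$ is only a stand-in that mimics the library operator's right-associativity.

```haskell
-- Toy model: a consumer turns a whole list of inputs into a result.
newtype Consumer a r = Consumer { feed :: [a] -> r }

-- Toy "enumeratee": converts a consumer of inner values (ai)
-- into a consumer of outer values (ao).
type Adapter ao ai r = Consumer ai r -> Consumer ao r

-- Transform every outer value into an inner value.
mapA :: (ao -> ai) -> Adapter ao ai r
mapA f (Consumer k) = Consumer (k . map f)

-- Pass through only the values that satisfy the predicate.
filterA :: (a -> Bool) -> Adapter a a r
filterA p (Consumer k) = Consumer (k . filter p)

-- Stand-in for the library's (=$): apply an adapter to the consumer on its right.
infixr 0 =$
(=$) :: Adapter ao ai r -> Consumer ai r -> Consumer ao r
adapter =$ consumer = adapter consumer

-- Final consumer: collect everything it is fed.
collect :: Consumer a [a]
collect = Consumer id

readInt :: String -> Int
readInt = read

-- Parses as: mapA readInt =$ (filterA even =$ (mapA (show . (* 10)) =$ collect))
-- Input types change stage by stage: String, then Int, then String again.
pipeline :: Consumer String [String]
pipeline = mapA readInt =$ filterA even =$ mapA (show . (* 10)) =$ collect

main :: IO ()
main = print (feed pipeline ["1", "2", "3", "4"])  -- ["20","40"]
```

The stages still read left to right in data-flow order, which is the property the original question wanted to keep.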

It compiles! I'm glad I got to preserve the ordering of the enumeratees. Thanks!

On Wed, Aug 31, 2011 at 6:50 PM, Felipe Almeida Lessa <felipe.lessa@gmail.com> wrote:
> What about
> pipeline s rfs fs = s $$ splitLines =$ parseLine =$ filterObjects fs =$ restrictObjects rfs =$ encoder =$ output

Michael Xavier wrote:
> -> encode the object into a lazy bytestring (enumeratee)
> -> output the file to stdout (iteratee)
If there is no specific reason to use lazy ByteStrings, I would suggest that you use a concept complementary to iteratees, the blaze-builder library for efficient stream output.

Greets,
Ertugrul

--
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://ertes.de/
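For readers who have not met the builder idea: a Builder describes output as cheap-to-append pieces that are written into large buffers only when the result is run. The sketch below uses Data.ByteString.Builder from the bytestring package as a stand-in, on the assumption that it illustrates the same concept; it does not use blaze-builder's own API. renderRecord, renderAll and renderLazy are hypothetical helpers, and the output is JSON-like text with no string escaping, not a real JSON encoder.

```haskell
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy as LBS
import System.IO (stdout)

-- Hypothetical helper: render one key/value pair as a line of JSON-like text.
-- No escaping is performed, so this is only safe for simple ASCII keys.
renderRecord :: (String, Int) -> B.Builder
renderRecord (key, value) = mconcat
  [ B.string7 "{\""
  , B.string7 key
  , B.string7 "\":"
  , B.intDec value
  , B.string7 "}\n"
  ]

-- Appending builders is cheap; no bytes are copied until the builder is run.
renderAll :: [(String, Int)] -> B.Builder
renderAll = foldMap renderRecord

-- A lazy ByteString is still available when an API requires one.
renderLazy :: [(String, Int)] -> LBS.ByteString
renderLazy = B.toLazyByteString . renderAll

main :: IO ()
main =
  -- Writes the builder to the handle without first building
  -- an intermediate lazy ByteString.
  B.hPutBuilder stdout (renderAll [("a", 1), ("b", 2)])
```

In a pipeline like the one in this thread, the final iteratee could accept builders and write them out, rather than each stage producing a finished lazy ByteString.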

I've been meaning to take a look at blaze-builder anyways, but there is a
specific reason to use lazy ByteStrings. I'm using the Aeson library for
parsing/encoding JSON data. The encode function in that library chose lazy
ByteStrings as the output format. While performance on this project is a
factor, I'm reimplementing a project done in Ruby, so it won't be too hard
to best it in that dimension in Haskell, regardless of the output format ;)
On Thu, Sep 1, 2011 at 7:27 AM, Ertugrul Soeylemez wrote:
> If there is no specific reason to use lazy ByteStrings, I would suggest that you use a concept complementary to iteratees, the blaze-builder library for efficient stream output.

Actually, you can use fromValue[1] to get a Builder.
[1] http://hackage.haskell.org/packages/archive/aeson/0.3.2.11/doc/html/Data-Aes...
On Thu, Sep 1, 2011 at 6:33 PM, Michael Xavier wrote:
> I'm using the Aeson library for parsing/encoding JSON data. The encode function in that library chose lazy ByteStrings as the output format.

I don't know how I missed that. I'll be learning blaze-builder then. Thanks!
On Thu, Sep 1, 2011 at 8:40 AM, Michael Snoyman wrote:
> Actually, you can use fromValue[1] to get a Builder.
participants (5)
- David McBride
- Ertugrul Soeylemez
- Felipe Almeida Lessa
- Michael Snoyman
- Michael Xavier