
Hello,
I would like to add Data.Text support to parsec3. The following is a sample implementation.
https://github.com/kazu-yamamoto/parsec3/commit/58c268c37e54a470c841c3ef5ecb...
If there is no problem, please merge it to parsec3.
Regards,
--Kazu
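For readers unfamiliar with how parsec consumes input, here is a minimal sketch of what Data.Text support can look like. It is not taken from the linked commit (its URL is truncated above) and is only an illustration under stated assumptions: parsec reads its input through the Stream class, whose uncons method returns the next token and the remaining input. The TextInput wrapper and the word parser are hypothetical names; the wrapper exists only so the sketch does not clash with the Stream instance for Text that later parsec releases provide themselves.

```haskell
{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}
import qualified Data.Text as T
import Text.Parsec

-- Hypothetical wrapper around strict Text, used only in this sketch.
newtype TextInput = TextInput T.Text

-- Feed parsec one Char at a time; T.uncons returns Nothing at end of input.
instance Monad m => Stream TextInput m Char where
  uncons (TextInput t) =
    return (fmap (\(c, rest) -> (c, TextInput rest)) (T.uncons t))

-- Hypothetical parser type and example parser over the wrapped input.
type TextParser = Parsec TextInput ()

word :: TextParser String
word = many1 letter

main :: IO ()
main = print (parse word "<input>" (TextInput (T.pack "hello world")))
-- prints: Right "hello"
```

Because the character-level combinators (letter, char, string, ...) only require a Stream instance yielding Char, existing parsers written against that constraint work on the new input type without changes.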

On Tue, Feb 8, 2011 at 7:48 PM, Kazu Yamamoto
Hello,
I would like to add Data.Text support to parsec3. The following is a sample implementation.
https://github.com/kazu-yamamoto/parsec3/commit/58c268c37e54a470c841c3ef5ecb...
If there is no problem, please merge it to parsec3.
Hello,
The package parsec3 on Hackage (http://hackage.haskell.org/package/parsec3) is maintained by Christian Maeder, so you should send patches to him.
I maintain the library that it was forked from (parsec - http://hackage.haskell.org/package/parsec) and would also welcome suggestions :-)
I'm not sure that the libraries@haskell.org is the best place to send patches. If you're interested in a wider discussion, haskell-cafe@haskell.org is a great place to get feedback.
Have you done any benchmarks for parsec Data.Text values? I had thought that most folks who have measured it have ended up not too happy.
Take care,
Antoine
Regards,
--Kazu
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Hello,
I'm not sure that the libraries@haskell.org is the best place to send patches. If you're interested in a wider discussion, haskell-cafe@haskell.org is a great place to get feedback.
When I asked him about an Applicative issue before, he suggested posting it to this mailing list. So I thought this was the best place. If not, I will send patches directly to him from now on.
Have you done any benchmarks for parsec Data.Text values? I had thought that most folks who have measured it have ended up not too happy.
No. What kind of benchmark is appealing? Comparing it with ByteString?
Is there already any benchmark code to use as a good starting point?
--Kazu

On 9 February 2011 15:22, Kazu Yamamoto
Have you done any benchmarks for parsec Data.Text values? I had thought that most folks who have measured it have ended up not too happy.
No. What kind of benchmark is appealing? Comparing it with ByteString?
I think comparing it to String would be a better choice; at the very least the Text version should be no worse than String.
--
Ivan Lazar Miljenovic
Ivan.Miljenovic@gmail.com
IvanMiljenovic.wordpress.com

No. What kind of benchmark is appealing? Comparing it with ByteString?
I think comparing it to String would be a better choice; at the very least the Text version should be no worse than String.
OK.
I wrote a parser to extract words matching a pattern and parsed an ASCII file of about 6 MB (the concatenated *.texi files of Emacs). I measured performance with the time command. The compiler is GHC 6.12.3 with -O.

String:           7.34s user 0.69s system 99% cpu  8.040 total
Lazy ByteString:  9.95s user 0.60s system 99% cpu 10.549 total
Lazy Text:       10.74s user 1.09s system 91% cpu 12.967 total

Lazy Text is slower than String. But Lazy ByteString is also slow.
Yes, this is not a good result. But there is no reason to merge the patch of Text.
--Kazu

Here is my code:
----
import Text.Parsec
import Text.Parsec.String (Parser)

-- appear returns a list, so the result type is [String]
target :: Parser [String]
target = appear samp

-- to extract @samp{} of GNU's texi
samp :: Parser String
samp = string "@samp{" *> many1 (noneOf "}") <* char '}'

-- collect every match of p, skipping one character wherever p does not match
appear :: Parser a -> Parser [a]
appear p = (:) <$> try p <*> appear p
       <|> anyChar *> appear p
       <|> [] <$ eof
----
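To make the comparison concrete, here is a small sketch of how the parser above can be written once and run over all three input types. It assumes a parsec release that already provides Stream instances for lazy ByteString and lazy Text (they come with Text.Parsec in later releases); the generalized type signatures and the sample string are illustrative assumptions, not the code used for the timings above.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import qualified Data.ByteString.Lazy.Char8 as BL
import qualified Data.Text.Lazy as TL
import Text.Parsec

-- Same parsers as in the message, generalized over any stream of Char.
samp :: Stream s m Char => ParsecT s u m String
samp = string "@samp{" *> many1 (noneOf "}") <* char '}'

appear :: Stream s m Char => ParsecT s u m a -> ParsecT s u m [a]
appear p = (:) <$> try p <*> appear p
       <|> anyChar *> appear p
       <|> [] <$ eof

-- Hypothetical sample input standing in for the texi file.
sample :: String
sample = "Use @samp{C-x C-f} or @samp{M-x}."

main :: IO ()
main = do
  print (parse (appear samp) "string" sample)
  print (parse (appear samp) "bytestring" (BL.pack sample))
  print (parse (appear samp) "text" (TL.pack sample))
  -- each line prints: Right ["C-x C-f","M-x"]
```

Since only the input type changes between the three calls, any timing difference comes from how each representation hands out characters, which is what the numbers above compare.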

On 9 February 2011 16:58, Kazu Yamamoto
No. What kind of benchmark is appealing? Comparing it with ByteString?
I think comparing it to String would be a better choice; at the very least the Text version should be no worse than String.
OK.
I wrote a parser to extract words matching a pattern and parsed an ASCII file of about 6 MB (the concatenated *.texi files of Emacs). I measured performance with the time command. The compiler is GHC 6.12.3 with -O.
How about with -O2?
--
Ivan Lazar Miljenovic
Ivan.Miljenovic@gmail.com
IvanMiljenovic.wordpress.com

On 09.02.2011 13:21, Ivan Lazar Miljenovic wrote:
On 9 February 2011 16:58, Kazu Yamamoto wrote:
No. What kind of benchmark is appealing? Comparing it with ByteString?
I think comparing it to String would be a better choice; at the very least the Text version should be no worse than String.
OK.
I wrote a parser to extract words matching a pattern and parsed an ASCII file of about 6 MB (the concatenated *.texi files of Emacs). I measured performance with the time command. The compiler is GHC 6.12.3 with -O.
How about with -O2 ?
Good question. I should mention that I've removed -O2 from parsec3.cabal.
I think,
cabal install --reinstall -O2 parsec3-1.0.0.3
should repair this.
Christian

Hello,
I think,
cabal install --reinstall -O2 parsec3-1.0.0.3
should repair this.
I made a tiny benchmark using progression and criterion.
https://github.com/kazu-yamamoto/parser-benchmark
When I re-install parsec3 with the -O2 option, the performance of the three becomes almost the same.
http://www.mew.org/~kazu/plot.png
Note that the input file is ASCII. If we use a UTF-8 file, Lazy Text becomes much slower. But this is inevitable, I guess.
If people want to compare String and Lazy Text on UTF-8, I will add the case. If people want to compare attoparsec as well, I will add it, too.
--Kazu

On Tue, Feb 8, 2011 at 10:22 PM, Kazu Yamamoto
Hello,
I'm not sure that the libraries@haskell.org is the best place to send patches. If you're interested in a wider discussion, haskell-cafe@haskell.org is a great place to get feedback.
When I asked him about an Applicative issue before, he suggested posting it to this mailing list. So I thought this was the best place. If not, I will send patches directly to him from now on.
Sorry for being confusing :-) I guess I don't really know what the dividing line is between the haskell-cafe list and the libraries list.
Is there already any benchmark code to use as a good starting point?
Here's a recent thread: http://www.haskell.org/pipermail/haskell-cafe/2010-December/087580.html I thought I had more links, but I'll have to do more digging.
--Kazu
participants (4)
- Antoine Latter
- Christian Maeder
- Ivan Lazar Miljenovic
- Kazu Yamamoto