Re: [Haskell-cafe] parsec2 vs. parsec3... again

On 12/23/2010 06:01 AM, Evan Laforge wrote:
> This is not very encouraging! Especially strange is how Text generates *more* allocation... I'd expect less since it doesn't unpack all the Texts.

Errgh. To check a character against a predicate, the library HAS to unpack it. There is no way around that.

> There's an obvious problem where I get the digits as a String and then parse that with list functions, but I can't see any way to get parsec to return a chunk of Text. This is roughly how parsec itself parses numbers, in Text.Parsec.Token.
> Any ideas or experience?

If you want performance so desperately, you can try hand-coded parsing. What I mean is: if the library has to unpack characters to check them against the isDigit predicate anyway, why not use them to build the numeric value immediately? This eliminates the intermediate list. However, every backtracking parser is slow by definition. If you want the maximum possible speed, consider a hand-written lexer (this is not too hard) and possibly Happy to generate the parser. BTW, how much UTF-16 text is around? I never found any on the wild web.
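A minimal sketch of the hand-coded approach described above, assuming the input is a strict Data.Text value. The name parseNatural is hypothetical and is not part of parsec; the sketch only illustrates accumulating the number as each character is unpacked, so that no intermediate String of digits is built.

```haskell
import Data.Char (digitToInt, isDigit)
import qualified Data.Text as T

-- | Hypothetical helper (not a parsec function): consume a leading run of
-- decimal digits and return the value together with the unconsumed input.
-- Returns Nothing when the input does not start with a digit.
parseNatural :: T.Text -> Maybe (Integer, T.Text)
parseNatural input =
  case T.uncons input of
    Just (c, rest) | isDigit c -> Just (go (digitValue c) rest)
    _                          -> Nothing
  where
    -- Each character is unpacked once, tested, and folded straight into
    -- the accumulator; `seq` keeps the accumulator evaluated at each step.
    go acc t = acc `seq` case T.uncons t of
      Just (c, rest) | isDigit c -> go (acc * 10 + digitValue c) rest
      _                          -> (acc, t)

    digitValue :: Char -> Integer
    digitValue = toInteger . digitToInt
```

For example, parseNatural (T.pack "123abc") yields Just (123, T.pack "abc"), and an input with no leading digit yields Nothing. Whether this actually beats the parsec version would have to be measured on the real input.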

On 11-01-13 02:07 PM, Permjacov Evgeniy wrote:
> BTW, how much UTF-16 text is around? I never found any on the wild web.

There is a lot of UTF-16 text in memory chips when you use Windows or Java. In the case of Windows there is also a lot on disk platters, as file names. A scanning electron microscope may reveal it.
participants (2): Albert Y. C. Lai, Permjacov Evgeniy