
The second pass can be a huge problem when an attempted parse fails. For example, when
parsing a long string, we first look for an open parenthesis. To do so, we
lex the next token. Supposing that's the string itself, we conclude that it
is a string and not an open parenthesis, so we *throw it away and start
over*. I hope to get the paren issue fixed for good at Hac Phi. Aside from
that rather prominent one, I don't know how much the double scanning hurts,
but it certainly can't *help* in most cases. In a more typical parsing
situation, the parser would consume a stream of tokens instead of a list of
characters. We can't do that here.
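To make the wasted work concrete, here is a toy sketch of that pattern. It is my own illustration over plain String parsers, not the actual Read/ReadP machinery; the Token, lexToken, stringLit, and optionalParens names are invented for the example.

    import Data.Char (isSpace)

    -- Hypothetical token type for the sketch.
    data Token = OpenParen | StringLit String
      deriving Show

    -- Lex one token. For a long string literal this walks the whole
    -- literal just to build the token.
    lexToken :: String -> Maybe (Token, String)
    lexToken s = case dropWhile isSpace s of
      '(' : rest -> Just (OpenParen, rest)
      '"' : rest -> case break (== '"') rest of
                      (body, '"' : rest') -> Just (StringLit body, rest')
                      _                   -> Nothing
      _ -> Nothing

    -- Example payload parser: a string literal.
    stringLit :: String -> Maybe (String, String)
    stringLit s = case lexToken s of
      Just (StringLit str, rest) -> Just (str, rest)
      _                          -> Nothing

    -- The wasteful pattern: lex a whole token just to see whether it is
    -- "(", and if it is not, discard that token and re-parse the
    -- original input.
    optionalParens :: (String -> Maybe (a, String)) -> String -> Maybe (a, String)
    optionalParens p s = case lexToken s of
      Just (OpenParen, rest) -> do
        (x, rest') <- p rest
        rest''     <- expectClose rest'
        pure (x, rest'')
      _ -> p s   -- the StringLit we just built is thrown away; p re-scans it
      where
        expectClose t = case dropWhile isSpace t of
          ')' : r -> Just r
          _       -> Nothing

Here optionalParens stringLit "\"a very long string\"" lexes the entire literal once to discover it is not an open parenthesis, throws that result away, and then lexes it again inside stringLit. A parser consuming a stream of already-lexed tokens would simply inspect the next token instead of re-scanning the characters.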
On Oct 9, 2016 12:56 AM, "wren romano" wrote:

> On Sat, Oct 1, 2016 at 8:34 PM, David Feuer wrote:
>> Instead of scanning first (in lexing) to find the end of the number
>> and then scanning the string again to calculate the number, start to
>> calculate once the first digit appears.
>
> Ah, yes. bytestring-lexing does that (among numerous other things). It
> does save a second pass over the characters, but I'm not sure what
> proportion of the total slowdown of typical parser combinators is
> actually due to the second pass, as opposed to other problems with the
> typical "how hard can it be" lexers/parsers people knock out. Given the
> multitude of other problems (e.g., using Integer or other expensive
> types throughout the computation, not forcing things often enough to
> prevent thunks and stack depth, etc.), I'm not sure it's legit to call
> it a "parser vs lexer" issue.
>
> --
> Live well,
> ~wren

_______________________________________________
Libraries mailing list
Libraries@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
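For the single-pass number reading suggested in the quoted exchange, a minimal sketch (my own illustration, not bytestring-lexing's implementation; lexThenConvert and singlePass are invented names):

    {-# LANGUAGE BangPatterns #-}

    import Data.Char (isDigit, digitToInt)

    -- Two passes: first find the end of the digit run, then walk it
    -- again to compute the value.
    lexThenConvert :: String -> Maybe (Int, String)
    lexThenConvert s = case span isDigit s of
      ([], _)        -> Nothing
      (digits, rest) ->
        Just (foldl (\acc d -> acc * 10 + digitToInt d) 0 digits, rest)

    -- One pass: start accumulating as soon as the first digit appears,
    -- keeping the accumulator strict so no thunk chain builds up.
    singlePass :: String -> Maybe (Int, String)
    singlePass (c:cs) | isDigit c = Just (go (digitToInt c) cs)
      where
        go !acc (d:ds) | isDigit d = go (acc * 10 + digitToInt d) ds
        go !acc rest               = (acc, rest)
    singlePass _ = Nothing

In GHCi, singlePass "12345 rest" gives Just (12345," rest"). Using a strict Int accumulator also touches the other points raised above: it avoids both the thunk build-up and the cost of Integer, though a real lexer would still have to decide how to handle overflow.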