Re: Numeric read seems too strict

By the way, I believe we should be able to read numbers more efficiently by parsing them directly instead of lexing first. We have to deal with parentheses, white space, and signs uniformly for all number types. Then specialized foldl'-style code *should* be able to parse integral and fractional numbers faster than any lex-first scheme.

On Sep 12, 2016 1:53 PM, "David Feuer" wrote:

> I noticed the other day that
>
>     readMaybe (fix ('a':)) :: Maybe Double
>
> is an infinite loop. The problem is that the lexer doesn't know that it's
> expected to lex a number. It just keeps scanning and scanning to get to
> the end of the endless token. Shall we fix this?
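As a hedged illustration of the fail-fast behavior being proposed (readIntMaybe and unsignedInt are invented names for this sketch, not anything in base), a parser that decides digit by digit rejects the infinite input immediately instead of scanning forever:

```haskell
import Data.Char (digitToInt, isDigit)
import Data.Function (fix)

-- Hypothetical fail-fast integral reader. It inspects one character at a
-- time and gives up as soon as the input cannot be a number, so even the
-- infinite string (fix ('a':)) is rejected after a single character.
readIntMaybe :: String -> Maybe Integer
readIntMaybe ('-':cs) = negate <$> unsignedInt cs
readIntMaybe cs       = unsignedInt cs

unsignedInt :: String -> Maybe Integer
unsignedInt (c0:cs0)
  | isDigit c0 = go (toInteger (digitToInt c0)) cs0
  where
    -- Horner-style accumulation; seq keeps the accumulator strict.
    go acc (c:cs)
      | isDigit c = let acc' = acc * 10 + toInteger (digitToInt c)
                    in acc' `seq` go acc' cs
      | otherwise = Nothing  -- trailing garbage: whole-string reads fail
    go acc [] = Just acc
unsignedInt _ = Nothing
```

Because the first character of `fix ('a':)` is not a sign or digit, `readIntMaybe (fix ('a':))` returns Nothing without forcing any more of the list.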

Read and show are supposed to be simple, and should perhaps at least have well-behaved asymptotic performance, but they are ultimately for simple debugging and/or pasting back into source. Granted, they also wind up being used to serialize small configs that are human-manipulable or easy to paste back into source.

On Sep 12, 2016 2:03 PM, "David Feuer" wrote:

> By the way, I believe we should be able to read numbers more efficiently
> by parsing them directly instead of lexing first. [...]

_______________________________________________
Libraries mailing list
Libraries@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries

This may be true, but I don't see how it's relevant.

On Sep 14, 2016 2:14 PM, "Carter Schonwald" wrote:

> Read and show are supposed to be simple and perhaps at least have
> well-behaved asymptotic performance, but ultimately are for simple
> debugging and/or pasting back into source. [...]

On Mon, Sep 12, 2016 at 11:03 AM, David Feuer wrote:

> By the way, I believe we should be able to read numbers more efficiently
> by parsing them directly instead of lexing first. [...]

I follow the part about parentheses and negations, but I'm not sure I get the rest of what you mean. E.g., I'm not sure how any parser could be faster than what bytestring-lexing does for Fractional and Integral types (ignoring the unoptimized hex and octal functions). What am I missing?

--
Live well,
~wren

Instead of scanning first (in lexing) to find the end of the number and then scanning the string again to calculate the number, start to calculate once the first digit appears.

On Oct 1, 2016 10:07 PM, "wren romano" wrote:

> I follow the part about parentheses and negations, but I'm not sure I get
> the rest of what you mean. [...]

On 2 October 2016 at 14:34, David Feuer wrote:

> Instead of scanning first (in lexing) to find the end of the number and
> then scanning the string again to calculate the number, start to
> calculate once the first digit appears.

As in multiply the current sum by 10 before adding each new digit?

--
Ivan Lazar Miljenovic
Ivan.Miljenovic@gmail.com
http://IvanMiljenovic.wordpress.com
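What Ivan describes is Horner's rule. A minimal sketch (digitsToInteger is a hypothetical helper that assumes an all-digit input, not a library function):

```haskell
import Data.Char (digitToInt)
import Data.List (foldl')

-- Horner's rule: "1234" is ((1*10 + 2)*10 + 3)*10 + 4, so a strict
-- left fold touches each digit exactly once, with no second pass.
digitsToInteger :: String -> Integer
digitsToInteger = foldl' (\acc c -> acc * 10 + toInteger (digitToInt c)) 0
```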

Yeah, that. With a paren count and an accumulator, and for fractional numbers some care around the decimal point or slash, we can look at each digit just once. Fast/lazy failure would be a pleasant side effect of running a numbers-only process from top to bottom. Yes, Read is supposed to read things that look like Haskell expressions, but it's really not a Haskell parser, and pretending it is only hurts.

On Oct 2, 2016 12:07 AM, "Ivan Lazar Miljenovic" wrote:

> As in multiply the current sum by 10 before adding each new digit?
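The one-pass handling of the decimal point might look like the following sketch (readFracMaybe is an invented name; exponents, Rational slashes, hex, and parentheses are all omitted, and the final division is not correctly rounded for every input, so a real implementation needs more care):

```haskell
import Data.Char (digitToInt, isDigit)

-- Hypothetical one-pass reader for plain decimal fractions ("12.5",
-- "-0.25"). It keeps an Integer accumulator plus a count of digits seen
-- after the point, dividing once at the end.
readFracMaybe :: String -> Maybe Double
readFracMaybe ('-':cs) = negate <$> unsignedFrac cs
readFracMaybe cs       = unsignedFrac cs

unsignedFrac :: String -> Maybe Double
unsignedFrac cs0@(c0:_)
  | isDigit c0 = whole 0 cs0
  where
    -- Digits before the point: Horner accumulation, forced with seq.
    whole acc (c:cs)
      | isDigit c = let acc' = acc * 10 + toInteger (digitToInt c)
                    in acc' `seq` whole acc' cs
    whole acc ('.':c:cs)
      | isDigit c = frac acc 0 (c:cs)   -- a point must be followed by a digit
    whole acc [] = Just (fromInteger acc)
    whole _ _    = Nothing

    -- Digits after the point: same accumulator, counting decimal places.
    frac acc n (c:cs)
      | isDigit c = let acc' = acc * 10 + toInteger (digitToInt c)
                        n'   = n + 1 :: Int
                    in acc' `seq` n' `seq` frac acc' n' cs
    frac acc n [] = Just (fromInteger acc / 10 ^ n)
    frac _ _ _    = Nothing
unsignedFrac _ = Nothing
```

Every character is inspected exactly once, and a malformed suffix (e.g. "1.") fails as soon as it is seen.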

Do we have benchmarks for your proposed change? Does it handle hex and binary formats too?

On Sunday, October 2, 2016, David Feuer wrote:

> Yeah, that. With a paren count and an accumulator and for fractional
> numbers some care around the decimal point or slash, we can look at each
> digit just once. [...]

On Sat, Oct 1, 2016 at 8:34 PM, David Feuer wrote:

> Instead of scanning first (in lexing) to find the end of the number and
> then scanning the string again to calculate the number, start to
> calculate once the first digit appears.

Ah, yes. bytestring-lexing does that (among numerous other things). It does save a second pass over the characters, but I'm not sure what proportion of the total slowdown of typical parser combinators is actually due to the second pass, as opposed to other problems with the typical "how hard can it be" lexers/parsers people knock out. Given the multitude of other problems (e.g., using Integer or other expensive types throughout the computation, not forcing things often enough to prevent thunks and stack depth, etc.), I'm not sure it's legit to call it a "parser vs. lexer" issue.

--
Live well,
~wren
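wren's caveats can be made concrete. As a hedged illustration (hypothetical helpers written for this sketch, not bytestring-lexing code), compare accumulating in Integer without forcing against accumulating strictly in a machine Int:

```haskell
import Data.Char (digitToInt)

-- Integer accumulator, no forcing: every digit pays for bignum
-- arithmetic, and without optimization the accumulator can pile up as a
-- chain of unevaluated thunks.
slowDigits :: String -> Integer
slowDigits = go 0
  where
    go acc (c:cs) = go (acc * 10 + toInteger (digitToInt c)) cs
    go acc []     = acc

-- Int accumulator, forced at every step: one machine word throughout.
-- (A real reader must detect overflow before it happens.)
fastDigits :: String -> Int
fastDigits = go 0
  where
    go acc (c:cs) = let acc' = acc * 10 + digitToInt c
                    in acc' `seq` go acc' cs
    go acc []     = acc
```

Both loops make a single pass, so any speed difference between them comes from the accumulator's representation and strictness rather than from lexing twice, which is wren's point.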
participants (4)

- Carter Schonwald
- David Feuer
- Ivan Lazar Miljenovic
- wren romano