Thanks for the encouragement Rodrigo! I'll follow the process and hope to open a ticket soon.
Viktor Dukhovni (2025-Jul-21, excerpt):
It is also fair to point out that once an Int or other bounded integral type is read, arithmetic with that type (addition, subtraction and multiplication) silently overflows. And so silent overflow in `read` is not inconsistent with the type's semantics.
I see parsing as a boundary between an outside world (throwing text at me) and an inside world, where I have programmed some algorithm. As the programmer, it is my responsibility to ensure that the types are chosen so that the algorithm works correctly, ideally on any accepted input, i.e., I have to guarantee that no inadvertent overflow happens in this inside world. However, calculating away based on misinterpreted input will lead to invalid results.
Viktor Dukhovni (2025-Jul-21, excerpt):
That said, if various middleware libraries hide overflows, because under the covers they're using `read`, that could be a problem, so we do want the ecosystem at large to make sensible choices about when silent overflow may or may not be appropriate. Perhaps that means having both wrapping and overflow-checked implementations available, and clear docs with each about its behaviour and the corresponding alternative.
I did not realise this clearly enough before, but have elaborated a bit on Haskell-cafe [1]. We do have unbounded `read :: String -> Integer` and silently overflowing `fromInteger :: Integer -> Word8`, which can be combined if overflow is desired. This follows the idea to be explicit about dangerous things. In addition, we have `read :: String -> Word8` and company, which I'd like to fix.
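To make the "be explicit about dangerous things" point concrete, here is a minimal sketch for `Word8`. The names `readWrapping` and `readChecked` are hypothetical and only serve as illustration; they are not taken from [2] or from `base`.

```haskell
import Data.Word (Word8)
import Text.Read (readMaybe)

-- Explicit wrap-around: parse as unbounded Integer, then truncate on
-- purpose with fromInteger.  (Hypothetical helper name.)
readWrapping :: String -> Maybe Word8
readWrapping s = fromInteger <$> readMaybe s

-- Checked alternative: reject anything outside the range of Word8.
-- (Hypothetical helper name.)
readChecked :: String -> Maybe Word8
readChecked s = do
  n <- readMaybe s :: Maybe Integer
  if n >= toInteger (minBound :: Word8) && n <= toInteger (maxBound :: Word8)
    then Just (fromInteger n)
    else Nothing
```

With these definitions, `readWrapping "256"` gives `Just 0`, whereas `readChecked "256"` gives `Nothing`. Both still inherit the input syntax accepted by `read` for `Integer`.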
A few quick observations about [2]:
Thank you =)
- It disallows explicit leading "+" (just like "read", but perhaps that should be tolerated).
Yes, it probably should not be that strict. For my own projects I assumed it easier to make it more forgiving later, than the other way round. There really should be consensus on whether or not leading `+` or `0` should be allowed. But these are fixes to make towards the end, I guess.
- It disallows multiple leading zeros, perhaps these should be tolerated.
- It disallows "-0", perhaps these should be tolerated, as well as "-0000", "-000001", ... (With lazy ByteStrings, which might never terminate, there is a generous, but sensible limit on the number of leading zeros allowed).
I ruled this out because I wanted a simple guarantee for termination. Your idea of “generous, but sensible” sounds compelling: the leading `0`s can be consumed in constant space, we need not keep them.
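A rough sketch of how a bounded number of leading zeros could be skipped in constant space, here on a lazy `String` for simplicity. The function `skipZeros` and its `limit` parameter are assumptions made for illustration, not code from [2].

```haskell
-- Skip leading '0' characters, but give up once more than `limit` of
-- them have been seen, so that an endless stream of zeros terminates.
-- Returns the number of zeros skipped together with the remaining input.
-- (Hypothetical helper; the limit is chosen by the caller.)
skipZeros :: Int -> String -> Maybe (Int, String)
skipZeros limit = go 0
  where
    go :: Int -> String -> Maybe (Int, String)
    go n ('0' : rest)
      | n >= limit = Nothing            -- too many zeros: reject
      | otherwise  = go (n + 1) rest    -- the guard forces n, no thunks pile up
    go n rest = Just (n, rest)
```

The caller still has to treat "some zeros skipped, no further digits" as the value zero. On an infinite input such as `repeat '0'`, this returns `Nothing` after `limit` steps instead of looping.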
- One way to avoid difficulties with handling negative minBound is to parse signed values via the corresponding unsigned type, which can accommodate `-minBound` as a positive value, and then negate the final result. This makes it possible to share the low-level digit-by-digit code between the positive and negative cases.
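One possible reading of this suggestion, sketched for `Int8` with `Word8` as its unsigned counterpart: the magnitude is accumulated in `Word8`, where 128 is representable, and the sign is applied only during the final conversion. The names `parseMagnitude` and `parseInt8` are hypothetical and not taken from [2]. The sketch does not restrict leading zeros.

```haskell
import Data.Char (isDigit, ord)
import Data.Int (Int8)
import Data.Word (Word8)

-- Parse a non-empty run of decimal digits into a Word8 that must not
-- exceed `limit` (assumed to be at least 9).  The bound is checked
-- before each step, so the accumulator itself never wraps.
parseMagnitude :: Word8 -> String -> Maybe Word8
parseMagnitude _ [] = Nothing
parseMagnitude limit cs = go 0 cs
  where
    go acc [] = Just acc
    go acc (c : rest)
      | not (isDigit c)            = Nothing
      | acc > (limit - d) `div` 10 = Nothing   -- acc * 10 + d would exceed limit
      | otherwise                  = go (acc * 10 + d) rest
      where
        d = fromIntegral (ord c - ord '0')

-- Signed parser built on the shared digit loop.  The magnitude of
-- minBound (128) fits in Word8 although it does not fit in Int8.
parseInt8 :: String -> Maybe Int8
parseInt8 ('-' : ds) = fromInteger . negate . toInteger <$> parseMagnitude 128 ds
parseInt8 ds         = fromIntegral <$> parseMagnitude 127 ds
```

Here `parseInt8 "-128"` yields `Just (-128)`, while `parseInt8 "128"` and `parseInt8 "-129"` both yield `Nothing`. Going through `Integer` for the final negation is a deliberate choice in this sketch, so that no step relies on wrap-around.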
How do you mean? I did not get this “accommodate `-minBound` as a positive value” right. My initial approach, `char '-' >> negate <$> parseUnsigned (negate minBound)`, fails exactly because the negation of the lower bound may not be (read: is usually not) within the upper bound and thus wraps around; e.g., `negate (minBound :: Int8)` incorrectly yields `-128`, because the upper bound is `127`.
Viktor Dukhovni (2025-Jul-21, excerpt):
If parsing of Integer and Natural is also in scope […]
No, not at all. I have no reservations against `read` for the unbounded types. That should be left alone.

Cheers
Stefan

[1]: https://mail.haskell.org/pipermail/haskell-cafe/2025-July/137162.html
[2]: https://github.com/s5k6/robust-int

--
Stefan Klinger, Ph.D. -- computer scientist
http://stefan-klinger.de
https://github.com/s5k6
I prefer receiving plain text messages, not exceeding 32kB.