naturally, length :: a -> Int

Dear Cafe - let's fight Int. In a lot of code I write (for research, teaching, production), a lot of numbers are in fact natural numbers - because I am counting (e.g., number of elements in a collection) or pointing into what I just counted (e.g., Data.Set.elemAt). But Haskell's "most natural" type is - Int. At least that's my impression from current usage in libraries and teaching examples (see functions length, replicate, ... ). I've developed a horror of reading and writing "length something :: Int". As I will be teaching an introductory class on FP soon, I am looking for ways to handle this. One option is presented in Bird/Gibbons: Algorithm Design with Haskell, which has "type Nat = Int" on page 6. Actually, that's the very first code example that appears in the book! The footnote there refers to a remark in Data.Word that has since been dropped? Now there is Numeric.Natural, (cf. https://mail.haskell.org/pipermail/libraries/2014-November/024203.html ) and numeric literals are already overloaded, so we can just use that? Well, yes and no - we still have to write a type conversion each time we use some library function that insists on Int, or, write cumbersome "genericLength". Also (but that is the lesser concern), Numeric.Natural is for arbitrary-length numbers (corresponding to Integer), while I might sometimes know that numbers are representable in a machine word, and I don't want to pay for the implied pattern match (data Natural = NatS Word | NatJ BigNat). I am willing to pay for underflow checks (on subtraction), and for overflow checks (if there's an upper bound). If I want to live risky (no checks) I could use Data.Word. So, what do you do? (Yes, Peano numerals ...) Or, how do we convince ourselves (and our students) of keeping the current status? NB: here is another practical argument for Natural (from production code, an auto-grading system for student homework) We use datatype-generic parsers heavily. For data Foo = Foo { ... :: Nat ... }, bad input will be rejected by the parser already, and I need no extra checks later on. Less code! I sometimes even use "Pos(itive)". That's all fine and dandy - until I call some library function... - J.W.

On Mon, 1 Mar 2021, Johannes Waldmann wrote:
So, what do you do? (Yes, Peano numerals ...) Or, how do we convince ourselves (and our students) of keeping the current status?
A poor man's definition of peano numbers is: type Peano = [()] 'length xs' becomes: map (const ()) xs or () <$ xs or void xs (+) becomes (++) (*) becomes liftA2 const (-) becomes \xs ys -> foldl (flip drop) ys (1<$xs) My experience with Word is that it makes things even worse. It is neither compatible with Int nor has it bound checks. It happens pretty easily that in an intermediate result is negative although the whole expression of type Word is always non-negative. I would prefer to use Liquid Haskell and restrict Int to non-negative numbers. This would solve both problems.

On Mon, Mar 01, 2021 at 05:45:07PM +0100, Johannes Waldmann wrote:
I've developed a horror of reading and writing "length something :: Int".
If you're willing to tolerate the small overhead of Natural: $ ghci λ> import Data.Foldable (Foldable, foldl') λ> import GHC.Natural λ> len :: Foldable t => t a -> Natural; len = foldl' (\a _ -> a + 1) 0 λ> let x = len [1,2,3] λ> :t x x :: Natural λ> x 3 otherwise, note that `succ`, unlike (+), does bounds checks: λ> let x = maxBound :: Int λ> succ x *** Exception: Prelude.Enum.succ{Int}: tried to take `succ' of maxBound and so: λ> import Data.Foldable (Foldable, foldl') λ> len :: Foldable t => t a -> Int; len = foldl' (\a _ -> succ a) 0 λ> let x = len [1,2,3] λ> :t x x :: Int λ> x 3 on a 64-bit machine, testing to make sure the bounds check actually works would take too long, but we can cheat: λ> len :: Foldable t => t a -> Int; len = foldl' (\a _ -> succ a) (maxBound :: Int) λ> len [] 9223372036854775807 λ> len [1] *** Exception: Prelude.Enum.succ{Int}: tried to take `succ' of maxBound -- Viktor.

Am 01.03.21 um 17:45 schrieb Johannes Waldmann:
let's fight Int.
I'm in.
In a lot of code I write (for research, teaching, production), a lot of numbers are in fact natural numbers - because I am counting (e.g., number of elements in a collection) or pointing into what I just counted (e.g., Data.Set.elemAt).
But Haskell's "most natural" type is - Int. At least that's my impression from current usage in libraries and teaching examples (see functions length, replicate, ... ).
I've developed a horror of reading and writing "length something :: Int".
Same here.
If I want to live risky (no checks) I could use Data.Word.
I think Data.Word is not so bad. Does it come with a guarantee that maxBound=2^n-1 and all operations being treated module 2^n, for some natural number n? That would be fine for most applications I guess. It would be nice if one could change the semantics of under- or overflow for Data.Word with a compiler flag, e.g. so that it throws an exception.
That's all fine and dandy - until I call some library function...
I would support a move Int->Word in all libraries where that makes sense. But I doubt it will happen. Cheers Ben

I think non-wrapping Word should be a different type or two. I'm quite fond
of the ring laws myself. For many practical purposes, Word64 can stand in
for Natural. A non-wrapping variant (newtype, perhaps?) might be useful.
Another occasionally useful one is Word63, which is good for bounds-checked
operations.
On Tue, Mar 2, 2021, 7:53 AM Ben Franksen
Am 01.03.21 um 17:45 schrieb Johannes Waldmann:
let's fight Int.
I'm in.
In a lot of code I write (for research, teaching, production), a lot of numbers are in fact natural numbers - because I am counting (e.g., number of elements in a collection) or pointing into what I just counted (e.g., Data.Set.elemAt).
But Haskell's "most natural" type is - Int. At least that's my impression from current usage in libraries and teaching examples (see functions length, replicate, ... ).
I've developed a horror of reading and writing "length something :: Int".
Same here.
If I want to live risky (no checks) I could use Data.Word.
I think Data.Word is not so bad. Does it come with a guarantee that maxBound=2^n-1 and all operations being treated module 2^n, for some natural number n? That would be fine for most applications I guess.
It would be nice if one could change the semantics of under- or overflow for Data.Word with a compiler flag, e.g. so that it throws an exception.
That's all fine and dandy - until I call some library function...
I would support a move Int->Word in all libraries where that makes sense.
But I doubt it will happen.
Cheers Ben
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

On 01/03/2021 17:45, Johannes Waldmann wrote:
Dear Cafe -
let's fight Int.
I'm in too. One first step would be to use `Word#` instead of `Int#` in ghc primops where it makes sense. This could be done as it should be transparent to most users. The natural next step would be to modify functions and classes in `base` to use appropriate types too. I don't think it's going to happen as it would require modifying the Haskell report and breaking almost every existing code/books/etc. So an alternative path would be to extract every wired-in things from `base` and to put them into a `ghc-base` package that `base`, `foundation` and other alternative preludes would depend on. This way we could write codes that don't depend on `base` and the legacy stuff it contains at all (even transitively). Could you open a ticket on GHC's bug tracker about this? Cheers, Sylvain

I think we’d also wanna add compiler suport for ints and words and signal
on over/underflow using cpu machinery . Most other approachs have pretty
fun changes in memory rep
On Tue, Mar 2, 2021 at 11:10 AM Sylvain Henry
On 01/03/2021 17:45, Johannes Waldmann wrote:
Dear Cafe -
let's fight Int.
I'm in too.
One first step would be to use `Word#` instead of `Int#` in ghc primops where it makes sense. This could be done as it should be transparent to most users.
The natural next step would be to modify functions and classes in `base` to use appropriate types too. I don't think it's going to happen as it would require modifying the Haskell report and breaking almost every existing code/books/etc.
So an alternative path would be to extract every wired-in things from `base` and to put them into a `ghc-base` package that `base`, `foundation` and other alternative preludes would depend on. This way we could write codes that don't depend on `base` and the legacy stuff it contains at all (even transitively).
Could you open a ticket on GHC's bug tracker about this?
Cheers, Sylvain
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

There was a great talk about this at PWLConf 2019 by José Manuel Calderón Trilla: "What about the Natural Numbers?" There is a recording on YouTube: https://www.youtube.com/watch?v=jFk1qpr1ytk I think we should not be so quick to give up theoretical elegance and simplicity for the sake of performance (especially not in a language like Haskell). If we would make a change then I would prefer using the lazy and possibly infinite natural numbers with pattern synonyms for successor and zero. Like the existing Integer but then lazy, so more programs will terminate, for example a program that checks if the length of a finite list is smaller than the length of an infinite list. And if we make the change we will probably also need something like Linear.Affine from the linear package with an instance for these natural numbers where the difference between two naturals is an Integer. That would hopefully make it reasonably easy to convert naturals to and from integers. I hope that some of the performance concerns can be tackled by compiler optimizations, for example, detecting when naturals are used in tight loops which can be safely replaced by machine words, similar to strictness analysis and unboxing. And of course programmers can choose to use the old unsafe machine words/ints manually.

Am 02.03.21 um 17:48 schrieb Jaro Reinders:
There was a great talk about this at PWLConf 2019 by José Manuel Calderón Trilla: "What about the Natural Numbers?" There is a recording on YouTube: https://www.youtube.com/watch?v=jFk1qpr1ytk
Thanks for the link. I read Colin's paper long ago and forgot most about it. IMO the correspondence with take and drop for lists isn't too convincing. Lists are boring, and for data structures with more interesting invariants (like Sets) size/length is no longer a monoid homomorphism. The talk is also missing out on the opportunity to hold an extended discussion about arithmetic laws like cancellation. For instance, we have (a - b) + b == a for unsigned (Data.Word, assuming operations modulo 2^bitsize), but (a .-. b) + b == a does not hold in general. Due to primary school "indoctrination" we all intuitively expect such laws to hold for the standard arithmetic operators, which is why I think they should be either modulo 2^wordsize as in C or raise an exception. And I would really like to be able to chose which of those I get. None of this preclude the addition of non-standard arithmetic operators, of course. These may be useful in certain simple situations like when dealing with lists. Cheers Ben
participants (8)
-
Ben Franksen
-
Carter Schonwald
-
David Feuer
-
Henning Thielemann
-
Jaro Reinders
-
Johannes Waldmann
-
Sylvain Henry
-
Viktor Dukhovni