Re: [Haskell-cafe] naturally, length :: a -> Int

Date: Thu, 4 Mar 2021 11:36:14 +0100 From: Ben Franksen
To: haskell-cafe@haskell.org Subject: Re: [Haskell-cafe] naturally, length :: a -> Int Message-ID: Content-Type: text/plain; charset=utf-8 Am 03.03.21 um 16:17 schrieb Olaf Klinke:
Which bugs can be caught at compile-time by having length return natural numbers? Regarding space, Int instead of Word only wastes the sign bit, doesn't it?
I like the idea that Johannes Waldmann and Jaro Reinders brought up: Why is length :: Foldable a => a -> Int so convenient? Short answer: Because of "affine" things like `availableSpace - length xs'.
There are indeed two types involved here, as the foundation package points out: relative offsets (like a tangent space?) and absolute counts. Think of NominalDiffTime versus UTCTime, or the two interpretations of vectors as points/movements in space.
In this light one could regard the current length as "the relative offset of the end of the list" which can readily be subtracted from another relative offset. In mathematical terms: Int the free group over the monoid of cardinal lengths.
Hm, interesting point.
If we do embrace that viewpoint, then I'd say we should go all the way and interpret indices modulo (non-negative) structure size! This makes (safe) indexing total (for non-empty structures) and allows things like xs !! (-1) == last xs as in Perl and some other languages. Unsafe indexing (as in the vector package) could remain as is for performance critical code.
Cheers Ben
I like Haskell particularly for not being like Matlab, where virtually any well-bracketed indexing syntax does produce a result, but not necessarily what you intended or expected. Besides, for some structures length is expensive but index is not so much, so I wonder whether one can modulo-index into a lazy container without evaluating its entire spine. I just wanted to understand/justify why Int is so convenient, and the Time-DiffTime analogy sprang to mind. Following the free group route, one immediately sees that there is an embedding from cardinalities to offsets, but no inverse. We could, of course, bring other categories into the picture such as ordered sets. Then there is an embedding-projection pair between Int and Natural, where all Ints < 0 are mapped to 0. The base functions such as take and drop adhere to this interpretation. Olaf

Am 04.03.21 um 14:17 schrieb Olaf Klinke:
Am 03.03.21 um 16:17 schrieb Olaf Klinke:
I like the idea that Johannes Waldmann and Jaro Reinders brought up: Why is length :: Foldable a => a -> Int so convenient? Short answer: Because of "affine" things like `availableSpace - length xs'.
There are indeed two types involved here, as the foundation package points out: relative offsets (like a tangent space?) and absolute counts. Think of NominalDiffTime versus UTCTime, or the two interpretations of vectors as points/movements in space.
In this light one could regard the current length as "the relative offset of the end of the list" which can readily be subtracted from another relative offset. In mathematical terms: Int the free group over the monoid of cardinal lengths.
Hm, interesting point.
If we do embrace that viewpoint, then I'd say we should go all the way and interpret indices modulo (non-negative) structure size! This makes (safe) indexing total (for non-empty structures) and allows things like xs !! (-1) == last xs as in Perl and some other languages. Unsafe indexing (as in the vector package) could remain as is for performance critical code.
Cheers Ben
I like Haskell particularly for not being like Matlab, where virtually any well-bracketed indexing syntax does produce a result, but not necessarily what you intended or expected.
Okay, so you expect the result of list !! (-1) vector ! (-1) to be bottom a.k.a. "error: index out of range". Whether a non-result like that corresponds closer to what you expected or intended depends pretty much on your expectations and intentions. Indexing modulo size/length is well-defined, logically sound, easy to understand and remember, and therefore (IMHO) very practical. Of course YMMV.
Besides, for some structures length is expensive but index is not so much, so I wonder whether one can modulo-index into a lazy container without evaluating its entire spine.
One cannot, naturally, index into a list with a negative index or with one that is greater than or equal to the length w/o evaluating its entire spine. So what? If you write "abc" !! 3 this will throw an error nowadays, but only after traversing the full spine. Now you may argue that [0..] !! (-1) at least crashes immediately, whereas with my proposal it would first eat up all your memory. Okay, not so nice. However, note that replacing Int with Word for size/length/indexing has exactly the same disadvantage, e.g. with [0..] !! (0 - 1) Cheers Ben
participants (2)
-
Ben Franksen
-
Olaf Klinke