Data.IntMap.Strict.findWithDefault is too strict

Hi, when trying out Data.IntMap.Strict instead of Data.IntMap, I've noticed (after some confusion) that Data.IntMap.Strict.findWithDefault evaluates the default value, even if it is not used! I usually pass an error-call to IntMap.findWithDefault to get a more informative crash-message than by using IntMap.! Evaluating (error "...") surely crashes, but the evaluation should only happen when the key is not in the map! I can easily work around this (by using fromMaybe and IntMap.lookup), but it is still worth at least being documented if crucial for performance. (the same applies to Data.Map) Cheers Christian

Christian Maeder
Hi,
when trying out Data.IntMap.Strict instead of Data.IntMap, I've noticed (after some confusion) that Data.IntMap.Strict.findWithDefault evaluates the default value, even if it is not used!
I usually pass an error-call to IntMap.findWithDefault to get a more informative crash-message than by using IntMap.!
I have also found this to be problematic in the past. Is there a compelling reason to keep the current behavior? I would find this function far more useful if the default value were lazily evaluated. Cheers, - Ben

Hi,
On Thu, Dec 13, 2012 at 7:49 AM, Ben Gamari
Christian Maeder
writes: when trying out Data.IntMap.Strict instead of Data.IntMap, I've noticed (after some confusion) that Data.IntMap.Strict.findWithDefault evaluates the default value, even if it is not used!
I usually pass an error-call to IntMap.findWithDefault to get a more informative crash-message than by using IntMap.!
I have also found this to be problematic in the past. Is there a compelling reason to keep the current behavior? I would find this function far more useful if the default value were lazily evaluated.
The behavior is at least documented. From Data.IntMap.Strict: "This module satisfies the following strictness properties: Key and value arguments are evaluated to WHNF; Keys and values are evaluated to WHNF before they are stored in the map." We agonized over this choice when we made it. The reasoning for the current behavior is that these strictness properties help you reason about whether values passed to the API are evaluated or not, as that no longer depend on the contents of them map. We're not really looking to support passing undefined values in the API. -- Johan

On 13.12.2012 17:04, Johan Tibell wrote:
Hi,
On Thu, Dec 13, 2012 at 7:49 AM, Ben Gamari
wrote: Christian Maeder
writes: when trying out Data.IntMap.Strict instead of Data.IntMap, I've noticed (after some confusion) that Data.IntMap.Strict.findWithDefault evaluates the default value, even if it is not used!
I usually pass an error-call to IntMap.findWithDefault to get a more informative crash-message than by using IntMap.!
I have also found this to be problematic in the past. Is there a compelling reason to keep the current behavior? I would find this function far more useful if the default value were lazily evaluated.
The behavior is at least documented. From Data.IntMap.Strict:
"This module satisfies the following strictness properties:
Key and value arguments are evaluated to WHNF; Keys and values are evaluated to WHNF before they are stored in the map."
Well, but the default value of findWithDefault is not stored in the map. So why does it have to be evaluated eagerly, even before know you will use it?
We agonized over this choice when we made it. The reasoning for the current behavior is that these strictness properties help you reason about whether values passed to the API are evaluated or not, as that no longer depend on the contents of them map. We're not really looking to support passing undefined values in the API.
-- Johan
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
-- Andreas Abel <>< Du bist der geliebte Mensch. Theoretical Computer Science, University of Munich Oettingenstr. 67, D-80538 Munich, GERMANY andreas.abel@ifi.lmu.de http://www2.tcs.ifi.lmu.de/~abel/

Even a strict language like scala uses call-by-name arguments for the
counterpart to this function in its collections library. It seems to be
overstepping the mandate of being a "strict map library" to try to pretend
that Haskell is a strict language.
On Thu, Dec 13, 2012 at 11:48 AM, Johan Tibell
On Thu, Dec 13, 2012 at 8:13 AM, Andreas Abel
wrote: Key and value arguments are evaluated to WHNF
For consistency. The API is modeled as if we had call-by-value.
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

On Thu, Dec 13, 2012 at 7:11 PM, Daniel Peebles
Even a strict language like scala uses call-by-name arguments for the counterpart to this function in its collections library. It seems to be overstepping the mandate of being a "strict map library" to try to pretend that Haskell is a strict language.
We already went through all this when we made the decision. I'm not really inclined to change the decision unless there's new information that should cause us to reconsider it. -- Johan

It makes no sense to evaluate all branches of a case distinction before deciding which branch to take. Therefore it is not possible (or more difficult) in a strict language to user-define functions like if-then-else. In fact, I consider findWithDefault to be just an abbreviation for a case distinction. Even if the default value is not bottom, it makes no sense to evaluate it in the possibly many cases, when it is not used. The current implementation of strict maps does not work well as a replacement for lazy maps and if it does it is less efficient than possible, as it possibly evaluates something unneeded or merely checks if the default value is already evaluated. C. Am 14.12.2012 06:59, schrieb Johan Tibell:
On Thu, Dec 13, 2012 at 7:11 PM, Daniel Peebles
wrote: Even a strict language like scala uses call-by-name arguments for the counterpart to this function in its collections library. It seems to be overstepping the mandate of being a "strict map library" to try to pretend that Haskell is a strict language.
We already went through all this when we made the decision. I'm not really inclined to change the decision unless there's new information that should cause us to reconsider it.
-- Johan

On 13.12.12 5:48 PM, Johan Tibell wrote:
On Thu, Dec 13, 2012 at 8:13 AM, Andreas Abel
wrote: Key and value arguments are evaluated to WHNF
For consistency. The API is modeled as if we had call-by-value.
"For consistency" is monolithic statement. Maybe there are items in this bundle that do not need to be there. Which theorem breaks if you make the default argument in findWithDefault lazy? Which programs segfault then which did not segfault with an eager treatment of the default argument? Cheers, Andreas -- Andreas Abel <>< Du bist der geliebte Mensch. Theoretical Computer Science, University of Munich Oettingenstr. 67, D-80538 Munich, GERMANY andreas.abel@ifi.lmu.de http://www2.tcs.ifi.lmu.de/~abel/

Hey all, Just for reference, here are the two previous threads on library@ about the issue. They're well worth the read. http://www.haskell.org/pipermail/libraries/2011-October/016991.html http://www.haskell.org/pipermail/libraries/2011-November/017178.html I also want to differ from Johan; from my reading of the threads, the issue was settled less conclusively than I would have liked. In particular, extrapolating from the reports of these users, it seems that although users do want to know whether or not their thunks are evaluated or not, they *only* seem to care about it if the thunk actually is *retained* by the data structure. That is to say, users do *not* usually think of maps functions as a way of seq'ing things. And at least in the case of findWithDefault, it's trivial to seq the default argument *if* the user wanted to. I think we can formalize this intuition into a nice reasoning principle that admits "more lazy" strict maps. Cheers, Edward Excerpts from Johan Tibell's message of Thu Dec 13 08:04:09 -0800 2012:
Hi,
On Thu, Dec 13, 2012 at 7:49 AM, Ben Gamari
wrote: Christian Maeder
writes: when trying out Data.IntMap.Strict instead of Data.IntMap, I've noticed (after some confusion) that Data.IntMap.Strict.findWithDefault evaluates the default value, even if it is not used!
I usually pass an error-call to IntMap.findWithDefault to get a more informative crash-message than by using IntMap.!
I have also found this to be problematic in the past. Is there a compelling reason to keep the current behavior? I would find this function far more useful if the default value were lazily evaluated.
The behavior is at least documented. From Data.IntMap.Strict:
"This module satisfies the following strictness properties:
Key and value arguments are evaluated to WHNF; Keys and values are evaluated to WHNF before they are stored in the map."
We agonized over this choice when we made it. The reasoning for the current behavior is that these strictness properties help you reason about whether values passed to the API are evaluated or not, as that no longer depend on the contents of them map. We're not really looking to support passing undefined values in the API.
-- Johan

Hi all,
Just for reference, here are the two previous threads on library@ about the issue. They're well worth the read.
http://www.haskell.org/pipermail/libraries/2011-October/016991.html http://www.haskell.org/pipermail/libraries/2011-November/017178.html
I also want to differ from Johan; from my reading of the threads, the issue was settled less conclusively than I would have liked. In particular, extrapolating from the reports of these users, it seems that although users do want to know whether or not their thunks are evaluated or not, they *only* seem to care about it if the thunk actually is *retained* by the data structure. That is to say, users do *not* usually think of maps functions as a way of seq'ing things. And at least in the case of findWithDefault, it's trivial to seq the default argument *if* the user wanted to.
I think we can formalize this intuition into a nice reasoning principle that admits "more lazy" strict maps.
The current strictness properties are 1. Key and value arguments are evaluated to WHNF; 2. Keys and values are evaluated to WHNF before they are stored in the map. The simple change that would allow evaluating only the stored keys and values would be to remove *value arguments* from the point 1, resulting in 1. Key arguments are evaluated to WHNF; 2. Keys and values are evaluated to WHNF before they are stored in the map. In fact, all keys stored in the map are evaluated (even in Data.{Map,IntMap}.Lazy), so we could simplify the rule 2. to "Values are evaluated to WHNF before they are stored in the map", leaving us with 1. Key arguments are evaluated to WHNF; 2. Values are evaluated to WHNF before they are stored in the map. Note that the point 1. is the same as Data.{Map,IntMap}.Lazy strictness property, so the Lazy and Strict versions would differ only by inclusion of point 2. Is this something users of containers would prefer to the current properties? If someone thinks so, we could create a separate proposal to see what do the users wants or if they care. Cheers, Milan

On Sat, Dec 15, 2012 at 1:39 AM, Milan Straka
Is this something users of containers would prefer to the current properties? If someone thinks so, we could create a separate proposal to see what do the users wants or if they care.
I think any proposal should specify rules for how to define strictness for all (~150 functions) in the API. We need a policy that's easy to define and remember. The original split into a Lazy and Strict was done because Haskellers, even experienced ones, used the API incorrectly (e.g. using insertWith when insertWith' was called for). -- Johan

Hi all, I thought I'd share some of the subtle nuances in designing the strictness properties of the containers API. I currently feel that we lack a good guiding principle in making this decisions. There are several different points along the lazy-to-strict spectrum. For example: # Everything lazy (including the spine) We don't use this. I think it's a bad idea. It can make an insert finish in O(1) just to have a subsequent null check take O(n*log n) time. # Spine strict Quite inefficient as keys aren't unboxed in e.g. lookup and insert. # Strict in keys inserted into the map Also ineffecient as keys still can't be unboxed as there's usually a base case in each loop where the key isn't used. # Strict in key arguments This is what that Lazy API uses. Makes it possible to unbox keys. Good trade-off between performance and expressibility. Means that you can't write: lookup undefined empty which "Strict in keys inserted into the map" would have let you write. # Strict in values inserted into the map Minimum required to avoid space leaks for e.g. a map of Int values that are modified in a loop. # Strict in value arguments This is what the Strict API uses. This allows you to not do any external forcing of values passed to the containers API if you want to make sure they are evaluated. Cheers, Johan

Hi Johan, I think even if you - made all keys and values strict that could end up in the map - made all keys and values strict that could end up being compared to keys/values from a map then still there would be no reason to make the default argument in findWithDefault strict. Cheers, Andreas On 16.12.12 10:10 AM, Johan Tibell wrote:
Hi all,
I thought I'd share some of the subtle nuances in designing the strictness properties of the containers API. I currently feel that we lack a good guiding principle in making this decisions. There are several different points along the lazy-to-strict spectrum. For example:
# Everything lazy (including the spine)
We don't use this. I think it's a bad idea. It can make an insert finish in O(1) just to have a subsequent null check take O(n*log n) time.
# Spine strict
Quite inefficient as keys aren't unboxed in e.g. lookup and insert.
# Strict in keys inserted into the map
Also ineffecient as keys still can't be unboxed as there's usually a base case in each loop where the key isn't used.
# Strict in key arguments
This is what that Lazy API uses. Makes it possible to unbox keys. Good trade-off between performance and expressibility. Means that you can't write:
lookup undefined empty
which "Strict in keys inserted into the map" would have let you write.
# Strict in values inserted into the map
Minimum required to avoid space leaks for e.g. a map of Int values that are modified in a loop.
# Strict in value arguments
This is what the Strict API uses. This allows you to not do any external forcing of values passed to the containers API if you want to make sure they are evaluated.
Cheers, Johan
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries
-- Andreas Abel <>< Du bist der geliebte Mensch. Theoretical Computer Science, University of Munich Oettingenstr. 67, D-80538 Munich, GERMANY andreas.abel@ifi.lmu.de http://www2.tcs.ifi.lmu.de/~abel/

Hi,
On Sun, Dec 16, 2012 at 6:47 AM, Andreas Abel
I think even if you
- made all keys and values strict that could end up in the map - made all keys and values strict that could end up being compared to keys/values from a map
then still there would be no reason to make the default argument in findWithDefault strict.
We still need a clear policy how to handle these kind of function and other functions that don't directly insert things into maps (e.g. folds). -- Johan

Excerpts from Johan Tibell's message of Mon Dec 17 05:17:22 +0800 2012:
We still need a clear policy how to handle these kind of function and other functions that don't directly insert things into maps (e.g. folds).
Strictness of accumulators is something of an orthogonal issue to strictness of the maps; i.e. a strict fold can be useful for both strict and lazy maps. Cheers, Edward

On Sun, Dec 16, 2012 at 5:02 PM, Edward Z. Yang
Excerpts from Johan Tibell's message of Mon Dec 17 05:17:22 +0800 2012:
We still need a clear policy how to handle these kind of function and other functions that don't directly insert things into maps (e.g. folds).
Strictness of accumulators is something of an orthogonal issue to strictness of the maps; i.e. a strict fold can be useful for both strict and lazy maps.
I know. I'm trying to communicate that we ought to be thinking about the library holistically, not from the perspective of a single function that happens to not fulfill someones use case at some given day.
participants (7)
-
Andreas Abel
-
Ben Gamari
-
Christian Maeder
-
Daniel Peebles
-
Edward Z. Yang
-
Johan Tibell
-
Milan Straka