Lookup variant for IntMap

Hi all,

I recently noticed that lookup for IntMap favours searches that can fast-fail - if instead we are in a situation where lookups are mostly successful, then this can be wasteful (see this issue for details: https://github.com/haskell/containers/issues/794). Since both versions are useful, it would be best to offer users both variants.

I wondered if people on this list had any suggestions for a name for the new function? David Feuer also wondered which of these two behaviours should be the default. My inclination is for it to be the current version which fails faster (this also guarantees no unfortunate regressions) but it would be useful to run more substantial real-world benchmarks to figure that out (if anyone has any suggestions then do let me know).

Best wishes,
Callan
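For readers following the thread, the two strategies can be sketched on a simplified big-endian Patricia trie. This is only an illustration, not the containers source: the Trie type and every name below (lookupEarly, lookupLate, maskPrefix, and so on) are made up for this example, and keys are assumed to be non-negative to keep the bit handling short.

```haskell
import Data.Bits (bit, complement, countLeadingZeros, finiteBitSize, xor, (.&.))

type Key = Int

-- Simplified trie: Bin stores a prefix and a one-bit branching mask.
data Trie a
  = Bin !Int !Int (Trie a) (Trie a)
  | Tip !Key a
  | Nil

-- Keep only the bits of k strictly above the branching bit m.
maskPrefix :: Int -> Int -> Int
maskPrefix k m = k .&. (complement (m - 1) `xor` m)

-- True when k cannot live under a node with prefix p and mask m.
nomatch :: Key -> Int -> Int -> Bool
nomatch k p m = maskPrefix k m /= p

-- True when the branching bit of k is 0 (go left).
zero :: Key -> Int -> Bool
zero k m = k .&. m == 0

-- Highest set bit of a positive Int.
highestBit :: Int -> Int
highestBit x = bit (finiteBitSize x - 1 - countLeadingZeros x)

-- Join two subtrees whose prefixes p1 and p2 differ.
link :: Int -> Trie a -> Int -> Trie a -> Trie a
link p1 t1 p2 t2
  | zero p1 m = Bin p m t1 t2
  | otherwise = Bin p m t2 t1
  where
    m = highestBit (p1 `xor` p2)
    p = maskPrefix p1 m

insert :: Key -> a -> Trie a -> Trie a
insert k x t = case t of
  Nil -> Tip k x
  Tip ky _
    | k == ky   -> Tip k x
    | otherwise -> link k (Tip k x) ky t
  Bin p m l r
    | nomatch k p m -> link k (Tip k x) p t
    | zero k m      -> Bin p m (insert k x l) r
    | otherwise     -> Bin p m l (insert k x r)

-- Fail-fast: check the prefix at every Bin, so a miss can stop early.
lookupEarly :: Key -> Trie a -> Maybe a
lookupEarly k = go
  where
    go (Bin p m l r)
      | nomatch k p m = Nothing
      | zero k m      = go l
      | otherwise     = go r
    go (Tip ky x)
      | k == ky   = Just x
      | otherwise = Nothing
    go Nil = Nothing

-- Fail-late: follow branching bits only and compare the key at the leaf.
lookupLate :: Key -> Trie a -> Maybe a
lookupLate k = go
  where
    go (Bin _ m l r)
      | zero k m  = go l
      | otherwise = go r
    go (Tip ky x)
      | k == ky   = Just x
      | otherwise = Nothing
    go Nil = Nothing
```

Both functions return the same answers. The difference is where the work goes: lookupEarly pays for a prefix comparison at every level but can abandon a miss near the root, while lookupLate does less per level and always walks to a leaf.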

This is an interesting question. As a general rule, I tend to write code
that is optimized for the success case, but usually, I'm thinking about
things like parsing, where failure is extremely uncommon. With an ordered
map, there are certainly situations where failure is the more common case.
However, one important question is: What are you going to do if the element
is not present? One possibility is to add the missing value. But if you're
doing that, then you chose the wrong operation to begin with. Alter (known
as upsert in other ecosystems) outperforms the combination of lookup and
insert, or at least it should. That's all I can think of at the moment on
this topic. It would be interesting to see if someone is aware of a situation
where the current behavior is advantageous.
On Mon, Aug 16, 2021 at 8:13 AM Callan McGill
-- -Andrew Thaddeus Martin
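The point about alter versus lookup followed by insert can be illustrated with a small counter update. This is a minimal sketch using Data.IntMap.Strict from containers; the function names bumpTwoPass and bumpOnePass are invented for the example, and no performance figures are claimed here.

```haskell
import qualified Data.IntMap.Strict as IM

-- Two traversals: one to look the key up, one to insert the new value.
bumpTwoPass :: Int -> IM.IntMap Int -> IM.IntMap Int
bumpTwoPass k m = case IM.lookup k m of
  Nothing -> IM.insert k 1 m
  Just n  -> IM.insert k (n + 1) m

-- One traversal: alter sees the old value (if any) and decides the new one.
bumpOnePass :: Int -> IM.IntMap Int -> IM.IntMap Int
bumpOnePass = IM.alter (Just . maybe 1 (+ 1))
```

Both versions produce the same map; the second expresses "insert if missing, otherwise update" as a single operation, which is the case where a separate miss-heavy lookup is not needed at all.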

Reading the issue and the benchmarks, it seems like the right call here
might be to make the "throws error" version the one that defers the failure
check, and to use a similar algorithm for update (which must search to the
leaf in any case). So (!) uses the new algorithm, as do alterF and
family. lookup, member, and findWithDefault retain the old algorithm.

That does leave an opening for an expected-to-hit lookup that can still
miss. I'd suggest having an alternative only to lookup, rather than
repeating the whole suite of lookup-, member-, and find-like things.

It looks like the benchmarks are manipulating quite large IntMaps? I
wonder how things look for smaller, shallower maps where fail-late tends to
dominate. I use maps-with-miss all the time, but seldom with (say)
millions of elements.
-Jan-Willem Maessen
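One way to read the suggestion of adding an alternative only to lookup: the member- and find-like operations can be recovered from a single expected-to-hit lookup. The sketch below assumes a hypothetical function named lookupExpectingHit; no such function exists in containers, and here it simply delegates to IM.lookup as a stand-in so the example runs.

```haskell
import qualified Data.IntMap.Strict as IM
import Data.Maybe (fromMaybe, isJust)

-- Hypothetical name for the proposed variant; stand-in implementation only.
lookupExpectingHit :: Int -> IM.IntMap a -> Maybe a
lookupExpectingHit = IM.lookup

-- member-like behaviour derived from the single lookup variant.
memberExpectingHit :: Int -> IM.IntMap a -> Bool
memberExpectingHit k = isJust . lookupExpectingHit k

-- findWithDefault-like behaviour derived from the single lookup variant.
findWithDefaultExpectingHit :: a -> Int -> IM.IntMap a -> a
findWithDefaultExpectingHit d k = fromMaybe d . lookupExpectingHit k
```

Under this reading, users who know their workload mostly hits would call the one new function and compose the rest themselves, so the library would not need a parallel copy of every query.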