Proposal: Value strict versions of containers

As it stands, we have strict functions for certain operations, such as insertWith. But this current selection is somewhat arbitrary. For illustrative purposes, here are other functions that ought to be strictified in IntMap, because there is no easy way for a user to implement them. Data.IntMap insertLookupWithKey adjust adjustWithKey update updateWithKey updateLookupWithKey alter updateMinWithKey updateMaxWithKey updateMax updateMin map mapWithKey There are also some functions that could be implemented by the user, assuming that IntMap always forces the outer structure returned by the combining functions here, but are tricky to do so. mapAccum mapAccumWithKey mapAccumRWithKey mapAccumL mapMaybe mapMaybeWithKey mapEither mapEitherWithKey Since there are so many of these functions, I propose we introduce Lazy and Strict versions of IntMap. We should also carefully document operations on the IntMap that are strict on all values; for example, filter. Similar changes should be made to Data.Map. There is also a companion proposal: Regardless of whether or not we implement Strict and Lazy versions of IntMap, we should implement all of these strict versions of functions, because are not easy for end-users to implement. A counter-proposal is, if we split out Strict and Lazy IntMap, we should *remove* all strict functions from the lazy map, on grounds that selective strictness can be implemented with an extra layer of indirection on the Strict IntMap. Commentary: I favor separate value strict and value lazy maps, and also believe that insertWith' and friends should be deprecated from the lazy versions. I don't think reimplementing all of these functions to be strict is a good idea, on code duplication and API clutter reasons. Remember: at the cost of one layer of indirection, you can be selectively lazy using IntMap (Lazy a). Discussion period: 3 weeks Cheers, Edward

On Sat, 21 May 2011, Edward Z. Yang wrote:
As it stands, we have strict functions for certain operations, such as insertWith. But this current selection is somewhat arbitrary. For illustrative purposes, here are other functions that ought to be strictified in IntMap, because there is no easy way for a user to implement them.
Is it possible to solve this with the proposed Location interface? "Library proposal: add a Location interface for element-wise operations on Data.Map" http://www.haskell.org/pipermail/libraries/2011-January/015593.html

It's tough to say, because the ~ross link is broken. If a user-controlled value is forced, even if it's a container like Maybe or a tuple, then strict versions of the functions can be implemented user-side. Otherwise, it's orthogonal. Cheers, Edward Excerpts from Henning Thielemann's message of Sat May 21 13:37:10 -0400 2011:
On Sat, 21 May 2011, Edward Z. Yang wrote:
As it stands, we have strict functions for certain operations, such as insertWith. But this current selection is somewhat arbitrary. For illustrative purposes, here are other functions that ought to be strictified in IntMap, because there is no easy way for a user to implement them.
Is it possible to solve this with the proposed Location interface?
"Library proposal: add a Location interface for element-wise operations on Data.Map" http://www.haskell.org/pipermail/libraries/2011-January/015593.html

On Sat, May 21, 2011 at 7:19 PM, Edward Z. Yang
Since there are so many of these functions, I propose we introduce Lazy and Strict versions of IntMap. We should also carefully document operations on the IntMap that are strict on all values; for example, filter. Similar changes should be made to Data.Map.
This is exactly what I did in the unordered-containers package [1], where I have Data.HashMap.Strict and Data.HashMap.Lazy. 1. http://hackage.haskell.org/package/unordered-containers Johan

Hi,
Since there are so many of these functions, I propose we introduce Lazy and Strict versions of IntMap. We should also carefully document operations on the IntMap that are strict on all values; for example, filter. Similar changes should be made to Data.Map.
+1 To be more verbose, I would like the following: - have Data.IntMap.Lazy and Data.IntMap.Strict, with equal set of methods (so they can be used interchangeably) - for backward compatibility, have Data.IntMap as it is, but deprecate all strict methods ending with '. In the end, Data.IntMap should just reexport Data.IntMap.Lazy. - the types Data.IntMap.IntMap, Data.IntMap.Lazy.IntMap and Data.IntMap.Strict.IntMap should be equal -- so I can use strict methods on IntMap someone else created with lazy methods. That is simple in absence of type classes. Cheers, Milan
There is also a companion proposal: Regardless of whether or not we implement Strict and Lazy versions of IntMap, we should implement all of these strict versions of functions, because are not easy for end-users to implement. A counter-proposal is, if we split out Strict and Lazy IntMap, we should *remove* all strict functions from the lazy map, on grounds that selective strictness can be implemented with an extra layer of indirection on the Strict IntMap.
Commentary: I favor separate value strict and value lazy maps, and also believe that insertWith' and friends should be deprecated from the lazy versions. I don't think reimplementing all of these functions to be strict is a good idea, on code duplication and API clutter reasons. Remember: at the cost of one layer of indirection, you can be selectively lazy using IntMap (Lazy a).
Discussion period: 3 weeks
Cheers, Edward
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Excerpts from Milan Straka's message of Sun May 22 06:50:07 -0400 2011:
- the types Data.IntMap.IntMap, Data.IntMap.Lazy.IntMap and Data.IntMap.Strict.IntMap should be equal -- so I can use strict methods on IntMap someone else created with lazy methods. That is simple in absence of type classes.
Comment: This would mean you would get no static assurances against mixing up a strict IntMap with a lazy IntMap; for all intents and purposes, this is equivalent to implementing strict versions of all functions on top of a lazy data type. Cheers, Edward

Excerpts from Milan Straka's message of Sun May 22 06:50:07 -0400 2011:
- the types Data.IntMap.IntMap, Data.IntMap.Lazy.IntMap and Data.IntMap.Strict.IntMap should be equal -- so I can use strict methods on IntMap someone else created with lazy methods. That is simple in absence of type classes.
Comment: This would mean you would get no static assurances against mixing up a strict IntMap with a lazy IntMap; for all intents and purposes, this is equivalent to implementing strict versions of all functions on top of a lazy data type.
Exactly. Do you think we should have different types for static and lazy maps? It feels odd to me to have these distinguished on type level. To me it is more about how you want to work with the structure, not the structure itself. But maybe others feel differently. Cheers, Milan

On 22 May 2011, at 11:58, Milan Straka wrote:
Excerpts from Milan Straka's message of Sun May 22 06:50:07 -0400 2011:
- the types Data.IntMap.IntMap, Data.IntMap.Lazy.IntMap and Data.IntMap.Strict.IntMap should be equal -- so I can use strict methods on IntMap someone else created with lazy methods. That is simple in absence of type classes.
Comment: This would mean you would get no static assurances against mixing up a strict IntMap with a lazy IntMap; for all intents and purposes, this is equivalent to implementing strict versions of all functions on top of a lazy data type.
Exactly.
+1 for this proposal.

Unsure, ask again in a few days. Benefits for encoding at the data type level: 1. You'll never "forget" to add an appropriate `seq`, so the code looks essentially the same in the lazy and strict cases. 2. It's true that converting between lazy and strict maps can be something of a pain. However, I suspect that it is good discipline to know what strictness your map is (I may be wrong here.) Cheers, Edward

On Sun, May 22, 2011 at 1:04 PM, Edward Z. Yang
2. It's true that converting between lazy and strict maps can be something of a pain. However, I suspect that it is good discipline to know what strictness your map is (I may be wrong here.)
I think we wil need a type class for maps regardless, as it's sometime useful to treat e.g. a ordered and unordered map the same. Johan

I'm warming up to the same underlying type approach. Unfortunately, it's not fully general (for example, you couldn't use this to manage lazy/strict bytestrings) but I could see this spreading to other libraries (e.g. transformers) Edward Excerpts from Edward Z. Yang's message of Sun May 22 07:04:20 -0400 2011:
Unsure, ask again in a few days.
Benefits for encoding at the data type level:
1. You'll never "forget" to add an appropriate `seq`, so the code looks essentially the same in the lazy and strict cases.
2. It's true that converting between lazy and strict maps can be something of a pain. However, I suspect that it is good discipline to know what strictness your map is (I may be wrong here.)
Cheers, Edward

On Sun, May 22, 2011 at 12:58 PM, Milan Straka
Do you think we should have different types for static and lazy maps? It feels odd to me to have these distinguished on type level. To me it is more about how you want to work with the structure, not the structure itself.
I've thought about this issue as well. At the moment Data.HashMap.Strict just provides wrappers around the functions in Data.HashMap.Lazy. Two implications of this is that all instances are shared (so a HashMap from Data.HashMap.Strict still behaves like a lazy map when used through class instances) and that it's a bit trickier to get strictness right (while it falls out almost automatically if you make the type itself strict). I also think there's a small performance implication of having the type itself strict, in that GHC never will have to check if the value is evaluated. I need to investigate this some more though. Cheers, Johan

On 22/05/2011 11:58, Milan Straka wrote:
Excerpts from Milan Straka's message of Sun May 22 06:50:07 -0400 2011:
- the types Data.IntMap.IntMap, Data.IntMap.Lazy.IntMap and Data.IntMap.Strict.IntMap should be equal -- so I can use strict methods on IntMap someone else created with lazy methods. That is simple in absence of type classes.
Comment: This would mean you would get no static assurances against mixing up a strict IntMap with a lazy IntMap; for all intents and purposes, this is equivalent to implementing strict versions of all functions on top of a lazy data type.
Exactly.
Do you think we should have different types for static and lazy maps? It feels odd to me to have these distinguished on type level. To me it is more about how you want to work with the structure, not the structure itself.
But maybe others feel differently.
I like your proposal. If the two were different types, I'm fairly sure we would end up needing strict operations on the lazy IntMap occasionally. Making the two types equivalent adds some useful flexibility at the expense of some static checking - it's a tradeoff, and reasonable people could differ on which they prefer, but I personally would rather have one IntMap with strict and lazy APIs. Cheers, Simon

On Mon, May 23, 2011 at 11:47 AM, Simon Marlow
I personally would rather have one IntMap with strict and lazy APIs.
I am also in favour of the "same type approach" because it seems it can help to avoid duplicating the implementations. Would it be efficient to define a type Data.IntMap.Strict.IntMap with strict fields (also for values) and then define Data.IntMap.Lazy in terms of this implementation and a wrapper type? data Lazy a = Lazy { getLazy :: a } newtype IntMap a = LazyIntMap { getLazyIntMap :: Data.IntMap.Strict (Lazy a) } lookup :: Int -> IntMap a -> Maybe a lookup n = fmap getLazy . Data.IntMap.Strict.lookup n . getLazyIntMap ... Sebastian

On Mon, May 23, 2011 at 1:31 PM, Sebastian Fischer
I am also in favour of the "same type approach" because it seems it can help to avoid duplicating the implementations. Would it be efficient to define a type Data.IntMap.Strict.IntMap with strict fields (also for values) and then define Data.IntMap.Lazy in terms of this implementation and a wrapper type? data Lazy a = Lazy { getLazy :: a } newtype IntMap a = LazyIntMap { getLazyIntMap :: Data.IntMap.Strict (Lazy a) } lookup :: Int -> IntMap a -> Maybe a lookup n = fmap getLazy . Data.IntMap.Strict.lookup n . getLazyIntMap
Not really, it adds two words worth of overhead and one extra indirection. Johan

On Mon, May 23, 2011 at 2:14 PM, Johan Tibell
Would it be efficient to define a type Data.IntMap.Strict.IntMap with strict fields (also for values) and
On Mon, May 23, 2011 at 1:31 PM, Sebastian Fischer
wrote: then define Data.IntMap.Lazy in terms of this implementation and a wrapper type?
Not really, it adds two words worth of overhead and one extra indirection.
Because unboxing can not be used for the polymorphic value component in the strict structure and there is no way to tell GHC to unbox the specific instantiation `IntMap (Lazy a)`? I thought GHC would eliminate the indirection and extra words with -funbox-strict-fields. Sebastian

On 5/22/11 6:52 AM, Edward Z. Yang wrote:
Excerpts from Milan Straka's message of Sun May 22 06:50:07 -0400 2011:
- the types Data.IntMap.IntMap, Data.IntMap.Lazy.IntMap and Data.IntMap.Strict.IntMap should be equal -- so I can use strict methods on IntMap someone else created with lazy methods. That is simple in absence of type classes.
Comment: This would mean you would get no static assurances against mixing up a strict IntMap with a lazy IntMap; for all intents and purposes, this is equivalent to implementing strict versions of all functions on top of a lazy data type.
Not really for *all* intents and purposes. Separating them into different modules makes it very easy to make a top-level decision to import one or the other (or use qualified imports), which is a good deal cleaner than having primes scattered hither and yon. Personally I'm a fan of this approach, because it means we never have to convert between representations but we also clean up the API (as mentioned above). I'm less concerned about the possibility that someone might hand me a map containing thunks. Also, it allows for easily making partially strict IntMaps when you want to be lazy in general but know you must be strict for specific operations. Moreover, if you really want to make the type-level distinction, why not use newtypes? The conversion will be free in one direction, and the other direction can be done by just forcing all the elements in-place (instead of cloning the tree[1]). The only reason to actually duplicate the structure definition is if there's a whole lot of overhead for asking whether elements have already been evaluated. [1] And if you do end up having separate ADTs, then there should be explicit functions for casting from one to the other, so that you don't need to go via (fromAscList . toAscList) -- Live well, ~wren
participants (8)
-
Edward Z. Yang
-
Henning Thielemann
-
Johan Tibell
-
Malcolm Wallace
-
Milan Straka
-
Sebastian Fischer
-
Simon Marlow
-
wren ng thornton