Map folds in containers: why no worker/wrapper split?

Looking at the source code for containers (0.3.0.0), especially the foldWithKey functions in Data.Map and Data.IntMap, I see recursive definitions with non-strict accumulators, without the usual worker/wrapper split, such as:

    foldlWithKey :: (b -> k -> a -> b) -> b -> Map k a -> b
    foldlWithKey _ z Tip = z
    foldlWithKey f z (Bin _ kx x l r) = foldlWithKey f (f (foldlWithKey f z l) kx x) r

For comparison, here is base:GHC.List.lhs's definition of foldl:

    -- We write foldl as a non-recursive thing, so that it
    -- can be inlined, and then (often) strictness-analysed,
    -- and hence the classic space leak on foldl (+) 0 xs
    foldl :: (a -> b -> a) -> a -> [b] -> a
    foldl f z0 xs0 = lgo z0 xs0
      where lgo z []     = z
            lgo z (x:xs) = lgo (f z x) xs

Doesn't that mean that strictness in the folded Map/IntMap operations cannot be used by the strictness analyser? What am I missing?

Claus
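[Editorial note: for concreteness, here is a minimal sketch, not the containers source, of what the worker/wrapper style would look like for foldlWithKey. The module name and the simplified Map type (only the Tip/Bin constructors used above) are made up for illustration. The point is the same as for the list foldl quoted above: the small non-recursive wrapper can be inlined at a call site, after which GHC can strictness-analyse the local loop against the concrete combining function.]

    -- Hedged sketch only: a worker/wrapper rendering of foldlWithKey,
    -- assuming a simplified Map type with the constructors used above.
    module MapFoldSketch where

    data Map k a = Tip
                 | Bin !Int !k a !(Map k a) !(Map k a)

    foldlWithKey :: (b -> k -> a -> b) -> b -> Map k a -> b
    foldlWithKey f z0 m0 = go z0 m0     -- non-recursive wrapper, inlinable
      where
        go z Tip              = z
        go z (Bin _ kx x l r) = go (f (go z l) kx x) r
    {-# INLINE foldlWithKey #-}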

claus.reinke:
Looking at the source code for containers (0.3.0.0), especially the foldWithKey functions in Data.Map and Data.IntMap, I see recursive definitions with non-strict accumulators, without the usual worker/wrapper split, such as:
    foldlWithKey :: (b -> k -> a -> b) -> b -> Map k a -> b
    foldlWithKey _ z Tip = z
    foldlWithKey f z (Bin _ kx x l r) = foldlWithKey f (f (foldlWithKey f z l) kx x) r
For comparison, here is base:GHC.List.lhs's definition of foldl:
    -- We write foldl as a non-recursive thing, so that it
    -- can be inlined, and then (often) strictness-analysed,
    -- and hence the classic space leak on foldl (+) 0 xs
    foldl :: (a -> b -> a) -> a -> [b] -> a
    foldl f z0 xs0 = lgo z0 xs0
      where lgo z []     = z
            lgo z (x:xs) = lgo (f z x) xs
Doesn't that mean that strictness in the folded Map/IntMap operations cannot be used by the strictness analyser? What am I missing?
I bet that's the case. Data.Map is the way it is for historical reasons. It needs a dedicated performance maintainer to do detailed benchmarking and static analysis. -- Don

claus.reinke:
Looking at the source code for containers (0.3.0.0), especially the foldWithKey functions in Data.Map and Data.IntMap, I see recursive definitions with non-strict accumulators, without the usual worker/wrapper split, such as:
    foldlWithKey :: (b -> k -> a -> b) -> b -> Map k a -> b
    foldlWithKey _ z Tip = z
    foldlWithKey f z (Bin _ kx x l r) = foldlWithKey f (f (foldlWithKey f z l) kx x) r
For comparison, here is base:GHC.List.lhs's definition of foldl:
    -- We write foldl as a non-recursive thing, so that it
    -- can be inlined, and then (often) strictness-analysed,
    -- and hence the classic space leak on foldl (+) 0 xs
    foldl :: (a -> b -> a) -> a -> [b] -> a
    foldl f z0 xs0 = lgo z0 xs0
      where lgo z []     = z
            lgo z (x:xs) = lgo (f z x) xs
Doesn't that mean that strictness in the folded Map/IntMap operations cannot be used by the strictness analyser? What am I missing?
I bet that's the case.
Data.Map is the way it is for historical reasons. It needs a dedicated performance maintainer to do detailed benchmarking and static analysis.
I did quite a lot of benchmarking, strictness analysis and performance improvements to the containers package recently, see the draft of my paper http://fox.ucw.cz/papers/containers/containers.pdf. Next week I will start finalizing the patches and submit them. The 'not generating a worker/wrapper' issue is one of several on my radar, so this should be resolved. Cheers, Milan

fox:
Data.Map is the way it is for historical reasons. It needs a dedicated performance maintainer to do detailed benchmarking and static analysis.
I did quite a lot of benchmarking, strictness analysis and performance improvements to the containers package recently, see the draft of my paper http://fox.ucw.cz/papers/containers/containers.pdf.
Next week I will start finalizing the patches and submit them. The 'not generating a worker/wrapper' issue is one of several on my radar, so this should be resolved.
I find the abstract very exciting, and would love to see a darcs repo. Don't work in isolation! -- Don

fox:
Data.Map is the way it is for historical reasons. It needs a dedicated performance maintainer to do detailed benchmarking and static analysis.
I did quite a lot of benchmarking, strictness analysis and performance improvements to the containers package recently, see the draft of my paper http://fox.ucw.cz/papers/containers/containers.pdf.
Next week I will start finalizing the patches and submit them. The 'not generating a worker/wrapper' issue is one of several on my radar, so this should be resolved.
I find the abstract very exciting, and would love to see a darcs repo.
Don't work in isolation!
I will set it up at the beginning of next week; I am occupied elsewhere till then. Cheers, Milan

I did quite a lot of benchmarking, strictness analysis and performance improvements to the containers package recently, see the draft of my paper http://fox.ucw.cz/papers/containers/containers.pdf.
Next week I will start finalizing the patches and submit them. The 'not generating a worker/wrapper' issue is one of several on my radar, so this should be resolved.
That is good to hear. It is great that someone is doing the work needed to pin down, measure, document, and perhaps even cure some of the performance issues in containers. If nothing else, a published summary of the performance issues that have popped up over the years would be a start.

Btw, the first thing I'd look for in a paper on containers performance, and the one issue that tends to trip up most users, is strictness. So I was surprised not to see a discussion of this in your draft (at least not in a quick browse). Could it be that your work has focussed on the Sets rather than the Maps in containers?

Lots of performance problems with containers are about pragmatics, not representation, eg, one knows that Maps are strict in keys but available operations happen to be defined in such a way that one cannot easily ensure other desirable strictness properties.

For the worker/wrapper split, there are non-strict definitions of higher-order functions, which the compiler could make strict by using strictness information about the calling context if the definitions were written differently. For other operations, turning the non-strict definitions into strict versions (when strict versions are more appropriate; ideally, both variants should be well supported and documented) is often difficult for users of containers.

The issues are also non-obvious, which is perhaps worse, because Map/IntMap is often used to attempt to speed up a loop accumulator, with the result that thunks pile up behind the scenes and performance goes down until large problem sizes cannot be handled. At this point, users are no longer aware that they built in the performance problem right at the start, by using a supposedly efficient data structure in an unfortunate way - they think it is about Haskell and large datasets.

Some examples that I happen to recall (there are more in the list archives of cafe and libraries, under various keywords, as the original poster doesn't realize that the root cause is in containers; there are also some related tickets in the GHC trac):

- IntMap and Map expose different APIs wrt strictness:
  http://www.haskell.org/pipermail/libraries/2009-March/011399.html
  http://www.mail-archive.com/haskell-cafe@haskell.org/msg48925.html

- to work around the non-strict Map fold, users have to resort to nested folds (non-strict fold to convert to list, then strict foldl' over list) and other tricks:
  http://www.haskell.org/pipermail/haskell-cafe/2010-June/079098.html
  Similarly non-trivial thinking is required to work around other no-control-over-strictness issues in the APIs, bypassing the obvious functions and using non-obvious ones based on knowledge about their implementation.

- strictness is used inconsistently: http://hackage.haskell.org/trac/ghc/ticket/4109

- strictness is not documented: search for "strict" in the Data.Map/IntMap documentation... This is really bad: the only way for users to become aware of strictness problems is by running into them, and the only way to fix strictness-related performance issues is by referring to the implementation (in theory, an implementation change could invalidate lots of performance fixes in production code).

It seems that strictness documentation and control would make for a large section in a paper on containers performance.

Claus
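[Editorial note: to make the loop-accumulator failure mode concrete, here is a small hedged sketch written against the containers-0.3 API discussed in this thread. The word-count example and the names countLazy/countStrict/sumValues are made up for illustration, not taken from the thread.]

    -- Hedged sketch of the "thunks pile up behind the scenes" problem,
    -- written against the containers-0.3 Data.Map API discussed above.
    import qualified Data.Map as M
    import Data.List (foldl')

    -- Lazy accumulation: each occurrence adds another unevaluated (+)
    -- thunk to the stored value, because insertWith never forces it.
    countLazy :: [String] -> M.Map String Int
    countLazy = foldl' (\m w -> M.insertWith (+) w 1 m) M.empty

    -- Strict accumulation: Data.Map's insertWith' forces the combined
    -- value before storing it.  (As noted above, Data.IntMap offers no
    -- such primed variant in this version, so the fix does not transfer.)
    countStrict :: [String] -> M.Map String Int
    countStrict = foldl' (\m w -> M.insertWith' (+) w 1 m) M.empty

    -- The nested-fold workaround mentioned above: the Map folds are not
    -- strict in the accumulator, so convert to a list first and then use
    -- a strict list fold.
    sumValues :: M.Map String Int -> Int
    sumValues = foldl' (+) 0 . M.elems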

Hi, I agree the strictness is undocumented and inconsistent (I was bitten by it several times). I did not mention it much in the paper; maybe I should have. I want to work on the patches now with the knowledge of the benchmark results, and the strictness info is another thing I want to deal with. BTW, the paper speaks of sets only, but mentions that it applies to maps too. The benchmarks contain both Set/Map and IntSet/IntMap, and the initial patches were also for all four of these containers. Cheers, Milan
I did quite a lot of benchmarking, strictness analysis and performance improvements to the containers package recently, see the draft of my paper http://fox.ucw.cz/papers/containers/containers.pdf.
Next week I will start finalizing the patches and submit them. The 'not generating a worker/wrapper' issue is one of several on my radar, so this should be resolved.
That is good to hear. It is great that someone is doing the work needed to pin down, measure, document, and perhaps even cure some of the performance issues in containers. If nothing else, a published summary of the performance issues that have popped up over the years would be a start.
Btw, the first thing I'd look for in a paper on containers performance, and the one issue that tends to trip up most users, is strictness. So I was surprised not to see a discussion of this in your draft (at least not in a quick browse). Could it be that your work has focussed on the Sets rather than the Maps in containers?
Lots of performance problems with containers are about pragmatics, not representation, eg, one knows that Maps are strict in keys but available operations happen to be defined in such a way that one cannot easily ensure other desirable strictness properties.
For the worker/wrapper split, there are non-strict definitions of higher-order functions, which the compiler could make strict by using strictness information about the calling context if the definitions were written differently.
For other operations, turning the non-strict definitions into strict versions (when strict versions are more appropriate; ideally, both variants should be well supported and documented) is often difficult for users of containers.
The issues are also non-obvious, which is perhaps worse, because Map/IntMap is often used to attempt to speed up a loop accumulator, with the result that thunks pile up behind the scenes and performance goes down until large problem sizes cannot be handled. At this point, users are no longer aware that they built in the performance problem right at the start, by using a supposedly efficient data structure in an unfortunate way - they think it is about Haskell and large datasets.
Some examples that I happen to recall (there are more in the list archives of cafe and libraries, under various keywords, as the original poster doesn't realize that the root cause is in containers; there are also some related tickets in the GHC trac):
- IntMap and Map expose different APIs wrt strictness http://www.haskell.org/pipermail/libraries/2009-March/011399.html http://www.mail-archive.com/haskell-cafe@haskell.org/msg48925.html
- to work around the non-strict Map fold, users have to resort to nested folds (non-strict fold to convert to list, then strict foldl' over list) and other tricks: http://www.haskell.org/pipermail/haskell-cafe/2010-June/079098.html
Similarly non-trivial thinking is required to work around other no-control-over-strictness issues in the APIs, bypassing the obvious functions and using non-obvious ones based on knowledge about their implementation.
- strictness is used inconsistently: http://hackage.haskell.org/trac/ghc/ticket/4109
- strictness is not documented: search for "strict" in the Data.Map/IntMap documentation...
This is really bad: the only way for users to become aware of strictness problems is by running into them, and the only way to fix strictness-related performance issues is by referring to the implementation (in theory, an implementation change could invalidate lots of performance fixes in production code).
It seems that strictness documentation and control would make for a large section in a paper on containers performance.
Claus

Claus Reinke wrote:
Lots of performance problems with containers are about pragmatics, not representation, eg, one knows that Maps are strict in keys but available operations happen to be defined in such a way that one cannot easily ensure other desirable strictness properties.
Not to put any extra burden on the messenger, but do you have a summary of the particular strictnesses that are often desired, or the common problems encountered in practice? In particular I'm thinking in the context of the bytestring-trie package and how it can be improved. The initial API sought to mirror the containers version of Map/IntMap, because they seemed rather stable at the time. However, there've been a number of changes and proposals lately, so I want to be sure I don't miss anything. -- Live well, ~wren

Lots of performance problems with containers are about pragmatics, not representation, eg, one knows that Maps are strict in keys but available operations happen to be defined in such a way that one cannot easily ensure other desirable strictness properties.
Not to put any extra burden on the messenger, but do you have a summary of the particular strictnesses that are often desired, or the common problems encountered in practice?
Can't provide an exhaustive summary, but perhaps I can illustrate the general overview already given in a bit more detail.

The background for these problems is usually the same, namely an application that needs a Map that is strict in its values, not just in its keys. That is not the default usage, and it is not well supported. Let us say some working code is translated from a strict functional language, and the translator just wants to make the code do the same thing in the non-strict language Haskell (without changing the algorithm to take advantage of non-strictness). That is sadly typical of the "my Haskell code is slow and runs out of memory" threads (sometimes strictness is just not thought about, though). So we need to ensure an invariant that would be the default in a strict language: the contents of Map entries are always evaluated.

With that in mind, look at the Data.Map API again:

1. Without looking at the source, the existence of insertWith(Key)' is the only hint that the other operations are not strict in the values (and insertWith' is not in IntMap, only in Map).

2. You can probably guess that Maps are strict in keys and representation, from the description as balanced trees, but to be sure, you'll be looking at the source.

3. Of the operations, we are not concerned with lookups (we need to evaluate "values" before putting them into the Map) nor with deletion; so what about construction/modification/combination (note that inspection of the source is necessary to make any progress)?

   - singleton/insert: easy enough
   - insertWith(Key): oops! we can't use this; and insertWith(Key)' is only provided for Map, not IntMap (so we can't switch between the two)
   - insertLookupWithKey: oops
   - adjust*/update*: these go back to updateWithKey, which looks useable
   - alter: looks useable
   - union(s): look ok
   - union(s)With(Key): oops
   - difference*: ok
   - intersection: ok
   - intersectionWith(Key): oops
   - map(WithKey): oops
   - mapAccum(R)(WithKey): oops
   - mapKeys: ok
   - mapKeysWith: oops
   - mapKeysMonotonic: ok
   - ..

Even if you do this analysis (many users won't, and I don't guarantee I've got it right), you're left with a substantially smaller useable API. "Useable/ok" here means that there is a way to modify the API functions to enforce the strictness invariant we're after (all values evaluated); "oops" means that even if we wanted to, we could not use the function because it does not give us the means to add strictness.

Generally, higher-order operations that apply functions but simply put the result in a non-strict field of the Map representation are suspect here; things like adjust/alter are more helpful because we can tie the Just/Nothing decision to prior evaluation; things like folds suffer from the usual issues (non-strict in accumulator, not written in a way that would help the strictness analyser to produce a specialised version); ..

Chances are (as various "my Haskell code is slow/overflows" posts have illustrated over the years) that users will get it wrong. And those who try to get it right won't be happy with the API support for their use case (the useable API is so reduced that it is tempting to fork the code instead).

So most of it boils down to just one common problem, with many different symptoms. The solution has to be designed into the API from the start, and it needs to be consistent and documented. One starts with the two main use cases (Maps strict or non-strict in values) and investigates how that affects all the API operations.
Implementing everything twice would be code duplication, but better than what we have now (suggestions for limiting code duplication exist as well: specialize the representation or the operations for the two use cases, using specialization, inlining, strictness analysis, type families, or ..). Implementers are turned away by the code duplication of the simple solution, and nobody has done the measurements to compare the more elegant solutions, as far as I know.

The default use case (Maps not strict in values) works, the other use case (Maps strict in values) is understood, but not well supported (making sure that availability of a key can be made dependent on the evaluation of its value would be sufficient, but all too many operations fall short of supporting this). Which is why I attributed the problems to pragmatics.

Does this help? Claus
In particular I'm thinking in the context of the bytestring-trie package and how it can be improved. The initial API sought to mirror the containers version of Map/IntMap, because they seemed rather stable at the time. However, there've been a number of changes and proposals lately, so I want to be sure I don't miss anything.
API mirroring seems a good idea, to make it easier to choose the representation most suited for an application. It should work both ways: if you find the need to provide slightly different operations to support strict mode with your package, suggest that containers should follow.
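[Editorial note: to illustrate the earlier point about tying the availability of a key to the evaluation of its value, here is a hedged sketch of the kind of wrappers a user ends up writing on top of the "useable" subset of the API. The names insertStrict, adjustStrict and mapStrict are made up for illustration; none of this is containers API.]

    -- Hedged sketch: enforcing the "all values evaluated" invariant with
    -- only the operations marked useable in the analysis above.
    import qualified Data.Map as M

    -- Force the value before it is handed to insert, so an entry only
    -- becomes reachable once its value is in WHNF.
    insertStrict :: Ord k => k -> a -> M.Map k a -> M.Map k a
    insertStrict k v m = v `seq` M.insert k v m

    -- An adjust that forces the new value; built on alter, so producing
    -- the Just constructor is tied to evaluating the result of f.
    adjustStrict :: Ord k => (a -> a) -> k -> M.Map k a -> M.Map k a
    adjustStrict f = M.alter g
      where
        g Nothing  = Nothing
        g (Just v) = let v' = f v in v' `seq` Just v'

    -- There is no strict map(WithKey) in this API, so the only option is
    -- to rebuild the map, forcing each value on the way in (at the cost
    -- of an extra O(n log n) pass).
    mapStrict :: Ord k => (a -> b) -> M.Map k a -> M.Map k b
    mapStrict f = M.foldWithKey (\k v acc -> insertStrict k (f v) acc) M.empty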
participants (4)
- Claus Reinke
- Don Stewart
- Milan Straka
- wren ng thornton