
On Thu, Jun 24, 2010 at 12:14 PM, Milan Straka
Hi,
I am working on improving containers performance and have written a draft paper about it http://fox.ucw.cz/papers/containers/containers.pdf
I wanted to start a new thread in response to strictness issues brought up by Claus Reinke.
- IntMap and Map expose different APIs wrt strictness http://www.haskell.org/pipermail/libraries/2009-March/011399.html http://www.mail-archive.com/haskell-cafe@haskell.org/msg48925.html
- to work around the non-strict Map fold, users have to resort to nested folds (non-strict fold to convert to list, then strict foldl' over list) and other tricks: http://www.haskell.org/pipermail/haskell-cafe/2010-June/079098.html
similarly non-trivial thinking is required to work around other no-control-over-strictness issues in the APIs, bypassing the obvious functions and using non-obvious ones based on knowledge about their implementation
- strictness is used inconsistently: http://hackage.haskell.org/trac/ghc/ticket/4109
- strictness is not documented: !!! search for "strict" in the Data.Map/IntMap documentation..
this is really bad - the only way for users to become aware of strictness problems is by running into them, and the only way to fix strictness-related performance issues is by referring to the implementation (in theory, an implementation change could invalidate lots of performance fixes in production code)
I need some opinion:
- Do you think methods like insert/lookup/delete/etc should be strict in key/element?
Yes. This allows us to use associated data types to unpack keys and values directly in the constructor saving both time (as accessing keys require fewer indirections) and space (as there's less overhead per key/value pair).
As Claus wrote, right now it is undocumented and inconsistent (both in the methods of one container and also in the same methods of different container).
The problem is the following: a lookup on an empty container does not really have to evaluate the key. Should we honour it or should we sacrifice it to be faster and consistent with other methods (eg., insert has to be strict in the key).
- The Set/Map, IntSet/IntMap do not currently have a strict fold. Do you think it should be added to the API?
Yes. I believe the recent post on storing lots amount of Twitter data in a Map ran into problems due to lack of a strict fold. I think it's important to consider all the container types at once and find a consistent set of functions and strictness guarantees so that we don't introduce more inconsistencies in their APIs. Johan