Re: [Haskell-cafe] Data.Set optimisations [was: library for set of heterogeneous values?]

On Mon, Aug 15, 2016 at 1:15 AM, Anthony Clayden wrote: "indirect jumps are particularly expensive on a modern processor architecture, because they fox the branch prediction hardware, leading to a stall of 10 or more cycles depending on the length of the pipeline." [Introduction. The paper goes on to estimate a penalty around 7 to 9 % in general GHC code.]
The cost model here makes sense if ...
Thanks Wren. Is that 7 to 9 % figure unduly alarmist? (It is consistent with Milan's benchmarking.) For example would strictifying or inlining have much greater benefits than worrying about constructors? And it only affects processor-intensive applications(?)
... Regardless of how GHC does things, those basics of branch prediction and primop ordering will remain the same. ...
In practice GHC does the naive thing of ordering branches to be in order of declaration in the data type definition. This is good in as much as it doesn't matter how users textually order their case analyses, since GHC will reorder them. ...
...the order of data constructors in a data type's definition is also used for other magic, like deriving Ord. ...
Hmm, I'd hardly describe derivng Ord as "magic". It's what the language report says [Chapter 11].
Really, we'd like to make GHC smarter here about how it orders [case branchng code]; but that gets complicated very quickly.
I can see that. I gave some possible rules of thumb in an earlier message. (In particular, recursive constructors are more likely.)
... noone other than me seems to care very much. ...
I guess not many know. We kinda expect GHC will be much smarter than we could achieve by hand-crafting the object code. Those who could write assembler code must be dying out ;-) Judging by the 2006 paper, part of the difficulty seems to be that the C code GHC generates is none too idiomatic for what a C optimiser is tuned for. AntC
participants (1)
-
Anthony Clayden