
This might pay off as well, but I am leery that its a rather tricky
balancing act and will take a lot of profiling to find the right balance in
practical performance vs. asymptotic bounds.
Should we split the notion of improving the performance of fromList into a
separate project/proposal
that simply exposes synergies with this one?
-Edward
On Wed, Aug 29, 2012 at 8:02 PM, Johan Tibell
On Wed, Aug 29, 2012 at 4:54 PM, Edward Kmett
wrote: We should be able to fuse this "try to construct linearly, but fall back on N-log-N" version of fromList in one pass even for normal uses of fromList.
e.g. assume that you are constructing a sorted tree until you find a key out of order, then take the tree you've built so far and union it appropriately with the slower constructed fromList of the remainder. That way you don't have to retain the storage for both the list and the map, and we only force the list once.
Could you instead of falling back to the slower fromList just keep building new trees and then union them all in the end. Real world data is often is semi-sorted order. I guess we would need to detect some worst case scenarios (i.e. sorted in reverse order) and fall back to the slower fromList in those cases.
-- Johan