
Ok, after mulling over the issues that Will Ness has brought up in the last few days [1], I think I have a partial explanation for the apparent tension between Will's observations and Heinrich Apfelmus's Implicit Heaps article [2], which both concern the implementation of mergeAll [3]. The merge' function takes two ordered lists, with the head of the first list less than the head of the second, and merges their contents: merge' [] ys = ys merge' (x:xs) ys = x : merge xs ys The nice thing about this function is we can merge an infinite number of lists by folding right, if assume that the heads of those lists are appropriately ordered. This appears in Richard Bird's code at the end of Melissa O'Neill's "Genuine Sieve of Eratosthenes", though undoubtedly this observation has been independently made by many people. Now, in an ordinary sense, merge' *is* an associative operator, in that given three fully defined ordered lists with ordered heads, merge' xs (merge' ys zs) == merge' (merge' xs ys) zs. This allows us to merge an infinite number of lists using an arbitrary tree of merge' operations, without ever worrying that we will return an incorrect result. (However, we might get stuck in a non-productive loop, for example if we fold left over an infinite list of ordered lists) However, Heinrich's article uses a stronger sense of associativity which includes laziness properties and thus captures some operational characteristics. In this sense, merge' *is not* associative. To remedy this, Heinrich uses VIPs to strengthen the associativity property of merge'. This raises the question, is there some combination of the shape of the merge' tree and some input for which using VIPs dramatically changes the efficiency of a mergeAll operation? I suspect the answer is yes, though I don't know for sure at this point in time. However, I do tacitly believe that the current tree that mergeAll uses doesn't exhibit this property for any input, and so I have simplified the implementations of mergeAll and unionAll in the latest version of data-ordlist-0.4.4 by avoiding the use of VIPs. This has the nice side benefit of modestly improving performance when the elements of the result are highly biased towards the first list. Best, Leon [1] http://permalink.gmane.org/gmane.comp.lang.haskell.cafe/84666 [2] http://apfelmus.nfshost.com/articles/implicit-heaps.html [3] http://hackage.haskell.org/packages/archive/data-ordlist/0.4.4/doc/html/Data...