Data.List.sortBy creates unnecessary thunks

Data.List.sortBy uses a DList-like representation to efficiently chunk a list, but the final conversion from DList to List is not strict. This leaves unnecessary suspensions while sorting. This was first noticed in https://github.com/ghc-proposals/ghc-proposals/pull/50, where sorting a random list with a naive merge sort is faster than the one in base. Proposing the following change, which should improve performance and space usage. ```diff diff --git a/libraries/base/Data/OldList.hs b/libraries/base/Data/OldList.hs index c3618c42a9..a68c4f42af 100644 --- a/libraries/base/Data/OldList.hs +++ b/libraries/base/Data/OldList.hs @@ -851,7 +851,7 @@ sortBy cmp = mergeAll . sequences ascending a as (b:bs) | a `cmp` b /= GT = ascending b (\ys -> as (a:ys)) bs - ascending a as bs = as [a]: sequences bs + ascending a as bs = *as [a] `seq`* as [a] : sequences bs mergeAll [x] = x mergeAll xs = mergeAll (mergePairs xs) ``` And, as a side note, can we please add more type signatures and comments to the sort/sortBy function? Cheers, Sid
participants (1)
-
Siddhanathan Shanmugam