
Dennis Raddle wrote:
I have a problem from music. I have a list of notes sorted by begin time. I want to sort them by end time.
Notes that are sorted by begin time are usually close to sorted by end time, because notes tend to cluster within a small range of durations.
What algorithms are available to me (hopefully in a library) that are good with this kind of thing?
Sorry, I don't know of any library that handles this special case. I'm also not too familiar with the literature, so I don't know of anyone who writes about it. (It's probably worthwhile to have a look in Knuth.) Anyway, it seems like this is not too hard. Can you guarantee for some value of m that for each note N, only the first m notes following N might end earlier than N? If so, then the following simple algorithm is linear and runs in constant space. You could then use: sortAlmostBy m (comparing endTime) sortAlmostBy :: Int -> (a -> a -> Ordering) -> [a] -> [a] sortAlmostBy m cmp = mergeAABy cmp . map (sortBy cmp) . chunksOf m chunksOf :: Int -> [a] -> [[a]] chunksOf m = map (take m) . takeWhile (not . null) . iterate (drop m) -- Merge a list of lists with two assumptions: -- (1) Each list is sorted. -- (2) The sequence of lists is "almost ascending" - -- no element of a list is greater than any element of -- any list following it, except possibly the list immediately -- following it. mergeAABy :: (a -> a -> Ordering) -> [[a]] -> [a] mergeAABy cmp ((x:xs):xss) = y : mergeAABy cmp yss where (y, yss) = pickLeast x xs xss pickLeast p ps pss@((q:qs):qss) = case cmp p q of GT -> (q, (p : ps) : qs : qss) _ -> (p, ps : pss) pickLeast p ps (_:pss) = pickLeast p ps pss pickLeast p ps _ = (p, [ps]) mergeAABy cmp (_:xss) = mergeAABy cmp xss mergeAABy _ _ = [] You might be able to do a little better than this. Here is one way: GHC would probably optimize this better if you make pickLeast non-recursive by arranging for lists that become empty to be eliminated from the calculation earlier. But as David points out, this is probably good enough for your application, even if you are processing a full orchestra score of a Mahler symphony that lasts for hours. (Whereas just using sortBy for that case might be slow.) Regards, Yitz