
Mitchell, Neil wrote:
Hi Twan,
Almost all uses of sortBy in user code use 'comparing', 'on' or a similar construction [1]. I think we should add a function that makes this common behavior more convenient:
sortOn :: Ord b => (a -> b) -> [a] -> [a]
For consistency we should also add *On for the other *By functions in Data.List:
nubOn :: Eq b => (a -> b) -> [a] -> [a] deleteOn :: Eq b => (a -> b) -> a -> [a] -> [a] deleteFirstsOn :: Eq b => (a -> b) -> [a] -> [a] -> [a] unionOn :: Eq b => (a -> b) -> [a] -> [a] -> [a] intersectOn :: Eq b => (a -> b) -> [a] -> [a] -> [a] groupOn :: Eq b => (a -> b) -> [a] -> [[a]] sortOn :: Ord b => (a -> b) -> [a] -> [a] insertOn :: Ord b => (a -> b) -> a -> [a] -> [a] maximumOn :: Ord b => (a -> b) -> [a] -> a minimumOn :: Ord b => (a -> b) -> [a] -> a (nubSortOn :: Ord b => (a -> b) -> [a] -> [a]) -- see #2629
All good, apart from deleteFirstsOn. I honestly don't know what that function does, I was thinking possibly something like the first component of tuples? But no real idea.
deleteFirstBy is "\\By", so it should perhaps be called differenceBy. But this is just the name that the List module uses. The reason for adding deleteFirstOn is simply consistency, since we also have a By version.
In Hoogle I use sortOn to be the non-cacheing version, and sortWith to be the cacheing version. However, I think we can just ignore the non-cacheing version - if your extraction function is expensive you probably want to think a bit anyway.
If the extraction function is slow, like say "sortOn length", caching is a clear win. On the other hand, there is some constant cost involved. Here are some benchmarks of sorting words in a dictionary: no cache cache sortOn f 3.203125 0.625000 sortOn id 0.156250 0.187500 where f = length . nub . sort, i.e. an expensive function. I'm not sure if the factor 1.2 overhead of the caching version is important enough to warrant exposing two different implementations. Twan