filter by max length a list of lists with equivalent values

Hi, I have what’s proving to be a tricky little problem for me to solve. I’m trying to take a list of a list of Int’s and return the longest list for each list with elements of equivalent value. That’s not a great explanation, so let me try to give an example: If I have a list of lists grouped by element value. ghci> group [1,1,1,1,2,2,2,2,3,3,2,2,2,5,6,2,2,7] [[1,1,1,1],[2,2,2,2],[3,3],[2,2,2],[5],[6],[2,2],[7]] I would like to take the subset of the outer list containing only the longest of the inner lists for any particular element. So for this particular example, the desired output would be: [[1,1,1,1],[2,2,2,2],[3,3],[5],[6],[7]] The lists [2,2,2] and [2,2] would be removed because they’re shorter than [2,2,2,2]. Any thoughts on how to do this would be appreciated. Thanks and best regards, James

On 2014-02-11 15:54, James Toll wrote:
If I have a list of lists grouped by element value.
ghci> group [1,1,1,1,2,2,2,2,3,3,2,2,2,5,6,2,2,7] [[1,1,1,1],[2,2,2,2],[3,3],[2,2,2],[5],[6],[2,2],[7]]
I would like to take the subset of the outer list containing only the longest of the inner lists for any particular element.
So for this particular example, the desired output would be:
[[1,1,1,1],[2,2,2,2],[3,3],[5],[6],[7]] Any thoughts on how to do this would be appreciated.
After grouping the given list, so that you have [[1,1,1,1],[2,2,2,2],[3,3],[2,2,2],[5],[6],[2,2],[7]] Sort the list by comparing the first element of each list ghci> sortBy (comparing head) [[1,1,1,1],[2,2,2,2],[3,3],[2,2,2],[5],[6],[2,2],[7]] [[1,1,1,1],[2,2,2,2],[2,2,2],[2,2],[3,3],[5],[6],[7]] Then, group that again such that lists with equal elements get put into one list: ghci> groupBy ((==) `on` head) [[1,1,1,1],[2,2,2,2],[2,2,2],[2,2],[3,3],[5],[6],[7]] [[[1,1,1,1]],[[2,2,2,2],[2,2,2],[2,2]],[[3,3]],[[5]],[[6]],[[7]]] Finally, select the "maximum" of each inner list by comparing the length of the sub-sub-lists: ghci> map (maximumBy (comparing length)) [[[1,1,1,1]],[[2,2,2,2],[2,2,2],[2,2]],[[3,3]],[[5]],[[6]],[[7]]] You'll need "Data.List", "Data.Ord" and "Data.Function" for this. -- Frerich Raabe - raabe@froglogic.com www.froglogic.com - Multi-Platform GUI Testing

On Feb 11, 2014, at 9:08 AM, Frerich Raabe
After grouping the given list, so that you have
[[1,1,1,1],[2,2,2,2],[3,3],[2,2,2],[5],[6],[2,2],[7]]
Sort the list by comparing the first element of each list
ghci> sortBy (comparing head) [[1,1,1,1],[2,2,2,2],[3,3],[2,2,2],[5],[6],[2,2],[7]] [[1,1,1,1],[2,2,2,2],[2,2,2],[2,2],[3,3],[5],[6],[7]]
Then, group that again such that lists with equal elements get put into one list:
ghci> groupBy ((==) `on` head) [[1,1,1,1],[2,2,2,2],[2,2,2],[2,2],[3,3],[5],[6],[7]] [[[1,1,1,1]],[[2,2,2,2],[2,2,2],[2,2]],[[3,3]],[[5]],[[6]],[[7]]]
Finally, select the "maximum" of each inner list by comparing the length of the sub-sub-lists:
ghci> map (maximumBy (comparing length)) [[[1,1,1,1]],[[2,2,2,2],[2,2,2],[2,2]],[[3,3]],[[5]],[[6]],[[7]]]
Thank you for the nicely detailed explanation. In my frustration, I was fearing I was going to have to revert to a more imperative approach to solving this, and I was hoping someone would demonstrate a more functional solution. In the various iterations that I unsuccessfully tried, I think the major idea I was lacking was your second step, the groupBy ((==) `on` head). That’s really the part I just wasn’t able to come up with on my own. Again, thanks so much for the help. Best, James

Another way to do this would be to sort the sub lists according to what they contain (the list heads) and then their length (to resolve ties). You could then use nubBy with "(==) `on` head" to remove lists and only be left with the longest lists. Since nub[By] keeps only the first occurrence of each element when there are duplicates, "comparing length" is flipped - meaning that longer lists come first. This gives us the following function: nubBy ((==) `on` head) . sortBy (comparing head <> flip (comparing length)) . group Where (<>) is from Data.Monoid. An explanation of using monoids to build sorting combinators can be found in this reddit post [note that the author uses (++) for (<>)]: http://www.reddit.com/r/programming/comments/7cf4r/monoids_in_my_programming... -- Erlend Hamberg ehamberg@gmail.com

On Feb 11, 2014, at 10:15 AM, Erlend Hamberg wrote:
Another way to do this would be to sort the sub lists according to what they contain (the list heads) and then their length (to resolve ties). You could then use nubBy with “(==) `on` head” to remove lists and only be left with the longest lists. Since nub[By] keeps only the first occurrence of each element when there are duplicates, “comparing length” is flipped – meaning that longer lists come first. This gives us the following function:
nubBy ((==) `on` head) . sortBy (comparing head <> flip (comparing length)) . group
Where (<>) is from Data.Monoid. An explanation of using monoids to build sorting combinators can be found in this reddit post [note that the author uses (++) for (<>)]: http://www.reddit.com/r/programming/comments/7cf4r/monoids_in_my_programming...
Thank you for this example and the link. I have not explored the Data.Monoid module yet, but I will read through the reddit post. Best, James
participants (3)
-
Erlend Hamberg
-
Frerich Raabe
-
James Toll