
Thank you all for the helpful responses! The solution from Daniel Fischer
worked like a charm, the solution from Aai kinda worked, but I can't use
because I don't want to modify the order of the elements in the list...
And yes, my "criterion" was a bit ill-defined, so the very clever function
from Ozgur satisfied it :P The real goal is to maximize the number of groups
though.
I'll test the solutions more hardly today and try to make the functions from
Daniel even more clearer and elegant. I post the results back here!
Thanks a lot!
João Paulo Pizani Flor
joaopizani@gmail.com
Federal University of Santa Catarina
2010/11/23 Daniel Fischer
On Tuesday 23 November 2010 18:23:28, João Paulo Pizani Flor wrote:
Hello dear Haskellers!
I've been a user and big fan of Haskell for a while, but only now I'm writing my first "big" project in Haskell (some thousands of lines of code perhaps). It's an interpreter for a programming language, the source code is music! Yes, sheet music! :D
OK, so my specific problem goes like this: I have a list of tuples
:t myList
[ (Float, Integer) ]
And I want to "group" this list into a nested list groupAtoms :: [ (Float,Integer) ] -> [ [(Float,Integer)] ]
Of course, I want that the concatenation of the groups yield me the original list, i.e, (foldl (++) [] . groupAtoms == id),
Here, foldr is the better choice. (++) can start delivering its result before inspecting its second argument, hence
foldr (++) [] (list1 : list2 : list3 : ...) = list1 ++ (foldr (++) [] (list2 : list3 : ...)) = list1 ++ (list2 ++ (foldr (++) [] (list3 : ...)))
can be consumed as it is constructed, so it can run in constant space, and it can stop early if not the entire result is needed. And it can deal with infinite lists.
The prelude function concat does this.
foldl on the other hand can't deliver anything before the whole input list has been traversed, in particular it doesn't terminate on infinite input, it needs at least O(length result) space and it can't stop early. Additionally, foldl (++) [] constructs left-nested (++)-applications, which are bad for performance.
and the criterion that defines a group is that: "The sum of the first elements in the tuples comprising the list must be greater than or equal to 1.0". That is, given a list of tuples, the boolean predicate deciding whether this list is a PROPER group (True) or TOO SMALL (False) is: \g -> sum (map fst g) >= 1.0
Although the criterion is very clear, I've tried hard until now and couldn't come up with a function for producing the nested list based on this grouping criterion.
Split the task into two parts.
1. collect the first group and the remainder of the list as a pair of lists 2. cons the first group to what recursion on the remainder delivers
Very much like the definition of `lines'.
For your use-case, the condition when a group is complete doesn't only depend on the list element but also on the previous, concretely the running sum of the first components. For a general such function, you need - the stopping condition - an accumulation parameter - an update function for the accumulating parameter as arguments. For example (stupid names, sorry)
breakAcc :: (b -> Bool) -> (a -> b -> b) -> b -> [a] -> ([a],[a]) breakAcc _ _ _ [] = ([],[]) breakAcc cond comb acc xxs@(x:xs) | cond acc = ([],xxs) | otherwise = (x:ys,zs) where (ys, zs) = breakAcc cond comb (comb x acc) xs
solves part 1, then
groupsAcc :: (b -> Bool) -> (a -> b -> b) -> b -> [a] -> [[a]] groupsAcc _ _ _ [] = [] groupsAcc cond comb acc xs = let (g,tl) = breakAcc cond comb acc xs in g : groupsAcc cond comb acc tl
gives you part 2, and your particular function is
groupAtoms :: [(Float,Integer)] -> [[(Float,Integer)]] groupAtoms = groupsAcc (>= 1.0) ((+) . fst) 0
Of course the last group need not be proper. If there are long groups, the above can cause a space leak (as lines had in GHC < 7), that can be avoided by sacrificing a little non-strictness and replacing the let (g,tl) = ... in with a
case ... of (g,tl) -> g : groupsAcc ...
or, if you need the non-strictness, with a slightly convoluted method as used for lines in GHC 7.
I am sure that the Haskell Hierarchical Libraries have something to help me, but only now I see I'm still a big noob :P
Could someone please help me writing this function?
My best regards from Brazil,
João Paulo Pizani Flor joaopizani@gmail.com Federal University of Santa Catarina