Thank you all for the helpful responses! The solution from Daniel Fischer worked like a charm, the solution from Aai kinda worked, but I can't use because I don't want to modify the order of the elements in the list...

And yes, my "criterion" was a bit ill-defined, so the very clever function from Ozgur satisfied it :P The real goal is to maximize the number of groups though.

I'll test the solutions more hardly today and try to make the functions from Daniel even more clearer and elegant. I post the results back here!


Thanks a lot!

João Paulo Pizani Flor
joaopizani@gmail.com
Federal University of Santa Catarina


2010/11/23 Daniel Fischer <daniel.is.fischer@web.de>
On Tuesday 23 November 2010 18:23:28, João Paulo Pizani Flor wrote:
> Hello dear Haskellers!
>
> I've been a user and big fan of Haskell for a while, but only now I'm
> writing my first "big" project in Haskell (some thousands of lines of
> code perhaps). It's an interpreter for a programming language, the
> source code is music! Yes, sheet music! :D
>
> OK, so my specific problem goes like this: I have a list of tuples
>
> :t  myList
>
> [ (Float, Integer) ]
>
> And I want to "group" this list into a nested list
> groupAtoms :: [ (Float,Integer) ]  ->  [ [(Float,Integer)] ]
>
> Of course, I want that the concatenation of the groups yield me the
> original list, i.e,  (foldl (++) [] . groupAtoms == id),

Here, foldr is the better choice. (++) can start delivering its result
before inspecting its second argument, hence

foldr (++) [] (list1 : list2 : list3 : ...)
 = list1 ++ (foldr (++) [] (list2 : list3 : ...))
 = list1 ++ (list2 ++ (foldr (++) [] (list3 : ...)))

can be consumed as it is constructed, so it can run in constant space, and
it can stop early if not the entire result is needed. And it can deal with
infinite lists.

The prelude function concat does this.

foldl on the other hand can't deliver anything before the whole input list
has been traversed, in particular it doesn't terminate on infinite input,
it needs at least O(length result) space and it can't stop early.
Additionally, foldl (++) [] constructs left-nested (++)-applications, which
are bad for performance.

> and the criterion that defines a group is that:
> "The sum of the first elements in the tuples comprising the list must be
> greater than or equal to 1.0". That is, given a list of tuples, the
> boolean predicate deciding whether this list is a PROPER group (True) or
> TOO SMALL (False) is:
> \g -> sum (map fst g)  >=  1.0
>
>
> Although the criterion is very clear, I've tried hard until now and
> couldn't come up with a function for producing the nested list based on
> this grouping criterion.

Split the task into two parts.

1. collect the first group and the remainder of the list as a pair of lists
2. cons the first group to what recursion on the remainder delivers

Very much like the definition of `lines'.

For your use-case, the condition when a group is complete doesn't only
depend on the list element but also on the previous, concretely the running
sum of the first components.
For a general such function, you need
- the stopping condition
- an accumulation parameter
- an update function for the accumulating parameter
as arguments.
For example (stupid names, sorry)


breakAcc :: (b -> Bool) -> (a -> b -> b) -> b -> [a] -> ([a],[a])
breakAcc _    _    _   []       = ([],[])
breakAcc cond comb acc xxs@(x:xs)
   | cond acc                  = ([],xxs)
   | otherwise                 = (x:ys,zs)
     where
       (ys, zs) = breakAcc cond comb (comb x acc) xs

solves part 1, then


groupsAcc :: (b -> Bool) -> (a -> b -> b) -> b -> [a] -> [[a]]
groupsAcc _    _    _   []  = []
groupsAcc cond comb acc xs  = let (g,tl) = breakAcc cond comb acc xs
                             in g : groupsAcc cond comb acc tl

gives you part 2, and your particular function is


groupAtoms :: [(Float,Integer)] -> [[(Float,Integer)]]
groupAtoms = groupsAcc (>= 1.0) ((+) . fst) 0


Of course the last group need not be proper.
If there are long groups, the above can cause a space leak (as lines had in
GHC < 7), that can be avoided by sacrificing a little non-strictness and
replacing the let (g,tl) = ... in with a

case ... of
 (g,tl) -> g : groupsAcc ...

or, if you need the non-strictness, with a slightly convoluted method as
used for lines in GHC 7.

> I am sure that the Haskell Hierarchical
> Libraries have something to help me, but only now I see I'm still a big
> noob :P
>
> Could someone please help me writing this function?
>
>
> My best regards from Brazil,
>
> João Paulo Pizani Flor
> joaopizani@gmail.com
> Federal University of Santa Catarina