
Francesco Bochicchio wrote:
But, as you said, appending is expensive in haskell
We can say how expensive. Namely, fully evaluating xs ++ ys takes O(length xs) time. Always appending an element at the end will lead to something like (..((([1] ++ [2]) ++ [3]) ++ [4]) ++ ..) ++ [n] which will take O(n^2) time instead of linear time. This is probably best demonstrated with the following two implementations of reverse reverse1 [] = [] reverse1 (x:xs) = reverse1 xs ++ [x] reverse2 xs = go [] xs where go ys [] = ys go ys (x:xs) = go (x:ys) xs The first runs in quadratic time while second runs in linear time.
so instead I build the lists using (:), which gives me the list in reverse order, and then I use the 'final step' of the function - the base case of the recursion - to reverse the result. If I understand laziness, this should be a low-cost operation because it does not actual build a reversed list, but only make the list to be 'consumed' from the tail.
No, reverse has nothing to do with laziness and still costs O(n) time. It's just that building the list in linear time and then reversing it in linear time is cheaper than building the list in quadratic time.
Here is the code I was referring to:
groups :: (a->Bool)-> [a] -> ([[a]],[a]) groups f lst = groups_ f ([],[]) lst where groups_ f (gs,seps) [] = (reverse gs, reverse seps) groups_ f (gs,seps) [a] = if (f a) then ( reverse ( [a]:gs), reverse seps ) else ( reverse gs , reverse (a:seps) ) groups_ f (gs, seps) lst = groups_ f (g:gs, r:seps) est where (g, (r:est)) = groupWhile f lst
While using reverse instead of appending at the end of the list is a good idea, not building the list in reverse at all is much better. First, let me restate your code as groups p xs = go ([],[]) xs where go (gs,seps) xs = let (g,rest) = span p xs in case rest of r:est -> go (g:gs, r:seps) est [] -> (reverse (g:gs), reverse seps) The groupWhile function can be found in the Prelude, it's called span . This function groups differs from yours, for instance we have groups odd [2,2,2] == ([[],[],[],[]],[2,2,2]) groups odd [1,2] == ([[1],[]],[2]) groups odd [1,1] == ([[1,1]],[]) By the way, your code crashes in the last case. Now, let's reformulate it so that the result won't be built in reverse. The key is to eliminate the unnecessary accumulating parameter and work with the result of the recursive call to groups directly: groups p xs = let (g,rest) = span p xs in case rest of r:est -> let (gs,seps) = groups p est in (g:gs,r:seps) [] -> ([g], []) I'd say that this is the canonical implementation. In case you wrote this function to split strings, have a look at http://hackage.haskell.org/cgi-bin/hackage-scripts/package/split Regards, apfelmus -- http://apfelmus.nfshost.com