
Dear Luke, I suspect Black Magic at work here. This seems to work and I so don't have a clue why. But thank you very much nevertheless, I strongly suspect that, once I figured out why this works, I will have learned a very, very important trick indeed. Had I read "purely functional data structures" from start to finish, would I have come across this? Günther Luke Palmer schrieb:
On Tue, Mar 24, 2009 at 3:51 PM, Luke Palmer
mailto:lrpalmer@gmail.com> wrote: On Tue, Mar 24, 2009 at 3:15 PM, Gü?nther Schmidt
mailto:gue.schmidt@web.de> wrote: Hi,
let say I got an unordered lazy list of key/value pairs like
[('a', 99), ('x', 42), ('a', 33) ... ]
and I need to sum up all the values with the same keys.
So far I wrote a naive implementation, using Data.Map, foldl and insertWith..
The result of this grouping operation, which is effectively another list of key/value pairs, just sums this time, needs to be further processed.
The building of this map is of course a bottleneck, the successive processing needs to wait until the entire list is eventually consumed the Map is built and flattened again.
Is there another way of doing this, something more "streaming architecture" like?
Yeah, make a trie. Here's a quick example.
import Data.Monoid
newtype IntTrie a = IntTrie [a]
singleton :: (Monoid a) => Int -> a -> IntTrie a singleton ch x = IntTrie [ if fromIntegral ch == i then x else mempty | i <- [0..] ]
This definition of singleton unnecessarily leaks memory in some cases. Here's a better one:
singleton ch x = IntTrie $ replicate ch mempty ++ [x] ++ repeat mempty
Luke
lookupTrie :: IntTrie a -> Int -> a lookupTrie (IntTrie xs) n = xs !! n
instance (Monoid a) => Monoid (IntTrie a) where mempty = IntTrie (repeat mempty) mappend (IntTrie xs) (IntTrie ys) = IntTrie (infZipWith mappend xs ys)
infZipWith f ~(x:xs) ~(y:ys) = f x y : infZipWith f xs ys
test = mconcat [ singleton (n `mod` 42) [n] | n <- [0..] ] `lookupTrie` 10
This is an inefficient way to find the class of n such that n mod 42 = 10. Note that it works on an infinite list of inputs.
Here the "trie" was a simple list, but you could replace it with a more advanced data structure for better performace.
Luke
------------------------------------------------------------------------
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe