New subject: Grouping - Map / Reduce

27 Mar 2009

      Well, this approach has the problem that the running sum of key k blocks
until a new value for that k arrives in the input stream.
If you wanted to know the sum of the values of each key after you got
nelements in the input stream, we could change the valuesWithKey
inner function into:

runningSumsOfValuesPerKey :: (Eq k, Num v) => [k] -> [(k, v)] -> [[v]]
runningSumsOfValuesPerKey allPossibleKeys = runningSums . allValuesPerKey
  where
    runningSums = map (scanl (+) 0)
    allValuesPerKey pairs = [ valuesWithKey key pairs | key <-
allPossibleKeys ]
    valuesWithKey key = map (\(k,v) -> if k==key then v else 0)

then map (!!n) on the result of runningSumsOfValuesPerKey gives you the sum
after n elements arrived.

I think if you now generalize this so you don't use 0 but mempty, mconcat
and other Monoid methods, that you might get something like Luke's trie
solution, not sure, Luke is a fair bit smarter than I am :-)

But this code is very inefficient I'm afraid, I guess the blueprint stuff
that was posted really builds a map incrementally, but I don't understand
that yet.

Ik spreek Nederlands ja ('t is te zeggen, "Antwerps").

Yes I'm still learning Haskell, but I think with Haskell this is a never
ending process, since there's soo much stuff to discover and the language
evolves (which makes it both exciting and frustrating, but that's the
dilemma of knowledge anyway, the more you know the better you realize the
vast amount of knowledge that you don't know yet :-)

On Fri, Mar 27, 2009 at 12:53 AM, Guenther Schmidt wrote:
...
Dear Peter,
wow, thanks, this is a very ... interesting ... approach, I hadn't thought
about that yet ;)
Ben je nederlands?
In case you'd be interested to share the "road to Haskell" experience with
another newbie just ask.
Günther
Peter Verswyvelen schrieb:
I'm also learning Haskell so the solution below might be (1) inefficient
...
and (2) incorrect, but hey, let's give it a try :-)
For simplicity, in the testing code, I assume an infinite list of
key/value pairs where they keys are of type Char between 'a' and 'z' and the
values are Integers (the code also seems to work for keys with just a lower
bound but no upper bound)
 I think the code speaks for itself
import System.Random
 runningSumsOfValuesPerKey :: (Eq k, Num v) => [k] -> [(k, v)] -> [[v]]
runningSumsOfValuesPerKey allPossibleKeys = runningSums . allValuesPerKey
 where
   runningSums = map (scanl (+) 0)
   allValuesPerKey pairs = [ valuesWithKey key pairs | key <-
allPossibleKeys ]
   valuesWithKey key = map snd . filter ((==key) . fst)
 -- Testing
randomPairs :: [(Char, Integer)]
randomPairs = zip keys values
 where
   keys        = randomRs ('a','z') rnd1
   values      = randomRs (0,9) rnd2
   (rnd1,rnd2) = split (mkStdGen 0)
 test = map (take 10) [rs `atKey` 'c', rs `atKey` 'z']
 where
   rs = runningSumsOfValuesPerKey ['a'..] randomPairs
   xs `atKey` k = xs !! (fromEnum k - fromEnum 'a')

Re: [Haskell-cafe] Grouping - Map / Reduce

Peter Verswyvelen

jean-christophe mincke

tags

participants (2)