Nice way to calculate character frequency in a string

Sorry to drag this thread out, but here's one more thing you might try... I was thinking, if we just wanted something like intTable :: [Int] -> [(Int, Int)] we could just replace Map with IntMap in the previous solution: intTable xs = IntMap.assocs $! foldl' f IntMap.empty xs where f m x = let m' = IntMap.insertWith (+) x 1 m Just v = IntMap.lookup x m' in v `seq` m' To get another polymorphic version, we could just write this wrapper: freq :: (Enum a) => [a] -> [(a,Int)] freq = map fstToEnum . intTable . map fromEnum where fstToEnum (x,y) = (toEnum x, y) This seems to run faster than the other polymorphic version on my machine. Chad Scherrer Computational Mathematics Group Pacific Northwest National Laboratory "Time flies like an arrow; fruit flies like a banana." -- Groucho Marx

Scherrer, Chad wrote:
Sorry to drag this thread out, but here's one more thing you might try...
(This is the café, isn't it? :-) Another option is perhaps to pack both char and count in one Int and use some kind of Set. This should save some space, and possibly time as well (presuming bitwise ops are less expensive than pointer dereferences, which I believe have been a safe assumption since the mid-90s), but requires a Set being searchable for ranges. (Do any of the implementations support this, BTW?) -k

On Thursday 27 October 2005 10:29, Ketil Malde wrote:
Scherrer, Chad wrote:
Sorry to drag this thread out, but here's one more thing you might try...
(This is the café, isn't it? :-)
Another option is perhaps to pack both char and count in one Int and use some kind of Set. This should save some space, and possibly time as well (presuming bitwise ops are less expensive than pointer dereferences, which I believe have been a safe assumption since the mid-90s), but requires a Set being searchable for ranges. (Do any of the implementations support this, BTW?)
It's interesting that you mention this. I have been working for some time now on a library implementation of 2-3 finger search trees (as described in a paper by Ralf Hinze, see http://www.cs.bonn.edu/~ralf/publications/IAI-TR-98-12.ps.gz). These beasts give you quite efficient split (aka partition) operations: complexity is amortized O(d,n-d), where d is the distance from the split key to the smallest element (or the largest, if you like). You can splice out a range using two consecutive single point splits. OTOH, I have been considering to implement a specialized range operation which could be even more efficient. You could also use the similar (but *much* easier to implement) annotated 2-3 finger /leaf/ trees (see http://www.cs.bonn.edu/~ralf/publications/FingerTrees.pdf). As far as I understood, these are not quite as good as the above mentioned node trees, but they differ only by constant factors. Ben
participants (3)
-
Benjamin Franksen
-
Ketil Malde
-
Scherrer, Chad