
paolo.veronelli@gmail.com wrote:
I'm timing the following script.I'm not expert to evaluate th O'ness of this code, I hope someone can do it. The program clusters n integers in m buckets based on their distance. Anyway I thing should be linear.So I timed som executions changing the first arg. [...] mcluster :: [(Int,Int)] -> [(Int,[Int])] mcluster ls = let (lr,lc) = (f *** f) (unzip ls) where f = length.nub -- coo space width (k,r) = divMod lr lc CState cs _ _ _ = execState (devil ls k) CState{clusters = [],remi = r,colsHeap = constArray lc 0,rowsFlag = constArray lr False } in map collapse . groupBy (comp fst (==)) . sort . map swap $ cs
coupage ls = zip [0..] ls delta fxy xs ys = [(abs(x-y),(n,m))|(n,x) <- coupage xs, (m,y) <- coupage ys] decoupage ls n = fromJust $ lookup n (coupage ls)
test xs ys = let d = snd.unzip.sort $ delta (\x y -> abs (x -y)) xs ys in map (decoupage ys *** map (decoupage xs)) (mcluster d)
It isn't, but not for the reasons you might suspect. You're using 'nub', which is quadratic, and your 'coupage' is also quadratic because it uses 'lookup' on a list, which is linear, a linear number of times. You can get this down to O(n * log n) if you replace these lists by Data.Map and Data.Set, to get down to O(n) you need arrays there, too, but that would be pointless, because you're also using 'sort', which is already in O(n * log n). The core of the algorithm is clearly linear in the length of its input. (Btw, putting 'devil' into a state monad doesn't make much sense. I think, ordinary recursion would be more clear. In fact, it's a 'foldl'.) -Udo -- You're damned if you do; you're damned if you don't. -- Bart Simpson