Haskell hash tables are a notorious performance pig, mostly due to the fact that when we deal with big arrays, if the mutable array changes at all the garbage collector will have to retraverse the entire thing during the next collection. Guess the most common scenario for imperative hash tables that are even lightly tweaked from time to time... ;)
The code below is, I think, n log n, a few seconds on a million + element list.
I wonder if it's possible to get this down to O(N) by using a
hashtable implemementation, or other better data structure.
Further, is there a hashtable implementation for haskell that doesn't
live in IO? Maybe in ST or something?
import Data.HashTable
import qualified Data.Set as S
import Data.List (foldl')
testdata = [1,4,8,9,20,11,20,14,2,15] ++ [1..(10^6)]
wantedsum = 29
-- set data structure
-- findsums locates pairs of integers in a list that add up to a
wanted sum.
findsums :: [Int] -> Int -> S.Set (Int,Int)
findsums xs wanted = snd . foldl' f (S.empty,S.empty) $ xs
where f (candidates,successes) next = if S.member (wanted-next) candidates
then (candidates, S.insert
(next,wanted-next) successes)
else (S.insert next
candidates,successes)
-- hashtable data structure
-- result: t
-- fromList [(15,14),(16,13),(17,12),(18,11),(19,10),(20,9),(21,8),(22,7),(23,6),(24,5),(25,4),(26,3),(27,2),(28,1)]
-- probably O(n log n) complexity since using tree based Data.Set (a
few seconds on million+ element list)
t = findsums testdata wantedsum
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe