
Dear Haskell Cafe, Given a set of sets, and a particular target set, I want to find the sets that are nearest (in terms of Hamming distance) to the target set. I am using the following code: import Data.List import qualified Data.Set as Set nearest_k :: Ord a => Int -> [(Set.Set a, v)] -> Set.Set a -> [(Set.Set a, v)] nearest_k k bs b = take k bs' where bs' = sortOn (hamming b) bs hamming :: Ord a => Set.Set a -> (Set.Set a, v) -> Int hamming x (y, _) = hamming_distance x y hamming_distance :: Ord a => Set.Set a -> Set.Set a -> Int hamming_distance xs ys = Set.size (Set.difference xs ys) + Set.size (Set.difference ys xs) subsets :: [a] -> [[a]] subsets [] = [[]] subsets (x:xs) = subsets xs ++ map (x:) (subsets xs) int_lists :: [[Int]] int_lists = subsets [1..20] values :: [(Set.Set Int, Int)] values = map f (zip [1..] int_lists) where f (i, x) = (Set.fromList x, i) test = nearest_k 8 values (Set.fromList [1,2,3]) ---- This works ok for the test above (with sets of ints), but is rather slow in my actual application (in which the sets are large sets of ground atoms of first-order logic). Is there some major optimization I should be doing here? thanks, Richard