
"Eray Ozkural (exa)" wrote:
Thanks for the suggestion, but isn't this a bit inefficient?
No, it's *quite* inefficient. ;-) If you want something faster, you can use standard techniques like tacking a unique ID onto each distinct set element and performing more expensive comparisons only when IDs match. Consider it a manual compare-by-reference implementation. For example, here's a quick variation on the code I used earlier. This version uses an [(ID, elem)] representation for sets (type UidSet) underneath the Hypergraph representation. Usage is straightforward: You convert a list of edge lists into the new representation HypergraphRep, perform set operations as usual, and when you want to inspect a result, call hgraphUnRep to convert the representation to the normal Set-of-Sets form. I'm not sure if this is appropriate for your application, but it ought to give you a few ideas about what is possible without only minor tinkering. The disadvantage of this representation is that you must thread a unique-ID through set creation. You could use an unsafe variant if you wanted to eliminate the manual threading. ==== begin ==== module Hypergraph where import Set type Hypergraph a = Set (Set a) type UidSet a = Set (Int, a) type HypergraphRep a = UidSet (UidSet a) instance Ord a => Ord (Set a) where x <= y = (setToList x) <= (setToList y) compare x y = compare (setToList x) (setToList y) instance Show a => Show (Set a) where showsPrec p x = showsPrec p (setToList x) hgraph :: Ord a => [[a]] -> Hypergraph a hgraph = mkSet . map mkSet mkUidSet :: Ord a => Int -> [a] -> (Int, UidSet a) mkUidSet uid xs = (uid + length xs, mkSet (zip [uid..] xs)) hgraphR :: Ord a => Int -> [[a]] -> (Int, HypergraphRep a) hgraphR uid = hgraphR' (uid, []) hgraphR' :: Ord a => (Int, [UidSet a]) -> [[a]] -> (Int, HypergraphRep a) hgraphR' (uid, zs) [] = mkUidSet uid zs hgraphR' (uid, zs) (x:xs) = hgraphR' (uid', z:zs) xs where (uid', z) = mkUidSet uid x hgraphUnRep :: Ord a => HypergraphRep a -> Hypergraph a hgraphUnRep = mapSet (mapSet snd . snd) -- compare the two methods -- first, the normal representation bigHSet = hgraph [ [1..x] | x <- [1..1000] ] lilHSet = hgraph [ [0..x] | x <- [1..10] ] -- second, the new UidSet representation uid = 0 (uid', bigHSetR) = hgraphR uid [ [1..x] | x <- [1..1000] ] (uid'',lilHSetR) = hgraphR uid' [ [0..x] | x <- [1..10] ] ==== end ==== As a quick (and contrived) performance comparison, I computed the interection of two "disjoint" Hypergraphs. The first timing uses the normal representation, and the second the new variant. $ ghci -package data ___ ___ _ / _ \ /\ /\/ __(_) / /_\// /_/ / / | | GHC Interactive, version 5.00.2. [...] / /_\\/ __ / /___| | http://www.haskell.org/ghc/ \____/\/ /_/\____/|_| Type :? for help. Loading package std ... linking ... done. Loading package lang ... linking ... done. Loading package concurrent ... linking ... done. Loading package posix ... linking ... done. Loading package util ... linking ... done. Loading package data ... linking ... done. Prelude> :load Hypergraph Compiling Hypergraph ( Hypergraph.hs, interpreted ) Ok, modules loaded: Hypergraph. Hypergraph> :set +s Hypergraph> bigHSet `intersect` lilHSet [] (39.84 secs, 1058269360 bytes) Hypergraph> hgraphUnRep $ bigHSetR `intersect` lilHSetR [] (1.82 secs, 28345104 bytes) Cheers, Tom