
At 14:00 04/06/03 +0100, Keith Wansbrough wrote:
Looks fine, but you're right - the use of "combinations" is inefficient. It's better to iterate over the elements, rather than the length. Then as you add each element, you consider all the sets you have so far, and either add the new element to each or not. This doubles the number of sets. Hence:
powerset :: [a] -> [[a]] powerset [] = [[]] powerset (x:xs) = xss ++ map (x:) xss where xss = powerset xs
Neat! It happens that for the application I have in mind, it is important to generate shorter subsets first, as the required results are almost always going to come from smaller subsets of a possibly large base set, and I'm aiming to exploit lazy evaluation to keep things tractable. Looking at your code, I'm wondering if it would be possible to use some alternate form of composition instead of ++, and some auxiliary functions to pull the results out in the short-to-long sequence. I'm thinking in terms of building a list of trees.. [[ data NTree a = NTree { nthead::a, ntbody::[NTree a] } instance Functor NTree where fmap f (NTree h ts) = NTree (f h) (map (fmap f) ts) powerset1 :: [a] -> [NTree [a]] powerset1 (x:xs) = (NTree [x] (map (fmap (x:)) xss)) : xss where xss = powerset1 xs powerset1 [] = [] listPowerset :: [NTree [a]] -> [[a]] listPowerset [] = [] listPowerset ts = (map nthead ts) ++ listPowerset bodylist where bodylist = concat $ filter (not . null) $ map ntbody ts testN1 = listPowerset $ powerset1 [1,2,3,4] testN2 = listPowerset $ powerset1 "abcdefgh" ]] The list/tree structure looks something like this: [1] [2] [2,1] [3] [3,1] [3,2] [3,2,1] [4] [4,1] [4,2] [4,2,1] [4,3] [4,3,1] [4,3,2] [4,3,2,1] etc. The list function picks off the members by columns (w.r.t. to above diag)
Notice how important it is to include the empty set in the set of subsets - it won't work at all if you omit it.
Yes, I noticed something similar in my original version. I've chosen not include the empty subset in my results, but that's easily adjusted.
This formulation is particularly nice because in memory, you *share* all of the lists from the previous iteration, rather than making copies.
I *think* my revised formulation achieves this. [...]
My solution isn't perfect, though - the use of append (++) is inefficient; if this could be avoided, it would be faster.
I didn't see any easy way to avoid (++) in my list readout, but I think I
can claim that the length of the leading list if never more than O(log N)
the tree size.
If I'm evaluating and using the list lazily, using a typical recursive
traversal pattern (like the powerset function itself), is there any cause
for the "++" to actually be evaluated? I suspect not, but can't be sure.
#g
-------------------
Graham Klyne