powerset

newer
Testing was Re: main modules in...

Graham Klyne

4 Jun 2003 4 Jun '03

11:08 a.m.

A brief search of the libraries didn't reveal (top me) a ready-to-roll powerset function. I thought this may be a useful exercise to see how well I'm picking up on functional programming idioms, so I offer the following for comment: [[ -- |Powerset of a list, in ascending order of size. -- Assumes the supplied list has no duplicate elements. powerSet :: [a] -> [[a]] powerSet as = foldl (++) [] [ combinations n as | n <- intRange 1 (length as) ] -- |Combinations of n elements from a list, each being returned in the -- order that they appear in the list. combinations :: Int -> [a] -> [[a]] combinations _ [] = [] -- Don't include empty combinations combinations n as@(ah:at) | n <= 0 = [[]] | n > length as = [] | n == length as = [as] | otherwise = (map (ah:) $ combinations (n-1) at) ++ (combinations n at) -- |Return list of integers from lo to hi. intRange :: Int -> Int -> [Int] intRange lo hi = take (hi-lo+1) (iterate (+1) 1) -- Tests testcomb0 = combinations 0 "abcd" -- [] testcomb1 = combinations 1 "abcd" -- ["a","b","c","d"] testcomb2 = combinations 2 "abcd" -- ["ab","ac","ad","bc","bd","cd"] testcomb3 = combinations 3 "abcd" -- ["abc","abd","acd","bcd"] testcomb4 = combinations 4 "abcd" -- ["abcd"] testcomb5 = combinations 5 "abcd" -- [] testpower = powerSet "abc" -- ["a","b","c","ab","ac","bc","abc"] ]] I think the recursive use of 'combinations' may be inefficient, and could maybe be improved by "memoizing"? #g ------------------- Graham Klyne PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E

Show replies by date

Keith Wansbrough

4 Jun 4 Jun

1 p.m.

...

A brief search of the libraries didn't reveal (top me) a ready-to-roll powerset function.

I thought this may be a useful exercise to see how well I'm picking up on functional programming idioms, so I offer the following for comment: [..] I think the recursive use of 'combinations' may be inefficient, and could maybe be improved by "memoizing"?

Looks fine, but you're right - the use of "combinations" is inefficient. It's better to iterate over the elements, rather than the length. Then as you add each element, you consider all the sets you have so far, and either add the new element to each or not. This doubles the number of sets. Hence: powerset :: [a] -> [[a]] powerset [] = [[]] powerset (x:xs) = xss ++ map (x:) xss where xss = powerset xs Notice how important it is to include the empty set in the set of subsets - it won't work at all if you omit it. This formulation is particularly nice because in memory, you *share* all of the lists from the previous iteration, rather than making copies. After doing "powerset [1,2,3,4]", the heap looks something like this: let a = [] b = 1:a c = 2:a d = 2:b e = 3:a f = 3:b g = 3:c h = 3:d i = 4:a j = 4:b k = 4:c l = 4:d m = 4:e n = 4:f o = 4:g p = 4:h in [a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p] Notice all the sharing - this is a very efficient representation! You save on copying, and you save on memory use. My solution isn't perfect, though - the use of append (++) is inefficient; if this could be avoided, it would be faster. --KW 8-)

Graham Klyne

8:04 p.m.

At 14:00 04/06/03 +0100, Keith Wansbrough wrote:

...

Looks fine, but you're right - the use of "combinations" is inefficient. It's better to iterate over the elements, rather than the length. Then as you add each element, you consider all the sets you have so far, and either add the new element to each or not. This doubles the number of sets. Hence:

powerset :: [a] -> [[a]] powerset [] = [[]] powerset (x:xs) = xss ++ map (x:) xss where xss = powerset xs

Neat! It happens that for the application I have in mind, it is important to generate shorter subsets first, as the required results are almost always going to come from smaller subsets of a possibly large base set, and I'm aiming to exploit lazy evaluation to keep things tractable. Looking at your code, I'm wondering if it would be possible to use some alternate form of composition instead of ++, and some auxiliary functions to pull the results out in the short-to-long sequence. I'm thinking in terms of building a list of trees.. [[ data NTree a = NTree { nthead::a, ntbody::[NTree a] } instance Functor NTree where fmap f (NTree h ts) = NTree (f h) (map (fmap f) ts) powerset1 :: [a] -> [NTree [a]] powerset1 (x:xs) = (NTree [x] (map (fmap (x:)) xss)) : xss where xss = powerset1 xs powerset1 [] = [] listPowerset :: [NTree [a]] -> [[a]] listPowerset [] = [] listPowerset ts = (map nthead ts) ++ listPowerset bodylist where bodylist = concat $ filter (not . null) $ map ntbody ts testN1 = listPowerset $ powerset1 [1,2,3,4] testN2 = listPowerset $ powerset1 "abcdefgh" ]] The list/tree structure looks something like this: [1] [2] [2,1] [3] [3,1] [3,2] [3,2,1] [4] [4,1] [4,2] [4,2,1] [4,3] [4,3,1] [4,3,2] [4,3,2,1] etc. The list function picks off the members by columns (w.r.t. to above diag)

...

Notice how important it is to include the empty set in the set of subsets - it won't work at all if you omit it.

Yes, I noticed something similar in my original version. I've chosen not include the empty subset in my results, but that's easily adjusted.

...

This formulation is particularly nice because in memory, you *share* all of the lists from the previous iteration, rather than making copies.

I *think* my revised formulation achieves this. [...]

...

My solution isn't perfect, though - the use of append (++) is inefficient; if this could be avoided, it would be faster.

I didn't see any easy way to avoid (++) in my list readout, but I think I can claim that the length of the leading list if never more than O(log N) the tree size. If I'm evaluating and using the list lazily, using a typical recursive traversal pattern (like the powerset function itself), is there any cause for the "++" to actually be evaluated? I suspect not, but can't be sure. #g ------------------- Graham Klyne PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E

Andrew J Bromage

5 Jun 5 Jun

4:53 a.m.

G'day all. On Wed, Jun 04, 2003 at 02:00:08PM +0100, Keith Wansbrough wrote:

...

This formulation is particularly nice because in memory, you *share* all of the lists from the previous iteration, rather than making copies. [...] Notice all the sharing - this is a very efficient representation! You save on copying, and you save on memory use.

I can never resist a can labelled "worms". Let me get out my tin opener... You do save on memory allocations. If, however, you consume the list lazily and discard the results as you consume them (which is the common way lazy programs are written), you actually use more memory at once. Try it if you don't believe me. Test it with this program, using each definition of powerset: summer :: [[a]] -> Integer summer xss = foldl' (\xs r -> r + toInteger (length xs)) 0 xss n :: Int n = 32 main :: IO () main = print (summer (powerset [1..n])) You'll find that one of them runs in O(n) space and the other most likely blows the heap. Cheers, Andrew Bromage

Mark P Jones

6 Jun 6 Jun

3:25 a.m.

| powerset :: [a] -> [[a]] | powerset [] = [[]] | powerset (x:xs) = xss ++ map (x:) xss | where xss = powerset xs Elegant as it is, this program causes a serious space leak. (In fact, it is often cited as an example of why functional programming language implementations might choose not to use common subexpression elimination.) Why? Because it holds on to the whole of xss until the first half of the resulting list has been generated. And xss gets big, fast. In Hugs, try: :set +g length (powerset [1..32]) and watch as your free heap disappears ... One way to fix this is to rewrite the second line in the definition of powerset as: powerset (x:xs) = powerset xs ++ map (x:) (powerset xs) Or, if duplicated computation offends you, replace (++) in the original version of powerset with an interleave operator: powerset :: [a] -> [[a]] powerset [] = [[]] powerset (x:xs) = xss /\/ map (x:) xss where xss = powerset xs (/\/) :: [a] -> [a] -> [a] [] /\/ ys = ys (x:xs) /\/ ys = x : (ys /\/ xs) These two variants both run in constant space (assuming that your compiler isn't "smart" enough to do common subexpr elimination :-) All the best, Mark

Graham Klyne

11:47 a.m.

At 20:25 05/06/03 -0700, Mark P Jones wrote:

...

Or, if duplicated computation offends you, replace (++) in the original version of powerset with an interleave operator:

powerset :: [a] -> [[a]] powerset [] = [[]] powerset (x:xs) = xss /\/ map (x:) xss where xss = powerset xs

(/\/) :: [a] -> [a] -> [a] [] /\/ ys = ys (x:xs) /\/ ys = x : (ys /\/ xs)

These two variants both run in constant space (assuming that your compiler isn't "smart" enough to do common subexpr elimination :-)

Interesting... Picking up my theme or generating the powersets in increasing order of length, I tried a variation on that: [[ powerset3 :: [a] -> [[a]] powerset3 [] = [[]] powerset3 (x:xs) = xss <<< map (x:) xss where xss = powerset3 xs (<<<) :: [[a]] -> [[a]] -> [[a]] [] <<< ys = ys xs <<< [] = xs (x:xs) <<< (y:ys) = if length x < length y then x:(xs <<< (y:ys)) else y:((x:xs) <<< ys) testJ1 = powerset3 [1,2,3,4] testJ2 = powerset3 "abcdefgh" ]] (The length-ordered interleave is a bit clumsy -- I think that could be improved by saving the length with each powerset as it's generated, or by other means.) Empirically, I notice that this still seems to leak *some* space compared with your version, but not nearly as much as the simple version. I also notice, empirically, that these interleaving versions invoke garbage collection much more frequently than the naive version. #g ------------------- Graham Klyne PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E

Christian Maeder

3:38 p.m.

powerset :: [a] -> [[a]] powerset [] = [[]] powerset (x:xs) = concatMap ( \ s -> s:[x:s]) (powerset xs) this variant behaves as well, doesn't it?

...

...
powerset :: [a] -> [[a]] powerset [] = [[]] powerset (x:xs) = xss /\/ map (x:) xss where xss = powerset xs

(/\/) :: [a] -> [a] -> [a] [] /\/ ys = ys (x:xs) /\/ ys = x : (ys /\/ xs)

These two variants both run in constant space (assuming that your compiler isn't "smart" enough to do common subexpr elimination :-)

...

Picking up my theme or generating the powersets in increasing order of length, I tried a variation on that:

powerset :: [a] -> [(Int, [a])] powerset [] = [(0, [])] powerset (x:xs) = myconcat $ map ( \ s -> (s, (fst s + 1, x: snd s))) $ powerset xs myconcat :: [((Int, [a]), (Int, [a]))] -> [(Int, [a])] myconcat [(a,b)] = [a, b] myconcat (x:r) = insert x $ myconcat r insert :: ((Int, [a]), (Int, [a])) -> [(Int, [a])] -> [(Int, [a])] insert (a@(i,_), b) l@(c@(j, _) : r) = if i < j then a : b : l else c : insert (a, b) r However, length (powerset [1..32]) in Hugs ends in an: ERROR - Control stack overflow Cheers Christian

...

[[ powerset3 :: [a] -> [[a]] powerset3 [] = [[]] powerset3 (x:xs) = xss <<< map (x:) xss where xss = powerset3 xs

(<<<) :: [[a]] -> [[a]] -> [[a]] [] <<< ys = ys xs <<< [] = xs (x:xs) <<< (y:ys) = if length x < length y then x:(xs <<< (y:ys)) else y:((x:xs) <<< ys)

testJ1 = powerset3 [1,2,3,4] testJ2 = powerset3 "abcdefgh" ]]

(The length-ordered interleave is a bit clumsy -- I think that could be improved by saving the length with each powerset as it's generated, or by other means.)

Empirically, I notice that this still seems to leak *some* space compared with your version, but not nearly as much as the simple version. I also notice, empirically, that these interleaving versions invoke garbage collection much more frequently than the naive version.

Liyang HU

4 Jun 4 Jun

2:08 p.m.

Hi Graham, On Wed, Jun 04, 2003 at 12:08:38PM +0100, Graham Klyne wrote:

...

I thought this may be a useful exercise to see how well I'm picking up on functional programming idioms, so I offer the following for comment:

...

foldl (++) [] [ combinations n as | n <- intRange 1 (length as) ]

By your use of the `intRange' function, I get the feeling you're still thinking somewhat imperatively. (It's awfully reminiscent of a for-loop...) Were you trying to write the function from some definition? (The set of all subsets of X with size <= |X| et c. You're looping over the size of the subset, per se...?) (Side note: I can think of few instances where you _need_ to deal with the length of a list directly -- more often than not you can (and probably should) let recursion take care of that. You can also write [1..length as] rather than use the intRange function, which looks prettier. :-) A key point is to try and think of how you can relate one case of the problem to a simpler instance of the same problem, rather than tackling it head on. Start by looking at the power set of a few small examples. The power set of the empty set is the size 1 set consisting of the empty set:

...

pset [] = [ [] ]

and a couple more:

...

pset [a] = [ [a], [] ] pset [b, a] = [ [b, a], [b], [a], [] ]

Notice how the `second half' of pset [b, a] is exactly pset [a]. Can you see anything that would relate the sets [b, a], [b] to [a], []? (Yes! Chop off the leading b! :) Let's try to generalise this: Take a set X, and an element y not in X. Denoting the power set function by P(), I hope you can see that P(X u {y}) certainly contains P(X). But no set in (read: member of) P(X) has y as a member, and funnily enough, if we add y to each element of P(X) we get missing bits of P(X u {y}). (The fact that the size of the power set is 2^|X| should serve as a hint -- you want to double the size of your power set for each element in X.) So we arrive at our solution:

...

pset [] = [ [] ] pset (x:xs) = let ps = pset xs in map (x:) ps ++ ps

Or at least, this is what would be going through my head if I were trying to write this. ^_^ Hope it helps a bit... later, /Liyang -- .--| Liyang HU |--| http://nerv.cx/ |--| Caius@Cam |--| ICQ: 39391385 |--. | :::::::::::::::::::::: This is not a signature. :::::::::::::::::::::: |

Graham Klyne

8:42 p.m.

At 15:08 04/06/03 +0100, Liyang HU wrote:

...

A key point is to try and think of how you can relate one case of the problem to a simpler instance of the same problem, rather than tackling it head on.

I think that's a good idea to hang on to. Sometimes easier to say than to do, it seems. Thanks, #g ------------------- Graham Klyne PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E

Jerzy Karczmarczuk

5 Jun 5 Jun

9:06 a.m.

Graham Klyne wrote:

...

At 15:08 04/06/03 +0100, Liyang HU wrote:

...
A key point is to try and think of how you can relate one case of the problem to a simpler instance of the same problem, rather than tackling it head on.

I think that's a good idea to hang on to. Sometimes easier to say than to do, it seems.

I permit myself to observe that your powerset problem (and the restricted length problem, i.e. the combinations) is usually solved in Prolog, through backtracking, using reasoning/style which adopts this "individualistic" philosophy. powerset(<source>,<possible result>) --- is the pattern. And the solution is powerset([],[]). Since nothing else can be done. Otherwise you pick the item or not. powerset([X|Rest],L) :- powerset(Rest,L). powerset([X|Rest],[X|L) :- powerset(Rest,L). The xxx ++ map (x :) xxx solution in Haskell is a particular formulation (and optimization) of the straightforward transformation from a version using the non-deterministic Monad. This one is really almost a carbon copy of the Prolog solution, with appropriate "lifting" of operations from individuals to lazy lists. Such things are sometimes easier to do than to describe... Jerzy Karczmarczuk

Graham Klyne

4:41 p.m.

At 11:06 05/06/03 +0200, Jerzy Karczmarczuk wrote:

...

I permit myself to observe that your powerset problem (and the restricted length problem, i.e. the combinations) is usually solved in Prolog, through backtracking, using reasoning/style which adopts this "individualistic" philosophy.

powerset(<source>,<possible result>) --- is the pattern. And the solution is

powerset([],[]). Since nothing else can be done. Otherwise you pick the item or not.

powerset([X|Rest],L) :- powerset(Rest,L). powerset([X|Rest],[X|L) :- powerset(Rest,L).

If my (rusty) Prolog serves, this will still fail to generate the shorter sequences before the longer ones, as I think that all powerset members not containing X must be generated before any that do contain X. (Or is that a different thread of discussion I'm introducing here?)

...

The xxx ++ map (x :) xxx solution in Haskell is a particular formulation (and optimization) of the straightforward transformation from a version using the non-deterministic Monad. This one is really almost a carbon copy of the Prolog solution, with appropriate "lifting" of operations from individuals to lazy lists.

It is the case that I'm finding it very easy to code solutions that work very much like Prolog backtracking using what I think is a form of "non-deterministic Monad" (e.g. a list, lazily evaluated, used to return the set of all results?) #g ------------------- Graham Klyne PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E

Graham Klyne

11 Jun 11 Jun

7:37 a.m.

New subject: Non-determinism, backtracking and Monads

At 11:06 05/06/03 +0200, Jerzy Karczmarczuk wrote:

...

I permit myself to observe that your powerset problem (and the restricted length problem, i.e. the combinations) is usually solved in Prolog, through backtracking, using reasoning/style which adopts this "individualistic" philosophy.

powerset(<source>,<possible result>) --- is the pattern. And the solution is

powerset([],[]). Since nothing else can be done. Otherwise you pick the item or not.

powerset([X|Rest],L) :- powerset(Rest,L). powerset([X|Rest],[X|L) :- powerset(Rest,L).

The xxx ++ map (x :) xxx solution in Haskell is a particular formulation (and optimization) of the straightforward transformation from a version using the non-deterministic Monad. This one is really almost a carbon copy of the Prolog solution, with appropriate "lifting" of operations from individuals to lazy lists.

I was thinking some more about this comment of yours, and my own experience with the ease of using lists to implement prolog-style generators, and think I come to some better understanding. If I'm right, I assume the following is common knowledge to experienced Haskell programmers. So, in the spirit of testing my understanding... The common thread here is a non-deterministic calculation in which there are several possible solutions for some problem. The goal is to find (a) if there are any solutions, and (b) one, more or all of the solutions. Prolog does this, absent ! (cut), by backtracking through the possible solutions. My first epiphany is that the Haskell idea of using a lazily evaluated list for result of a non-deterministic computation is pretty much the same thing (which is pretty much what you said? Is this what you mean by "the non-deterministic monad"?). The mechanisms for accessing a list mean that the solutions must be accessed in the order they are generated, just like Prolog backtracking. So there seems to be a very close relationship between the lazy list and non-deterministic computations, but what about other data structures? I speculate that other structures, lazily evaluated, may also be used to represent the results of non-deterministic computations, yet allow the results to be accessed in a different order. And these, too, may be (should be?) monads. If so, the Haskell approach might be viewed as a generalization of Prolog's essentially sequential backtracking. In a private message concerning the powerset thread on this list, a correspondent offered a program to evaluate the subsets in size order, which I found particularly elegant:

...

ranked_powerset :: [a] -> [[[a]]] ranked_powerset = takeWhile (not . null) . foldr next_powerset ([[]] : repeat [])

next_powerset :: a -> [[[a]]] -> [[[a]]] next_powerset x r = zipWith (++) ([] : map (map (x:)) r) r

powerset :: [a] -> [[a]] powerset = tail . concat . ranked_powerset

They also pointed out that "ranked_powerset is handy since you can use it to define combinatorial choice etc.":

...

choose :: Int -> [a] -> [[a]] choose k = (!! k) . ranked_powerset

So here is an example of a different structure (a list of lists) also used to represent a non-deterministic computation, and furthermore providing means to access the results in some order other than a single linear sequences (e.g. could be used to enumerate all the powersets containing the nth member of the base set, *or* all the powersets of a given size, without evaluating all of the other powersets). To test this idea, I think it should be possible to define a monad based on a simple tree structure, which also can be used to represent the results of a non-deterministic computation. An example of this is below, at the end of this message, which seems to exhibit the expected properties. So if the idea of representing a non-deterministic computation can be generalized from a list to a tree, why not to other data structures? My tree monad is defined wholly in terms of reduce and fmap. Without going through the exercise, I think the reduce function might be definable in terms of fmap for any data type of the form "Type (Maybe a)", hence applicable to a range of functors? In particular, I'm wondering if it can be applied to any gmap-able structure over a Maybe type. I'm not sure if this is of any practical use; rather it's part of my attempts to understand the relationship between functors and monads and other things functional. #g -- [[ -- spike-treemonad.hs data Tree a = L (Maybe a) | T { l,r :: Tree a } deriving Eq instance (Show a) => Show (Tree a) where show t = (showTree "" t) ++ "\n" showTree :: (Show a) => String -> Tree a -> String showTree _ (L Nothing ) = "()" showTree _ (L (Just a)) = show a showTree i (T l r) = "( " ++ (showTree i' l) ++ "\n" ++ i' ++ (showTree i' r) ++ " )" where i' = ' ':' ':i instance Functor Tree where fmap f (L Nothing) = L Nothing fmap f (L (Just a)) = L (Just (f a)) fmap f (T l r) = T (fmap f l) (fmap f r) reduce :: Tree (Tree a) -> Tree a reduce (L Nothing) = L Nothing reduce (L (Just (L Nothing ) )) = L Nothing reduce (L (Just (L (Just a)) )) = L (Just a) reduce (L (Just (T l r ) )) = T l r reduce (T l r) = T (reduce l) (reduce r) instance Monad Tree where -- L Nothing >>= k = L Nothing t >>= k = reduce $ fmap k t return x = L (Just x) fail s = L Nothing -- tests t1 :: Tree String t1 = T (L $ Just "1") (T (T (T (L $ Just "211") (L $ Just "311")) (L Nothing)) (L $ Just "22")) k1 :: a -> Tree a k1 n = T (L $ Just n) (L $ Just n) r1 = t1 >>= k1 k2 :: String -> Tree String k2 s@('1':_) = L $ Just s k2 s@('2':_) = T (L $ Just s) (L $ Just s) k2 _ = L Nothing r2 = t1 >>= k2 t3 = L Nothing :: Tree String r3 = t3 >>= k1 r4 = t1 >>= k1 >>= k2 r5 = (return "11") :: Tree String r6 = r5 >>= k1 -- Check out monad laws -- return a >>= k = k a m1a = (return t1) >>= k1 m1b = k1 t1 m1c = m1a == m1b -- m >>= return = m m2a = t1 >>= return m2b = t1 m2c = m2a == m2b -- m >>= (\x -> k x >>= h) = (m >>= k) >>= h m3a = t1 >>= (\x -> k1 x >>= k2) m3b = (t1 >>= k1) >>= k2 m3c = m3a == m3b ]] ------------------- Graham Klyne PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E

Andrew J Bromage

8:02 a.m.

New subject: Non-determinism, backtracking and Monads

G'day all. On Wed, Jun 11, 2003 at 08:37:48AM +0100, Graham Klyne wrote:

...

I was thinking some more about this comment of yours, and my own experience with the ease of using lists to implement prolog-style generators, and think I come to some better understanding.

You might find this amusing: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/hfl/hfl/mtl/Logic.hs?rev=1.2 This monad and monad transformer basically implement ground-moded logic programming, including if-then-else with soft cut. (It doesn't implement Prolog cut, but you really don't want it.)

...

So there seems to be a very close relationship between the lazy list and non-deterministic computations, but what about other data structures? I speculate that other structures, lazily evaluated, may also be used to represent the results of non-deterministic computations, yet allow the results to be accessed in a different order.

Yes. The different data structures would, in general I think, correspond to different search rules. Using a lazy list corresponds to depth-first search. Your tree monad actually returns the entire computation tree, which can then be traversed in depth-first order, breadth-first order, or whatever order you want. You have to be careful with monad transformers stacked on top of non-commutative monads, though. Most programmers would expect, in this code: (lift m1 `mplus` lift m2) `mplus` lift m3 that both m1 and m2 will be evaluated before m3; at least in circumstances where it mattered. Cheers, Andrew Bromage

Jerzy Karczmarczuk

10:36 a.m.

New subject: Non-determinism, backtracking and Monads

Graham Klyne wrote on the subject of powerset through backtracking:

...

The common thread here is a non-deterministic calculation in which there are several possible solutions for some problem. The goal is to find (a) if there are any solutions, and (b) one, more or all of the solutions.

Prolog does this, absent ! (cut), by backtracking through the possible solutions.

My first epiphany is that the Haskell idea of using a lazily evaluated list for result of a non-deterministic computation is pretty much the same thing (which is pretty much what you said? Is this what you mean by "the non-deterministic monad"?). The mechanisms for accessing a list mean that the solutions must be accessed in the order they are generated, just like Prolog backtracking.

Yes. I didn't invent any wheel, just pointed out that in these cases the remark "easier to say than to do" is too sad. The translation from Prolog may be really automatic, although the simple-minded result is polluted with (++) and concat.

...

So there seems to be a very close relationship between the lazy list and non-deterministic computations, but what about other data structures? I speculate that other structures, lazily evaluated, may also be used to represent the results of non-deterministic computations, yet allow the results to be accessed in a different order. And these, too, may be (should be?) monads.

Of course, one can produce lazy trees and other plexes. Most of them are naturally Functors, but in the general case I am not sure whether there is a natural way to implement >>= for any shape... === On the other hand there is a Monadic Niche elsewhere which is *very different* from the manipulation of lazy lists, and yet may be used for similar purposes. I mean: the continuation business. It is possible, instead of implementing the *data backtracking* through lazy lists, to make lazy "backtrackable" continuations, permitting to redirect the flow of control to produce something else. The two ways are - perhaps not entirely equivalent, but essentially two orchestrations of the same theme. I lost my references, perhaps somebody?... Jerzy Karczmarczuk

Andrew J Bromage

12 Jun 12 Jun

4:08 a.m.

New subject: Non-determinism, backtracking and Monads

G'day all. On Wed, Jun 11, 2003 at 12:36:30PM +0200, Jerzy Karczmarczuk wrote:

...

It is possible, instead of implementing the *data backtracking* through lazy lists, to make lazy "backtrackable" continuations, permitting to redirect the flow of control to produce something else. The two ways are - perhaps not entirely equivalent, but essentially two orchestrations of the same theme. I lost my references, perhaps somebody?...

If you're referring to the paper(s) by Ralf Hinze, they are most certainly equivalent. WARNING: Long post follows. Consider the simplified term implementation of a nondeterminism monad, which basically operates on lists: data Nondet1 a = Cons a (Nondet1 a) | Fail -- This is the "observer" method runNondet1 :: (Monad m) => Nondet1 a -> m a runNondet1 m = case m of Cons x _ -> return x Fail -> fail "no solutions" return a = Cons a Fail m >>= k = case m of Cons a n -> mplus (k a) (n >>= k) Fail -> Fail mzero = Fail mplus m n = case m of Cons a m' -> Cons a (mplus m' n) Fail -> n You can derive a continuation-passing implementation by transforming away the data structures. This is a technique well-known to practitioners of traditional lambda calculus. We'll start by abstracting the data structures out. We need replacements for both constructor functions (i.e. Cons and Fail) and the pattern matching used above. data Nondet1 a = Fail | Cons a (Nondet1 a) cons1 :: a -> Nondet1 a -> Nondet1 a cons1 a m = Cons a m fail1 :: Nondet1 a fail1 = Fail unpack1 :: Nondet1 a -> (a -> Nondet1 a -> b) -> b -> b unpack1 (Cons a m) c f = c a m unpack1 Fail c f = f The monad can now be re-implemented in terms of these operations: runNondet1 :: (Monad m) => Nondet1 a -> m a runNondet1 m = unpack1 m (\x _ -> return x) (fail "no solutions") return a = cons1 a fail1 m >>= k = unpack1 m (\a n -> k a `mplus` n >>= k) fail1 mzero = fail1 mplus m n = unpack1 m (\a m' -> cons1 a (mplus m' n)) n Note that there are now no data structures in here, only calls to fail1, cons1 and unpack1. We can implement these how we like so long as these properties hold: unpack1 fail1 c f = f unpack1 (cons1 x xs) c f = c x xs The lambda calculus solution is to make unpack1 the identity function. Unfortunately that doesn't entirely work in Haskell because of the type system, but we can get pretty close by using rank-2 types and a newtype constructor: -- Compare this with the type of unpack1 above newtype Nondet2 a = Nondet2 (forall b. (a -> Nondet2 a -> b) -> b -> b) fail2 :: Nondet2 a fail2 = Nondet2 (\c f -> f) cons2 :: a -> Nondet2 a -> Nondet2 a cons2 a m = Nondet2 (\c f -> c a m) unpack2 :: Nondet2 a -> (a -> Nondet2 a -> b) -> b -> b unpack2 (Nondet2 m) = m We can inline these functions everywhere to get: runNondet2 :: (Monad m) => Nondet2 a -> m a runNondet2 (Nondet2 m) = m (\x _ -> return x) (fail "no solutions") return a = Nondet2 (\c _ -> c a (Nondet2 (\_ f -> f))) (Nondet2 m) >>= k = m (\a n -> mplus (k a) (n >>= k)) (Nondet2 (\_ f -> f)) mzero = Nondet2 (\_ f -> f) mplus (Nondet2 m) n = m (\a m' -> Nondet2 (\c _ -> c a (mplus m' n))) n ...and we have a continuation-passing implementation. Note that this is not 100% identical to the one from Ralf's paper and tech report. Transforming the above code into Ralf's is left as an exercise. (It's tricky but mechanical.) Cheers, Andrew Bromage

Derek Elkins

4 Jun 4 Jun

11:50 p.m.

On Wed, 4 Jun 2003 15:08:44 +0100 Liyang HU wrote:

...

Hi Graham,

On Wed, Jun 04, 2003 at 12:08:38PM +0100, Graham Klyne wrote:

...
I thought this may be a useful exercise to see how well I'm picking up on functional programming idioms, so I offer the following for comment:

...
foldl (++) [] [ combinations n as | n <- intRange 1 (length as) ]

*cries out in pain and horror* fold_l_ (++) over combinatorially large lists! (++) has gotten a reputation for being slow. (++) isn't slow in and of itself, even using it a lot isn't slow, what -is- slow is using it left associatively. What happens then is that (a++b)++c builds a copy of a then tacks on b, then it builds a copy of a++b and tacks on c. In this case we've copied a twice when we should have only copied it once. Obviously for ((a++b)++c)++d it'll be copied three times and b twice and so forth. To add insult to injury, there is already a standard function that does what you want, concat, which is defined as foldr (++) [] in the report. In fact, you could rewrite the whole thing as concatMap (flip combinations as) [1..length as]. A list comprehension with only one source and no filters is the same as a map.

...

By your use of the `intRange' function, I get the feeling you're still thinking somewhat imperatively. (It's awfully reminiscent of a for-loop...) Were you trying to write the function from some definition? (The set of all subsets of X with size <= |X| et c. You're looping over the size of the subset, per se...?)

(Side note: I can think of few instances where you _need_ to deal with the length of a list directly -- more often than not you can (and probably should) let recursion take care of that. You can also write [1..length as] rather than use the intRange function, which looks prettier. :-)

Indeed, I think I've used length all of 3 times. You (Graham) also have some parentheses issues; e.g. in foo ++ (combinations 5 l) the parentheses are superfluous. (++) is slow though in that seemingly innocent uses can become n^2. A simple example is displaying a binary tree. A tree like /\ /\ will cause left associative uses of (++). Hence the prelude type ShowS = String -> String and shows :: Show a => a -> ShowS. The problem is we don't want left associativity, so what we do is make a context that says do everything else to the right, then this, e.g. "Foo"++everythingelse. This is simple enough to do with ("Foo"++). (As a side note, using the Writer/Output monad with the list/string monoid is probably not the best of ideas, instead you can use the function monoid and tell ("foo"++).) You can see that this technique is more or less just adding an accumulating parameter as you'll have to provide the 'initial' value.

Graham Klyne

5 Jun 5 Jun

9:03 a.m.

At 19:50 04/06/03 -0400, Derek Elkins wrote:

...

...
...
foldl (++) [] [ combinations n as | n <- intRange 1 (length as) ]

*cries out in pain and horror* fold_l_ (++) over combinatorially large lists! (++) has gotten a reputation for being slow. (++) isn't slow in and of itself, even using it a lot isn't slow, what -is- slow is using it left associatively. What happens then is that (a++b)++c builds a copy of a then tacks on b, then it builds a copy of a++b and tacks on c. In this case we've copied a twice when we should have only copied it once. Obviously for ((a++b)++c)++d it'll be copied three times and b twice and so forth. To add insult to injury, there is already a standard function that does what you want, concat, which is defined as foldr (++) [] in the report. In fact, you could rewrite the whole thing as concatMap (flip combinations as) [1..length as]. A list comprehension with only one source and no filters is the same as a map.

Thank you. I stand duly instructed... this commentary was the kind of feedback I was hoping to draw, though I did not mean to cause so much anguish :-)

...

...
You can also write [1..length as] rather than use the intRange function, which looks prettier. :-)

I agree with Liyang that [1.. ] is much prettier. That's one idiom I've yet to absorb. (I came to functional programming thinking that it was fundamentally simpler than conventional languages -- none of those complicated control structures to worry about, just expressions -- but the syntactic richness of Haskell seems to be quite beyond the conventional languages that I have used.)

...

Indeed, I think I've used length all of 3 times.

This is an interesting comment. I've found myself using length quite often, even when it feels not-quite-right to me. Maybe it's that I'm not yet used to designing algorithms functional-style, or alternative idioms that I'm overlooking. I'm not sure what I'm missing here, so this is a vague probe for further insight.

...

You (Graham) also have some parentheses issues; e.g. in foo ++ (combinations 5 l) the parentheses are superfluous.

I'm tempted to argue that being superfluous doesn't mean they shouldn't be there. This isn't just a functional programming issue... I find that when there are many levels of operator precedence it's easier to be explicit than to try and remember what they all are -- as much for reading the code as writing it in the first place. But maybe I'm still reading functional code in the wrong way? (I still scratch my head over some of the prelude/library functions, though it's getting easier.)

...

(++) is slow though in that seemingly innocent uses can become n^2. A simple example is displaying a binary tree. A tree like /\ /\ will cause left associative uses of (++). Hence the prelude type ShowS = String -> String and shows :: Show a => a -> ShowS. The problem is we don't want left associativity, so what we do is make a context that says do everything else to the right, then this, e.g. "Foo"++everythingelse. This is simple enough to do with ("Foo"++). (As a side note, using the Writer/Output monad with the list/string monoid is probably not the best of ideas, instead you can use the function monoid and tell ("foo"++).) You can see that this technique is more or less just adding an accumulating parameter as you'll have to provide the 'initial' value.

I'd just about figured the ShowS idea, but I've yet to get a handle on this idea of [a] 'monoid'. #g ------------------- Graham Klyne PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E

Keith Wansbrough

3:20 p.m.

...

...
You (Graham) also have some parentheses issues; e.g. in foo ++ (combinations 5 l) the parentheses are superfluous.

I'm tempted to argue that being superfluous doesn't mean they shouldn't be there. This isn't just a functional programming issue... I find that when there are many levels of operator precedence it's easier to be explicit than to try and remember what they all are -- as much for reading the code as writing it in the first place. But maybe I'm still reading functional code in the wrong way? (I still scratch my head over some of the prelude/library functions, though it's getting easier.)

This is a particular instance where you never need the parentheses... since it's a _functional_ language, _function application_ (the invisible symbol between combinations and 5, and between combinations 5 and l) binds tighter than anything else. The only time you need parentheses around a function application are when it is to protect it from a competing function application, such as when you are passing it as an argument to a function: map (map f) xss rather than map map f xss HTH. --KW 8-)

Graham Klyne

6:29 p.m.

At 16:20 05/06/03 +0100, Keith Wansbrough wrote:

...

This is a particular instance where you never need the parentheses... since it's a _functional_ language, _function application_ ... binds tighter than anything else.

Ah, fair point. I hadn't fully internalized that function application binds tighter than any (explicit) infix operator. (Reading it is one thing...) #g ------------------- Graham Klyne PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E

Liyang HU

5:44 p.m.

On Thu, Jun 05, 2003 at 10:03:55AM +0100, Graham Klyne wrote:

...

At 19:50 04/06/03 -0400, Derek Elkins wrote:

...
You (Graham) also have some parentheses issues; e.g. in foo ++ (combinations 5 l) the parentheses are superfluous. I'm tempted to argue that being superfluous doesn't mean they shouldn't be there.

Keith already made one argument for less parentheses, here's my take on the subject: Given that: (++) :: [a] -> [a] -> [a] -- contatenates lists foo, l :: [a] -- are lists bar :: Int -> [a] -> [a] -- produces a list Which of the following make sense?

...

(foo ++) bar 5 l -- No, because bar isn't a list, (foo ++ bar) 5 l -- and 5 and l would be applied to a list, (foo ++ bar 5) l -- (as opposed to a function,) which make no sense

...

foo (++ bar 5 l) -- Can't apply a function to a value, -- though (++ bar 5 l) foo has the same effect as what we intended

...

foo ++ (bar 5 l) -- which is just foo ++ bar 5 l

Because of the static type checking that takes place, you can't easily (not unless you were _trying_ ;) produce an ambiguous expression such that the removal of brackets keeps it well-typed, yet is not equivalent to the original. So as opposed to the C code where you (and I) would put in extra brackets `just to be sure', I wouldn't bother with them unless I know the expression's going to be ambiguous. (or if the compiler tells me so. ;-) (I suppose you could argue it's not necessarily obvious that foo is a list and bar is a 2-ary function of an Int and a list. My response would be to rename foo and bar so that this is the case. ;-)

...

I'd just about figured the ShowS idea, but I've yet to get a handle on this idea of [a] 'monoid'.

Might http://www.engr.mun.ca/~theo/Misc/haskell_and_monads.htm be of any help? later, /Liyang -- who managed to sneak into Category Theory lectures, but still has no idea what a monad is. ^_^; -- .--| Liyang HU |--| http://nerv.cx/ |--| Caius@Cam |--| ICQ: 39391385 |--. | :::::::::::::::::::::: This is not a signature. :::::::::::::::::::::: |

Graham Klyne

8 Jun 8 Jun

8:06 a.m.

At 18:44 05/06/03 +0100, Liyang HU wrote:

...

...
I'd just about figured the ShowS idea, but I've yet to get a handle on this idea of [a] 'monoid'.

Might http://www.engr.mun.ca/~theo/Misc/haskell_and_monads.htm be of any help?

Ah, thanks. I see: [[ A monoid is an algebraic structure consisting of a set S and an operation * with the following properties ... ]] which is a datum I was missing. (I still don't claim to understand it all, but at least I get a sense of whet the term monoid means.) #g ------------------- Graham Klyne PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E

Graham Klyne

13 Jun 13 Jun

6:03 p.m.

New subject: List comprehensions

In a response to me on the "powerset" thread, you wrote: At 19:50 04/06/03 -0400, Derek Elkins wrote:

...

In fact, you could rewrite the whole thing as concatMap (flip combinations as) [1..length as]. A list comprehension with only one source and no filters is the same as a map.

Is there any particular reason to avoid using a list comprehension, even if a map would do? I ask because I seem not infrequently to find that a list comprehension is more compact and easier to read, even though map could suffice. This may be when the applied function is relatively complex, and would otherwise require a sequence of 'where' definitions. My current example is this: [[ -- |Graph substitution function. -- This function performs the substitutions in 'vars', and -- replaces any nodes corresponding to unbound query variables -- with new blank nodes. rdfQuerySubsBlank :: RDFQueryBindings -> RDFGraph -> [RDFGraph] rdfQuerySubsBlank vars gr = [ remapLabels vs bs True g | v <- vars , let (g,vs) = rdfQuerySubs2 v gr , let bs = allLabels isBlank g ] ]] I could write it with map, but the ways I came up with all seemed convoluted and difficult to follow. #g ------------------- Graham Klyne PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E

8057

Age (days ago)

8066

Last active (days ago)

List overview

Download

21 comments

9 participants

participants (9)

Andrew J Bromage
Christian Maeder
Derek Elkins
Graham Klyne
Graham Klyne
Jerzy Karczmarczuk
Keith Wansbrough
Liyang HU
Mark P Jones

powerset

Keith Wansbrough

Andrew J Bromage

Mark P Jones

Liyang HU

Jerzy Karczmarczuk

Andrew J Bromage

Jerzy Karczmarczuk

Andrew J Bromage

Keith Wansbrough

Liyang HU

tags

participants (9)