Cartesian product of a Set

newer
Some Haskell platformish questions

older
monads and groups -- instead of...

Andy Gimblett

1 Aug 2007 1 Aug '07

2:28 p.m.

Hi all, Is this a reasonable way to compute the cartesian product of a Set?

...

cartesian :: Ord a => S.Set a -> S.Set (a,a) cartesian x = S.fromList [(i,j) | i <- xs, j <- xs] where xs = S.toList x

It's a fairly "obvious" way to do it, but I wondered if there were any hidden gotchas. I'm particularly concerned by toList (O(n)) fromList (O(n log n)) - but for other reasons I'd really like to be using Set rather than List for this (I think). Many thanks for any thoughts, -Andy -- Andy Gimblett Computer Science Department University of Wales Swansea http://www.cs.swan.ac.uk/~csandy/

Show replies by date

Joachim Breitner

1 Aug 1 Aug

3:52 p.m.

Hi, Am Mittwoch, den 01.08.2007, 15:28 +0100 schrieb Andy Gimblett:

...

Hi all,

Is this a reasonable way to compute the cartesian product of a Set?

...
cartesian :: Ord a => S.Set a -> S.Set (a,a) cartesian x = S.fromList [(i,j) | i <- xs, j <- xs] where xs = S.toList x

It's a fairly "obvious" way to do it, but I wondered if there were any hidden gotchas. I'm particularly concerned by toList (O(n)) fromList (O(n log n)) - but for other reasons I'd really like to be using Set rather than List for this (I think).

Using only functions from the Set module, you could write:

...

catesian s = fold (\v prod -> prod `union` map ((,)v) s ) empty s

But this won’t be faster, I guess, as map is O(n·log n) and fold is O(n) OTOH, ((,)v) should be monotonic, so it might be faster (and hopefully still correct) to write:

...

catesian s = fold (\v prod -> prod `union` mapMonotonic ((,)v) s ) empty s

Greetings, Joachim -- Joachim Breitner e-Mail: mail@joachim-breitner.de Homepage: http://www.joachim-breitner.de ICQ#: 74513189

Dan Weston

8:42 p.m.

Andy Gimblett wrote:

...

Hi all,

Is this a reasonable way to compute the cartesian product of a Set?

...
cartesian :: Ord a => S.Set a -> S.Set (a,a) cartesian x = S.fromList [(i,j) | i <- xs, j <- xs] where xs = S.toList x

It's a fairly "obvious" way to do it, but I wondered if there were any hidden gotchas. I'm particularly concerned by toList (O(n)) fromList (O(n log n)) - but for other reasons I'd really like to be using Set rather than List for this (I think).

Many thanks for any thoughts,

-Andy

Your list comprehension always generates a sorted list, so changing S.fromList to its "unsafe" version (but guarateed by you) S.fromDistinctAscList should get you back to O(n). Of course the order of the generators was key (i before j). Dan

Andy Gimblett

9:22 p.m.

On Wed, Aug 01, 2007 at 01:42:52PM -0700, Dan Weston wrote:

...

Andy Gimblett wrote:

...
...
cartesian :: Ord a => S.Set a -> S.Set (a,a) cartesian x = S.fromList [(i,j) | i <- xs, j <- xs] where xs = S.toList x

Your list comprehension always generates a sorted list, so changing S.fromList to its "unsafe" version (but guarateed by you) S.fromDistinctAscList should get you back to O(n).

That'll do it: with that change, my program runs eight times faster; according to the profiler, it's gone from spending ~85% of its time in cartesian to ~0%. :-) I don't see why my list comprehension always generates a sorted list, however: toList generates a sorted list? It doesn't claim to in the docs. Do I perhaps need to use toAscList instead? I happen to have put my data into the Set in order in this example, so maybe I just lucked out? Or am I missing something?

...

Of course the order of the generators was key (i before j).

Lucky me. :-) Thanks! -Andy -- Andy Gimblett Computer Science Department University of Wales Swansea http://www.cs.swan.ac.uk/~csandy/

Rodrigo Queiro

9:37 p.m.

toList generates a sorted list since it is implemented as toList = toAscList, but it's probably bad form to rely on that. On 01/08/07, Andy Gimblett wrote:

...

On Wed, Aug 01, 2007 at 01:42:52PM -0700, Dan Weston wrote:

...
Andy Gimblett wrote:

...
...
cartesian :: Ord a => S.Set a -> S.Set (a,a) cartesian x = S.fromList [(i,j) | i <- xs, j <- xs] where xs = S.toList x

Your list comprehension always generates a sorted list, so changing S.fromList to its "unsafe" version (but guarateed by you) S.fromDistinctAscList should get you back to O(n).

That'll do it: with that change, my program runs eight times faster; according to the profiler, it's gone from spending ~85% of its time in cartesian to ~0%. :-)

I don't see why my list comprehension always generates a sorted list, however: toList generates a sorted list? It doesn't claim to in the docs. Do I perhaps need to use toAscList instead?

I happen to have put my data into the Set in order in this example, so maybe I just lucked out? Or am I missing something?

...
Of course the order of the generators was key (i before j).

Lucky me. :-)

Thanks!

-Andy

-- Andy Gimblett Computer Science Department University of Wales Swansea http://www.cs.swan.ac.uk/~csandy/ _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Dan Weston

11:34 p.m.

Since S.toAscList x is sorted (and yes, as another poster points out, you should use toAsciList to get that sorting guarantee), you can prove that your cartesian product list is also sorted. Data.Set will almost certainly always have access to the sorted list, but even if not, length x < length (cartesian x) for any nonempty list (n vs. n^2), so it's worth forcing it to sort. The Prelude defines the instance (Ord a, Ord b) => Ord (a, b), with the obvious lexicographic ordering that your list comprehension generates (fst varying more slowly than snd): sorted(xs) ==> sorted [(i,j) | i <- xs, j <- xs]

...

...
This is *not* true if the generators are reversed: BAD: sorted(xs) =/=> sorted [(i,j) | j <- xs, i <- xs]

This allows you to safely say: cartesian x = S.fromDistinctAscList [(i,j) | i <- xs, j <- xs] where xs = S.toAscList x -- fromDistinctAscList sorted precondition satisfied by construction with full confidence that (cartesian x) is a well-constructed set. It may be worth adding a comment that you have fulfilled your proof obligation in calling fromDistinctAscList, so others aren't left wondering. Dan Andy Gimblett wrote:

...

On Wed, Aug 01, 2007 at 01:42:52PM -0700, Dan Weston wrote:

...
Andy Gimblett wrote:

...
...
cartesian :: Ord a => S.Set a -> S.Set (a,a) cartesian x = S.fromList [(i,j) | i <- xs, j <- xs] where xs = S.toList x Your list comprehension always generates a sorted list, so changing S.fromList to its "unsafe" version (but guarateed by you) S.fromDistinctAscList should get you back to O(n).

That'll do it: with that change, my program runs eight times faster; according to the profiler, it's gone from spending ~85% of its time in cartesian to ~0%. :-)

I don't see why my list comprehension always generates a sorted list, however: toList generates a sorted list? It doesn't claim to in the docs. Do I perhaps need to use toAscList instead?

I happen to have put my data into the Set in order in this example, so maybe I just lucked out? Or am I missing something?

...
Of course the order of the generators was key (i before j).

Lucky me. :-)

Thanks!

-Andy

ok

2 Aug 2 Aug

4:15 a.m.

On 2 Aug 2007, at 2:28 am, Andy Gimblett wrote:

...

Is this a reasonable way to compute the cartesian product of a Set?

...
cartesian :: Ord a => S.Set a -> S.Set (a,a) cartesian x = S.fromList [(i,j) | i <- xs, j <- xs] where xs = S.toList x

Following up on my recent message about (ab)use of monadic operations, imagine presenting it this way: cartesian s = S.fromList $ liftM2 (,) s' s' where s' = S.toList s There are times to use monads and times not to. (One wonders whether S.Set is an instance of Monad, and if not, why not.)

...

It's a fairly "obvious" way to do it, but I wondered if there were any hidden gotchas. I'm particularly concerned by toList (O(n)) fromList (O(n log n)) - but for other reasons I'd really like to be using Set rather than List for this (I think).

In fact it's worse than that. Given a set with n elements, the Cartesian product of that set with itself has n**2 elements, so S.fromList must take O(n**2.log(n**2)) = O(n**2.log n) time. If sets are represented as ordered lists, then the list comprehension above generates a suitably ordered result. On the other hand, I've usually found that it pays to avoid explicitly constructing things like Cartesian products. Could that be the case here?

Andy Gimblett

9:14 a.m.

On Thu, Aug 02, 2007 at 04:15:35PM +1200, ok wrote:

...

On the other hand, I've usually found that it pays to avoid explicitly constructing things like Cartesian products. Could that be the case here?

Quite possibly, though for my purposes I don't _think_ it's worth routing around it. I'm using it in a naive (and intentionally so) algorithm to search for "local top elements" in a partial order. That is, we require that the PO satisfies the property: for all a,b,c in PO . a <= b and a <= c => exists d in PO . b <= d and c <= d where a is a "local bottom element" for b and c, and d is the corresponding "local top element". Given a PO, for every such a, I want the set of corresponding d's. If any such set is empty, a big red "NO" lights up elsewhere in the program. My algorithm is the simplest thing I could think of that works: represent the PO as a Set of (a,a), then simply search its cartesian product, a Set of ((a,a),(a,a)), for elements of the right shape. It works, in only a few lines of code, and really looks a lot like the definition given above. It also seems to be fast enough for reasonably sized examples - for a PO with ~40 elements it's instantaneous. That's at the bottom end of "realistic" for my application, and I need to check it for ~100 elements (which is more like my reality), but I think it ought to be fine/usable. If, with real data, it turns out to be too slow then yes, I'll ditch this naive method and look at graph algorithms, which is of course the Smart thing to do. However, it's beautiful (to me) code right now, which strongly reflects the definition of the problem, so I'd be happy not to. :-) Cheers, -Andy -- Andy Gimblett Computer Science Department University of Wales Swansea http://www.cs.swan.ac.uk/~csandy/

6557

Age (days ago)

6558

Last active (days ago)

List overview

Download

7 comments

5 participants

participants (5)

Andy Gimblett
Dan Weston
Joachim Breitner
ok
Rodrigo Queiro