List comparisons and permutation group code

older
getting a DB library working with...

Mikael Johansson

19 Oct 2006 19 Oct '06

2:03 p.m.

Comparing the code for permutationgropus at http://www.polyomino.f2s.com/david/haskell/codeindex.html with my own thoughts on the matter, I discover the one line to figure out whether a specific list represents the identity: isIdentity (PL xs) = all (\(i,j) -> i==j) (zip [1..] xs) Is there any sort of benefit to be won by using this construction instead of isIdentity (PL xs) = xs == [1..(length xs)] and if so, what? Best, -- Mikael Johansson | To see the world in a grain of sand mikael@johanssons.org | And heaven in a wild flower http://www.mikael.johanssons.org | To hold infinity in the palm of your hand | And eternity for an hour

Show replies by date

Mikael Johansson

19 Oct 19 Oct

2:09 p.m.

On Thu, 19 Oct 2006, Mikael Johansson wrote:

...

Comparing the code for permutationgropus at http://www.polyomino.f2s.com/david/haskell/codeindex.html with my own thoughts on the matter, I discover the one line to figure out whether a specific list represents the identity:

isIdentity (PL xs) = all (\(i,j) -> i==j) (zip [1..] xs)

Is there any sort of benefit to be won by using this construction instead of

isIdentity (PL xs) = xs == [1..(length xs)]

and if so, what?

At some point in the future, I'll learn to think more before I post. Say isIdentity xs = all (\(i,j) -> i==j) (zip [1..] xs) isIdentity' xs = xs == [1..(length xs)] Then isIdentity 1:3:2:[4..100000] finishes in an instant, whereas isIdentity' 1:3:2:[4..100000] takes noticable time before completing. So it's a question of getting laziness to work for you. -- Mikael Johansson | To see the world in a grain of sand mikael@johanssons.org | And heaven in a wild flower http://www.mikael.johanssons.org | To hold infinity in the palm of your hand | And eternity for an hour

Henning Thielemann

4:27 p.m.

On Thu, 19 Oct 2006, Mikael Johansson wrote:

...

On Thu, 19 Oct 2006, Mikael Johansson wrote:

...
Comparing the code for permutationgropus at http://www.polyomino.f2s.com/david/haskell/codeindex.html with my own thoughts on the matter, I discover the one line to figure out whether a specific list represents the identity:

isIdentity (PL xs) = all (\(i,j) -> i==j) (zip [1..] xs)

Is there any sort of benefit to be won by using this construction instead of

isIdentity (PL xs) = xs == [1..(length xs)]

and if so, what?

At some point in the future, I'll learn to think more before I post. Say

isIdentity xs = all (\(i,j) -> i==j) (zip [1..] xs) isIdentity' xs = xs == [1..(length xs)]

Then isIdentity 1:3:2:[4..100000] finishes in an instant, whereas isIdentity' 1:3:2:[4..100000] takes noticable time before completing.

So it's a question of getting laziness to work for you.

Indeed. The first version works also for infinite lists. That is, if the infinite list does not represent the identity, the function will eventually return False, otherwise it runs forever. Btw. you can simplify this version to: isIdentity (PL xs) = and (zipWith (==) [1..] xs)

David House

4:51 p.m.

On 19/10/06, Mikael Johansson wrote:

...

isIdentity xs = all (\(i,j) -> i==j) (zip [1..] xs) isIdentity' xs = xs == [1..(length xs)]

Then isIdentity 1:3:2:[4..100000] finishes in an instant, whereas isIdentity' 1:3:2:[4..100000] takes noticable time before completing.

Why is this so? I'd have thought that the equality function for lists only forces evaluation of as many elements from its arguments as to determine the answer. In other words, the computation should go something like this: (We're comparing let xs = 1:3:2:[4..100000] in xs == [1..length xs]) <thunk> == <thunk> 1:<thunk> == 1:<thunk> (Evaluate first element to reveal a cons cell) 1:3:<thunk> == 1:2:<thunk> (Evaluate second element) False Why doesn't this happen? This is how I imagine the computation unfolding, drawing upon the definitions of == and &&: (1): [] == [] = True (2): (x:xs) == (y:ys) = x == y && xs == ys (3): _xs == _ys = False (1): True && x = x (2): False && _ = False xs == ys x:xs == y:ys (Evaluate cons cell) x == y && xs == ys (Equation (2) of ==) 1 == 1 && xs == y (Evaluate head of lists) True && xs == ys xs == ys (Equation (1) of &&) x:xs == y:ys (Evaluate next cons cell) x == y && xs == ys 3 == 2 && xs == ys (Evaluate next elements) False && xs == ys False (Equation (2) of &&) As an aside, here's output from Hugs that shows the difference quite noticably: Hugs.Base> let xs = 1:3:2:[4..100000] in xs == [1..length xs] False (3400043 reductions, 4396061 cells, 5 garbage collections) Hugs.Base> let xs = 1:3:2:[4..100000] in all (uncurry (==)) (zip [1..] xs) False (70 reductions, 148 cells) -- -David House, dmhouse@gmail.com

Cale Gibbard

5:37 p.m.

On 19/10/06, David House wrote:

...

On 19/10/06, Mikael Johansson wrote:

...
isIdentity xs = all (\(i,j) -> i==j) (zip [1..] xs) isIdentity' xs = xs == [1..(length xs)]

Then isIdentity 1:3:2:[4..100000] finishes in an instant, whereas isIdentity' 1:3:2:[4..100000] takes noticable time before completing.

Why is this so? I'd have thought that the equality function for lists only forces evaluation of as many elements from its arguments as to determine the answer.

In order to determine if [1..length xs] has an element at all, you have to evaluate length xs, which involves forcing the entire spine of xs, because integers can't be partially evaluated. Computing lengths of lists is a great way to introduce strictness.

Tomasz Zielonka

7:27 p.m.

On Thu, Oct 19, 2006 at 01:37:16PM -0400, Cale Gibbard wrote:

...

...
Why is this so? I'd have thought that the equality function for lists only forces evaluation of as many elements from its arguments as to determine the answer.

In order to determine if [1..length xs] has an element at all, you have to evaluate length xs, which involves forcing the entire spine of xs, because integers can't be partially evaluated. Computing lengths of lists is a great way to introduce strictness.

Right, so if Ints were represented as a datatype with Succ and Zero constructors (so integers could be partially evaluated), then the version with length would behave nicely on large and infinite lists :-) Best regards Tomasz

Nils Anders Danielsson

8:44 p.m.

On Thu, 19 Oct 2006, Tomasz Zielonka wrote:

...

On Thu, Oct 19, 2006 at 01:37:16PM -0400, Cale Gibbard wrote:

...
In order to determine if [1..length xs] has an element at all, you have to evaluate length xs, which involves forcing the entire spine of xs, because integers can't be partially evaluated. Computing lengths of lists is a great way to introduce strictness.

Right, so if Ints were represented as a datatype with Succ and Zero constructors (so integers could be partially evaluated), then the version with length would behave nicely on large and infinite lists :-)

Using genericLength for unary, lazy natural numbers can be convenient for other tasks as well, for instance choosing the shorter of two lists in a simple and lazy way. See "Modular Lazy Search for Constraint Satisfaction Problems", Nordin and Tolmach, http://web.cecs.pdx.edu/~apt/, around page 25, for another (related) example. -- /NAD

Nicolas Frisby

9:18 p.m.

It's nice to have that pointed out; I'm always forgeting that there's a representation optimization going on when using Ints/Integers for naturals. This Peano approach makes the length check no longer strict in the spine of its input. xs is consumed lazily, [1..natLength xs] is produced lazily, and thus isIdentity' works lazily. Of course [1...natLength xs] would have to elaborate to some catamorphism on Nat:

...

data Nat = Succ Nat | Zero

[1...nat] = cataPhi nat

cataPhi Zero = [] cataPhi (Succ n) = 1 : map (+1) (cataPhi n)

or a List-anamorphism with Nat's in the state-space

...

data List a = Cons a | Nil -- pretending built-in [] works like this

[1...nat] = ana psi (nat, 1) where psi (Zero, _) = Nil psi (Succ n, x) = Cons x (n, x+1)

Unfortunately Enum and Num are not granular enough to welcome Nat as an instance, so the [1...Nat] syntax couldn't elaborate thusly today. I'm sure I'm mentioning things (numeric type classes) here we've already discussed... sorry if this is all old hat. I think the cata/ana perspective may highlight the preservation of laziness during composition issues. Composing particular omega-morphisms has some theory--am I off in the woods to think it might apply? It's a bit foggy still. Thanks, Nick On 10/19/06, Tomasz Zielonka wrote:

...

On Thu, Oct 19, 2006 at 01:37:16PM -0400, Cale Gibbard wrote:

...
...
Why is this so? I'd have thought that the equality function for lists only forces evaluation of as many elements from its arguments as to determine the answer.

In order to determine if [1..length xs] has an element at all, you have to evaluate length xs, which involves forcing the entire spine of xs, because integers can't be partially evaluated. Computing lengths of lists is a great way to introduce strictness.

Right, so if Ints were represented as a datatype with Succ and Zero constructors (so integers could be partially evaluated), then the version with length would behave nicely on large and infinite lists :-)

Best regards Tomasz _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Robert Dockins

6:28 p.m.

On Oct 19, 2006, at 12:51 PM, David House wrote:

...

On 19/10/06, Mikael Johansson wrote:

...
isIdentity xs = all (\(i,j) -> i==j) (zip [1..] xs) isIdentity' xs = xs == [1..(length xs)]

Then isIdentity 1:3:2:[4..100000] finishes in an instant, whereas isIdentity' 1:3:2:[4..100000] takes noticable time before completing.

Why is this so? I'd have thought that the equality function for lists only forces evaluation of as many elements from its arguments as to determine the answer. In other words, the computation should go something like this:

I wondered this too for a minute. I'm pretty sure that the answer is that the 'length' function is the culprit, not (==). Calling 'length' forces the spine of 'xs', which accounts for the extra computation. Just say 'no' to length (when you want laziness). [snip]

...

-- -David House, dmhouse@gmail.com

Rob Dockins Speak softly and drive a Sherman tank. Laugh hard; it's a long way to the bank. -- TMBG

Nicolas Frisby

7:40 p.m.

I may have missed this in the discussion so far, but it seems we could use a summary. In short: isIdentity does not check for exact equivalence, only a prefix equivalence. That's why it doesn't exhibit the same time/space behavior as a reformulation based on full equivalence. More verbosely: isIdentity works lazily because it effectively determines if the list xs has the same prefix as the infinite list [1..]. It is not actually an equivalence check. But isIdentity' is an equivalence check and it must construct the finite list [1..(length xs)]. As has been discussed, the length demands the spine of the entire xs list, thereby incurring the delay you originally noticed. Nick On 10/19/06, Robert Dockins wrote:

...

On Oct 19, 2006, at 12:51 PM, David House wrote:

...
On 19/10/06, Mikael Johansson wrote:

...
isIdentity xs = all (\(i,j) -> i==j) (zip [1..] xs) isIdentity' xs = xs == [1..(length xs)]

Then isIdentity 1:3:2:[4..100000] finishes in an instant, whereas isIdentity' 1:3:2:[4..100000] takes noticable time before completing.

Why is this so? I'd have thought that the equality function for lists only forces evaluation of as many elements from its arguments as to determine the answer. In other words, the computation should go something like this:

I wondered this too for a minute. I'm pretty sure that the answer is that the 'length' function is the culprit, not (==). Calling 'length' forces the spine of 'xs', which accounts for the extra computation.

Just say 'no' to length (when you want laziness).

[snip]

...
-- -David House, dmhouse@gmail.com

Rob Dockins

Speak softly and drive a Sherman tank. Laugh hard; it's a long way to the bank. -- TMBG

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Brandon Moore

7:51 p.m.

Nicolas Frisby wrote:

...

I may have missed this in the discussion so far, but it seems we could use a summary.

In short: isIdentity does not check for exact equivalence, only a prefix equivalence. That's why it doesn't exhibit the same time/space behavior as a reformulation based on full equivalence.

Both versions check whether the provided list matches a prefix of [1..], it's just that the formulation with == is written to construct the prefix and then compare, while the version with zipWith (==) relies on zip taking just a prefix of the longer list. The reason the version using == is bad is because it is strict in the (spine of) the first list, because you need to compute length xs before you can begin constructing [1..length xs]. if you arrange to lazily construct the reference list, the functions should be roughly equivalent: isIdentity xs = xs == takeLengthOf xs [1..] where takeLengthOf xs ys = zipWith const ys xs for finite lists, takeLengthOf xs ys == take (length xs) ys Brandon

David House

8:38 p.m.

On 19/10/06, Brandon Moore wrote:

...

isIdentity xs = xs == takeLengthOf xs [1..] where takeLengthOf xs ys = zipWith const ys xs

You probably mean zipWith (flip const) xs ys.

...

for finite lists, takeLengthOf xs ys == take (length xs) ys

This ruins the laziness again: Hugs.Base> let takeLengthOf xs ys = take (length xs) ys; isIdentity xs = xs == takeLengthOf xs [1..] in isIdentity (1:3:2:[4..10000]) False (210064 reductions, 278075 cells) Hugs.Base> let takeLengthOf = zipWith (flip const); isIdentity xs = xs == takeLengthOf xs [1..] in isIdentity (1:3:2:[4..1000]) False (60 reductions, 114 cells) -- -David House, dmhouse@gmail.com

Brandon Moore

9:17 p.m.

David House wrote:

...

On 19/10/06, Brandon Moore wrote:

...
isIdentity xs = xs == takeLengthOf xs [1..] where takeLengthOf xs ys = zipWith const ys xs

You probably mean zipWith (flip const) xs ys. Either way, as long as I didn't write "zipWith const xs ys".

...
for finite lists, takeLengthOf xs ys == take (length xs) ys

This ruins the laziness again:

Hugs.Base> let takeLengthOf xs ys = take (length xs) ys; isIdentity xs = xs == takeLengthOf xs [1..] in isIdentity (1:3:2:[4..10000]) False (210064 reductions, 278075 cells) Hugs.Base> let takeLengthOf = zipWith (flip const); isIdentity xs = xs == takeLengthOf xs [1..] in isIdentity (1:3:2:[4..1000]) False (60 reductions, 114 cells)

It's meant as an explanation, and partial specification: Prelude> let takeLength xs ys = zipWith const ys xs Prelude> Test.QuickCheck.test (\xs ys -> takeLength xs ys == take (length xs) ys) OK, passed 100 tests. This reminds me that QuickCheck only generates finite lists. Brandon

Tomasz Zielonka

11:01 p.m.

On Thu, Oct 19, 2006 at 04:03:38PM +0200, Mikael Johansson wrote:

...

isIdentity (PL xs) = all (\(i,j) -> i==j) (zip [1..] xs)

isIdentity (PL xs) = xs == [1..(length xs)]

How about a compromise? isIdentity (PL xs) = xs == zipWith const [1..] xs Best regards Tomasz

Henk-Jan van Tuyl

20 Oct 20 Oct

4:08 p.m.

How about this: import List isIdentity (PL xs) = xs `isPrefixOf` [1..] ? Best regards, Henk-Jan van Tuyl On Fri, 20 Oct 2006 01:01:33 +0200, Tomasz Zielonka wrote:

...

On Thu, Oct 19, 2006 at 04:03:38PM +0200, Mikael Johansson wrote:

...
isIdentity (PL xs) = all (\(i,j) -> i==j) (zip [1..] xs)

isIdentity (PL xs) = xs == [1..(length xs)]

How about a compromise?

isIdentity (PL xs) = xs == zipWith const [1..] xs

Best regards Tomasz _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

-- Met vriendelijke groet, Henk-Jan van Tuyl -- http://Van.Tuyl.eu/ -- Using Opera's revolutionary e-mail client: https://secure.bmtmicro.com/opera/buy-opera.html?AID=789433

David House

4:28 p.m.

On 20/10/06, Henk-Jan van Tuyl wrote:

...

How about this: import List isIdentity (PL xs) = xs `isPrefixOf` [1..] ?

Nice! Short, lazy and says what it does (onomatopoeic code? :)). -- -David House, dmhouse@gmail.com

Nicolas Frisby

5:13 p.m.

Using the base libraries... how sneaky. :) On 10/20/06, David House wrote:

...

On 20/10/06, Henk-Jan van Tuyl wrote:

...
How about this: import List isIdentity (PL xs) = xs `isPrefixOf` [1..] ?

Nice! Short, lazy and says what it does (onomatopoeic code? :)).

-- -David House, dmhouse@gmail.com _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Tomasz Zielonka

21 Oct 21 Oct

8:30 a.m.

On Fri, Oct 20, 2006 at 06:08:26PM +0200, Henk-Jan van Tuyl wrote:

...

How about this: import List isIdentity (PL xs) = xs `isPrefixOf` [1..] ?

Great! This is so natural. When will I finally learn to continue thinking after finding the first solution, especially when I feel it's not ideal?! Do you also have this experience with Haskell?: when you feel that some code is not ideal, almost always it can be improved. It's much harder in some other languages, for example, in C++ there is always something wrong with the code ;-) Best regards Tomasz

David House

1:33 p.m.

On 21/10/06, Tomasz Zielonka wrote:

...

Do you also have this experience with Haskell?: when you feel that some code is not ideal, almost always it can be improved.

One of the recurring features of the #haskell IRC conversations is something called 'Algorithm Golf' (which is a misnomer and should really be 'Algorithm Tennis'): one person will request an algorithm and anyone interested sets about building their own. The results are then shared using lambdabot's Haskell evaluation feature and collaboratively improved. Aside from leading to efficient and natural-looking solutions to people's problems, the rounds are often pedagogical and great fun! Silly spin-offs are also common, leading to such wiki pages as http://haskell.org/haskellwiki/Compose. That particular example was mostly my doing and answers the question 'Can you build a function compose :: [a -> a] -> a -> a, such that a value will be fed into the top of the list and we'll get a result out of the bottom?'. Of course, the sane solution is foldl (flip (.)) id, but after noticing that the State monad permits a particularly elegant solution, execState . mapM modify, the task evolved into 'In how many other monads can we write this function?' The examples get more and more silly until we reach Cont when things disappear off the proverbial horizon into the land of 'How on Earth did Cale ever think of that?'. I'd recommend hanging out in the channel to anyone. :) -- -David House, dmhouse@gmail.com

dons＠cse.unsw.edu.au

1:40 p.m.

dmhouse:

...

On 21/10/06, Tomasz Zielonka wrote:

...
Do you also have this experience with Haskell?: when you feel that some code is not ideal, almost always it can be improved.

One of the recurring features of the #haskell IRC conversations is something called 'Algorithm Golf' (which is a misnomer and should really be 'Algorithm Tennis'): one person will request an algorithm and anyone interested sets about building their own. The results are then shared using lambdabot's Haskell evaluation feature and collaboratively improved.

I also like how when doing true 'golf', with @pl, we find new combinators: http://haskell.org/haskellwiki/Pointfree#Combinator_discoveries Like the owl: ((.)$(.))

...

I'd recommend hanging out in the channel to anyone. :)

I agree, if you're not on #haskell, you're missing out! http://haskell.org/haskellwiki/IRC_channel :) -- Don

Ross Paterson

2:05 p.m.

On Sat, Oct 21, 2006 at 02:33:12PM +0100, David House wrote:

...

Silly spin-offs are also common, leading to such wiki pages as http://haskell.org/haskellwiki/Compose. That particular example was mostly my doing and answers the question 'Can you build a function compose :: [a -> a] -> a -> a, such that a value will be fed into the top of the list and we'll get a result out of the bottom?'. Of course, the sane solution is foldl (flip (.)) id,

flip (foldl (flip id)) ? I usually use the reverse composition, flip (foldr id).

6840

Age (days ago)

6842

Last active (days ago)

List overview

Download

20 comments

12 participants

participants (12)

Brandon Moore
Cale Gibbard
David House
dons＠cse.unsw.edu.au
Henk-Jan van Tuyl
Henning Thielemann
Mikael Johansson
Nicolas Frisby
Nils Anders Danielsson
Robert Dockins
Ross Paterson
Tomasz Zielonka