Common subexpression elemination (CSE)

the following code does not run as fast as expected: modexp a e n = if e <= 0 then 1 else if even e then mod (modexp a (div e 2) n * modexp a (div e 2) n) n else mod (a * modexp a (e - 1) n) n it gets only fast if written as: modexp2 a e n = if e <= 0 then 1 else if even e then let d = modexp2 a (div e 2) n in mod (d * d) n else mod (a * modexp2 a (e - 1) n) n I wonder, why the common subexpression "modexp a (div e 2) n" is not eliminated in the first case. Does CSE work at all? For testing the exponent "e" should have at least 10 digits. Cheers Christian P.S. Other alternatives are slower, too modexp1 a e n = if e <= 0 then 1 else mod (a * modexp1 a (e - 1) n) n modexp3 a e n = mod (a ^ e) n

GHC doesn't normally do CSE. CSE can cause space leaks, so you can't do it willy-nilly. I'm sure there are some strict contexts where it could be done safely, but I don't think ghc uses that information (yet). -- Lennart On Nov 27, 2006, at 08:34 , Christian Maeder wrote:
the following code does not run as fast as expected:
modexp a e n = if e <= 0 then 1 else if even e then mod (modexp a (div e 2) n * modexp a (div e 2) n) n else mod (a * modexp a (e - 1) n) n
it gets only fast if written as:
modexp2 a e n = if e <= 0 then 1 else if even e then let d = modexp2 a (div e 2) n in mod (d * d) n else mod (a * modexp2 a (e - 1) n) n
I wonder, why the common subexpression "modexp a (div e 2) n" is not eliminated in the first case. Does CSE work at all?
For testing the exponent "e" should have at least 10 digits.
Cheers Christian
P.S. Other alternatives are slower, too
modexp1 a e n = if e <= 0 then 1 else mod (a * modexp1 a (e - 1) n) n
modexp3 a e n = mod (a ^ e) n _______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Lennart Augustsson schrieb:
GHC doesn't normally do CSE. CSE can cause space leaks, so you can't do it willy-nilly. I'm sure there are some strict contexts where it could be done safely, but I don't think ghc uses that information (yet).
Interestingly, it can only be switched off by -fno-cse, so I suppose there should be some cases of cse. C.
On Nov 27, 2006, at 08:34 , Christian Maeder wrote:
the following code does not run as fast as expected:
modexp a e n = if e <= 0 then 1 else if even e then mod (modexp a (div e 2) n * modexp a (div e 2) n) n else mod (a * modexp a (e - 1) n) n
it gets only fast if written as:
modexp2 a e n = if e <= 0 then 1 else if even e then let d = modexp2 a (div e 2) n in mod (d * d) n else mod (a * modexp2 a (e - 1) n) n

GHC does a very simple form of CSE. If it sees let x = e in ....e.... it replaces the inner 'e' by x. But that's all at the moment. Simon | -----Original Message----- | From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users-bounces@haskell.org] | On Behalf Of Lennart Augustsson | Sent: 27 November 2006 13:40 | To: Christian Maeder | Cc: GHC Users Mailing List | Subject: Re: Common subexpression elemination (CSE) | | GHC doesn't normally do CSE. CSE can cause space leaks, so you can't | do it willy-nilly. | I'm sure there are some strict contexts where it could be done | safely, but I don't think ghc uses that information (yet). | | -- Lennart | | On Nov 27, 2006, at 08:34 , Christian Maeder wrote: | | > the following code does not run as fast as expected: | > | > modexp a e n = if e <= 0 then 1 else | > if even e | > then mod (modexp a (div e 2) n * modexp a (div e 2) n) n | > else mod (a * modexp a (e - 1) n) n | > | > it gets only fast if written as: | > | > modexp2 a e n = if e <= 0 then 1 else | > if even e | > then let d = modexp2 a (div e 2) n in mod (d * d) n | > else mod (a * modexp2 a (e - 1) n) n | > | > I wonder, why the common subexpression "modexp a (div e 2) n" is not | > eliminated in the first case. Does CSE work at all? | > | > For testing the exponent "e" should have at least 10 digits. | > | > Cheers Christian | > | > P.S. Other alternatives are slower, too | > | > modexp1 a e n = if e <= 0 then 1 else | > mod (a * modexp1 a (e - 1) n) n | > | > modexp3 a e n = mod (a ^ e) n | > _______________________________________________ | > Glasgow-haskell-users mailing list | > Glasgow-haskell-users@haskell.org | > http://www.haskell.org/mailman/listinfo/glasgow-haskell-users | | _______________________________________________ | Glasgow-haskell-users mailing list | Glasgow-haskell-users@haskell.org | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

On 11/27/06, Lennart Augustsson
GHC doesn't normally do CSE. CSE can cause space leaks, so you can't do it willy-nilly. I'm sure there are some strict contexts where it could be done safely, but I don't think ghc uses that information (yet).
-- Lennart
My apologies in advance for asking possibly stupid questions, but I don't understand this. How exactly can CSE cause space leaks, and what does this have to do with strictness? Cheers, Dinko

Dinko Tenev wrote:
On 11/27/06, Lennart Augustsson
wrote: GHC doesn't normally do CSE. CSE can cause space leaks, so you can't do it willy-nilly. I'm sure there are some strict contexts where it could be done safely, but I don't think ghc uses that information (yet).
-- Lennart
My apologies in advance for asking possibly stupid questions, but I don't understand this.
How exactly can CSE cause space leaks, and what does this have to do with strictness?
Combining two expressions means that they're represented by the same memory location. In particular when you start evaluating the first, the second reference to the value will keep all of it alive even if parts of it could otherwise be freed. This is especially problematic for infinite lists. http://hackage.haskell.org/trac/ghc/ticket/947 demonstrates this problem, caused by the little CSE that ghc does. (Note: This is not of the form "let x = term in ... term ...", but it will be once it's desugared and the simplifier has floated out the constant expressions from the "primes0" and "primes" functions) I'm not sure how it relates to strictness. I'd be more worried about about the size of the data that's being kept alive. Numbers are more likely to be ok than lists. Bertram

On 11/28/06, Bertram Felgenhauer
Dinko Tenev wrote:
On 11/27/06, Lennart Augustsson
wrote: GHC doesn't normally do CSE. CSE can cause space leaks, so you can't do it willy-nilly. I'm sure there are some strict contexts where it could be done safely, but I don't think ghc uses that information (yet).
-- Lennart
My apologies in advance for asking possibly stupid questions, but I don't understand this.
How exactly can CSE cause space leaks, and what does this have to do with strictness?
Combining two expressions means that they're represented by the same memory location. In particular when you start evaluating the first, the second reference to the value will keep all of it alive even if parts of it could otherwise be freed. This is especially problematic for infinite lists.
http://hackage.haskell.org/trac/ghc/ticket/947
demonstrates this problem, caused by the little CSE that ghc does. (Note: This is not of the form "let x = term in ... term ...", but it will be once it's desugared and the simplifier has floated out the constant expressions from the "primes0" and "primes" functions)
I'm not sure how it relates to strictness. I'd be more worried about about the size of the data that's being kept alive. Numbers are more likely to be ok than lists.
Bertram
I see the point, though it still feels a bit weird to call this a leak (it also explains the relation to strictness for me, as the issue seems to be irrelevant in a strict context.) Thanks for the explanation. -- Cheers, Dinko

The relation to strictness is that you have much tighter control over when things are evaluated (typically) for something strict, so it is less likely to leak. Take the expression 'e + e' at some base type. It's harmless to CSE this to 'let x = e in x+x' because + is strict in x. Whereas '(e,e)' can't be CSE:ed to 'let x = e in (x,x)' without risking a space leak. It's not exactly strictness, it's more about knowing when a value is going to be consumed. Also, things that are "small" can be CSE:ed, since the evaulated form doesn't take more space than the unevaluated. -- Lennart On Nov 28, 2006, at 09:50 , Bertram Felgenhauer wrote:
Dinko Tenev wrote:
On 11/27/06, Lennart Augustsson
wrote: GHC doesn't normally do CSE. CSE can cause space leaks, so you can't do it willy-nilly. I'm sure there are some strict contexts where it could be done safely, but I don't think ghc uses that information (yet).
-- Lennart
My apologies in advance for asking possibly stupid questions, but I don't understand this.
How exactly can CSE cause space leaks, and what does this have to do with strictness?
Combining two expressions means that they're represented by the same memory location. In particular when you start evaluating the first, the second reference to the value will keep all of it alive even if parts of it could otherwise be freed. This is especially problematic for infinite lists.
http://hackage.haskell.org/trac/ghc/ticket/947
demonstrates this problem, caused by the little CSE that ghc does. (Note: This is not of the form "let x = term in ... term ...", but it will be once it's desugared and the simplifier has floated out the constant expressions from the "primes0" and "primes" functions)
I'm not sure how it relates to strictness. I'd be more worried about about the size of the data that's being kept alive. Numbers are more likely to be ok than lists.
Bertram _______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

On Nov 28, 2006, at 10:34 AM, Dinko Tenev wrote:
How exactly can CSE cause space leaks, and what does this have to do with strictness?
See this message for a nice example: http://www.haskell.org/pipermail/haskell-cafe/2003-June/004484.html / Ulf
participants (6)
-
Bertram Felgenhauer
-
Christian Maeder
-
Dinko Tenev
-
Lennart Augustsson
-
Simon Peyton-Jones
-
Ulf Norell