Common subexpressions are not optimized to a single one?

Hi,

I'm wondering about GHC's optimizations. I thought that since functions in Haskell are pure, the compiler can (and does) factor out common subexpressions when it optimizes code. But the result of the following experiment looks negative: "g 10 100" is calculated twice. Am I missing something? If not, what prevents GHC from performing that optimization? Should I always factor out common subexpressions manually (using let or where)?

---
$ cat op.hs
module Main where

import System.IO.Unsafe

foreign import ccall "count" io_count :: IO Int

f :: Int -> IO Int
f x = do { y <- io_count; return $ x + y }

g :: Int -> Int -> Int
g x y = unsafePerformIO $ do { z <- f x; return $ z + y }

main = do print (g 10 100, g 10 100)
$
$ cat ffi_test_c.c
static int counter = 0;
int count() { return counter++; }
$
$ gcc -c ffi_test_c.c; ghc -O2 -ffi op.hs; ghc -o op_test op.hs ffi_test_c.o
$ ./op_test
(110,111)
---

I want to use some C functions from Haskell, each of which is impure on its own, although the result of their sequential combination is pure. I'm planning to write some functions like g above (but more complex, and actually pure), and I'm considering how code that uses them gets optimized.

Thanks in advance.

Koji Nakahara
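As an aside, the manual factoring asked about above would look like this (a sketch against op.hs; the let binding shares a single result, so count() fires only once):

    -- Sketch: bind g 10 100 once; the shared thunk is evaluated
    -- at most once, so the C counter is bumped only once.
    main :: IO ()
    main = do
        let r = g 10 100
        print (r, r)    -- prints (110,110) on a fresh counter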

G'day all.
Quoting Koji Nakahara:
I'm wondering about GHC's optimizations. I thought that since functions in Haskell are pure, the compiler can (and does) factor out common subexpressions when it optimizes code.
The short answer is "no". While the optimisation preserves semantics, it will not, in general, preserve pragmatics. A classic (and contrived) example is this:

    slow_pow2 :: Int -> Int
    slow_pow2 n = length (subsets [1..n])
      where
        subsets []     = [[]]
        subsets (x:xs) = subsets xs ++ [ x:xs' | xs' <- subsets xs ]

On my machine, slow_pow2 32 runs (very slowly; I didn't bother to let it finish). Optimising the common subexpression on the last line (i.e. the recursive call) exhausts the heap. In general, automatically optimising common subexpressions is only a good idea if the compiler can prove that the optimised expression has a bounded size, and even then it's not always a good idea.

Some libraries which use unsafePerformIO creatively actually rely on this behaviour. I've written code, for example, which does something like this:

    {-# NOINLINE fooRef1 #-}
    fooRef1 :: IORef Foo
    fooRef1 = unsafePerformIO (newIORef makeAFoo)

    {-# NOINLINE fooRef2 #-}
    fooRef2 :: IORef Foo
    fooRef2 = unsafePerformIO (newIORef makeAFoo)

    fooOp1 :: IO ()
    fooOp1 = modifyIORef fooRef1 something >> return ()

    fooOp2 :: IO ()
    fooOp2 = modifyIORef fooRef2 somethingElse >> return ()

This is perfectly safe (as long as the compiler respects NOINLINE!), but would break horribly if fooRef1 and fooRef2 were "optimised" into a single reference.

Cheers,
Andrew Bromage
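To see concretely what goes wrong, here is a sketch (a reconstruction, not from the original mail) of the hand-"optimised" subsets with the recursive call shared; the shared list must stay live across both uses, so the whole power set sits in the heap at once:

    slow_pow2 :: Int -> Int
    slow_pow2 n = length (subsets [1..n])
      where
        subsets :: [Int] -> [[Int]]
        subsets []     = [[]]
        -- ss is computed once but retained for the comprehension,
        -- so all 2^(length xs) subsets are live simultaneously.
        subsets (x:xs) = ss ++ [ x:xs' | xs' <- ss ]
          where ss = subsets xs

The unshared original recomputes subsets xs and so runs in exponential time, but the garbage collector can reclaim each sublist as length consumes it; sharing trades that time for exponential space.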

On Wed, 3 Dec 2003 03:29:14 +0900, Koji Nakahara wrote:
I want to use some C functions from Haskell, each of which is impure on its own, although the result of their sequential combination is pure. I'm planning to write some functions like g above (but more complex, and actually pure), and I'm considering how code that uses them gets optimized.
Then you should write it monadically and unsafePerformIO the result; you shouldn't rely on the compiler performing (or not performing) an optimization.
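A minimal sketch of this advice (step1 and step2 are hypothetical imports standing in for the poster's C functions): keep the impure steps inside one IO action and apply unsafePerformIO once, at the point where the combined result is pure.

    {-# LANGUAGE ForeignFunctionInterface #-}
    import System.IO.Unsafe (unsafePerformIO)

    -- Hypothetical C imports, impure in isolation (like io_count above).
    foreign import ccall "step1" step1 :: Int -> IO Int
    foreign import ccall "step2" step2 :: Int -> Int -> IO Int

    -- The monadic sequence fixes the order of the impure calls;
    -- unsafePerformIO is applied exactly once to the pure combination.
    compute :: Int -> Int -> Int
    compute x y = unsafePerformIO $ do
        a <- step1 x
        step2 a y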

My original mail might be misleading, but I didn't mean that I want to rely on the optimization to make the program work. I just thought that if I could expect such an optimization to be performed, I would not need to rewrite

    func (g x) (g x)    -- where g x = unsafePerformIO $ do {...}

as

    let h = g x in func h h

every time in order to make the program run faster.
But, in the first place, is it meaningful to write "let h = g x in func h h" in order to avoid re-evaluation of g x, on the assumption that h is a CAF?

Thanks.

--
Koji Nakahara
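Two points worth noting on that last question. A let-bound h is not a CAF (a CAF is a top-level binding that takes no arguments); the sharing comes from the let itself: both uses of h point at one thunk, which is evaluated at most once. That can be observed with Debug.Trace (a sketch, not from the thread):

    import Debug.Trace (trace)

    g :: Int -> Int
    g x = trace "g evaluated" (x * 2)

    main :: IO ()
    main = do
        let h = g 10
        print (h + h)        -- "g evaluated" prints once: h is shared
        print (g 10 + g 10)  -- may print twice: sharing of two syntactic
                             -- occurrences is not guaranteed (GHC's CSE
                             -- pass may or may not common them up)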
participants (3)
- ajb@spamcop.net
- Derek Elkins
- Koji Nakahara