Re: [Haskell-cafe] how to optmize this code?

31 Mar 2011

      On Thursday 31 March 2011 14:27:59, Yves Parès wrote:
...
Just to be sure, because I am not quite familiar with the dark hairy
internals of GHC:
...
Of course, given a type signature that allows strictness to be
inferred.
You mean a signature with no type variables and types that are know to
GHC as being strict?
(Like Int -> Int -> Int instead of (Num a) => a -> a -> a)
Yes. For a type class polymorphic function like (+), it is of course 
impossible to infer strictness, since there can be strict as well as lazy 
instances.
For monomorphic types, it may be possible to infer strictness (the 
implementation can be too complicated for the strictness analyser to 
discover that "yes, this function is strict, you may as well evaluate 
things immediately).

One important thing for (in particular type class) polymorphic functions is 
to generate specialised versions for frequently used types to let GHC take 
advantage of their properties, so it's generally a good idea to

{-# SPECIALISE foo :: Int -> Int -> Int,
                      Integer -> Integer -> Integer,
                      Double -> Double -> Double
  #-}

if strictness helps in foo (and compile the defining module as well as the 
using modules with optimisations) [to reduce code bloat, specialise only 
for the types you really use/expect to be used much].

For things like arithmetic operations on Int or Integer, strictness is 
known, so you get immediate evaluation (with optimisations) as soon as the 
analyser sees "if the result of some function is ever needed, it needs to 
evaluate this arithmetic expression".

In foldl (+) 0 :: [Int] -> Int, that means, if the function is entered at 
all, you get a nice strict loop adding things on the fly and a wrapper 
providing the outermost laziness, guarding the entrance.
...
...
The difference is that the explicit recursion produces the better code
even
...
with optimisations turned off, except that the overload of (+) to use
is not inlined, so the accumulator still builds a thunk, while with
optimisations you get the specialised strict additions (+# resp.
plusInteger, ...) so you have the strictness you need.
(+#) is then the GHC's strict equivalent of (+)?
(+#) is addition of unboxed Ints.

In GHC, we have

data Int = I# Int#

and Int# is a raw machine integer (native word sized). On Int#, we have the 
primitive operations (+#), (-#), (*#), negateInt#, (==#) and a couple more, 
which translate directly to the machine instructions (at least, that's the 
intention).

When you have an Int-calculation, if it's determined to be strict, GHC 
unboxes things as far as possible and carries out the calculation on the 
unboxed Int#s, wrapping the result in a I# when it's done.

So, (+#) is a little better than just a strict addition of Ints, which 
would wrap all intermediate results again in the constructor I#, only to 
immediately unbox them for the next step.

Analogous for

data Word = W# Word#

(plusWord#, minusWord#, eqWord# ...)

data Double = D# Double#

((+##), (-##), (*##), (**##), (==##), ...)

data Float = F# Float#

(plusFloat#, ...)

Most of the time, you need not worry about that, GHC's strictness analyser 
is pretty good, sometimes you need to help it with a few bang patterns or 
seq's, check the generated core (-ddump-simpl), lots of #'s and 'case's are 
good, 'let's and boxed Ints (Words, ...) are generally less desirable [in 
loops and such].
Only rarely you need to directly use the raw types and primops.
...
But if you make an overlay to (+), like, say:
(?) :: (Num a) => a -> a -> a
a ? b = a + b
Then (?) will be lazy, won't it?
Yes, generally, but
...
Then optimizations will not occur, a ? b will remain a thunk and not be
replaced by a +# b and be strictly evaluated?
Well, it's very small, so it will be inlined and you might as well directly 
write (+).
If it's used at the appropriate types, it will be replaced with (+#), 
plusWord# or whatever if (+) will be.
Add a {-# NOINLINE (?) #-} pragma or have it large enough to not be inlined 
(or recursive) and you shut out the strictness analyser (except you invite 
it in with {-# SPECIALISE #-} pragmas or so).
...
If so, then it means that you can always turn a strict function into a
non strict one, am I right?
Err, terminology problem here.
Strictly speaking, a function is strict iff

f _|_ = _|_

while we are talking here about evaluation strategies, so we should better 
have spoken of eager vs. deferred evaluation.
A non-strict function has different semantics from a strict one by 
definition.

If you have a strict function, you may evaluate its argument eagerly 
without changing the result¹, while eager evaluation of a non-strict 
function's argument may produce _|_ where deferred evaluation wouldn't.

By default, everything in Haskell is deferredly evaluated, but the 
strictness analyser may find that it's okay to evaluate some things eagerly 
(or the programmer indicates that eager evaluation is desired with a seq or 
bang pattern). Then the compiler rewrites the function.

So it's about functions that are rewritten by the compiler into functions 
eagerly evaluating their arguments.
One can always [unless I'm mistaken] prevent (or force) that rewrite, that 
can be simple or involve jumping through a lot of hoops.
In the above example, you have to make sure that the inliner doesn't kick 
in to defeat the intent(?) of deferring evaluation of Int arguments.

[¹] Actually, eager evaluation of a strict function's argument may produce 
results where deferred evaluation doesn't, cf. foldl vs. foldl'.
However, that's because of coincidental limits like stack/heap/RAM size, 
not fundamental. Given enough of those, the deferred evaluation would 
produce the same result.

Re: [Haskell-cafe] how to optmize this code?

Daniel Fischer