
Krasimir Angelov wrote:
Well I actually did, almost. I added this function:
quotX :: Int -> Int -> Int a `quotX` b | b == 0 = error "divZeroError" | b == (-1) && a == minBound = error "overflowError" | otherwise = a `quotInt` b
It does the right thing. However to be sure that this doesn't interfere with some other GHC magic the real quot function have to be changed and tested. I haven't build GHC from source for 2-3 years now and I don't have the time to do it just to test whether this works.
It works, and it has the desired effect. It's not always a win though; the nofib prime sieve benchmark suffers. For a patch, see http://int-e.home.tlink.de/haskell/ghc-libraries-base-tune-division.patch Nofib results extract: ------------------------------------------------------------------------ Program Size Allocs Runtime Elapsed ------------------------------------------------------------------------ fish -0.7% -5.3% 0.05 0.04 primes -0.0% +28.5% +25.6% +25.5% wheel-sieve2 -0.0% -0.3% -17.9% -18.6% ------------------------------------------------------------------------ Min -0.9% -5.3% -17.9% -18.6% Max +0.1% +28.5% +25.6% +25.5% Geometric Mean -0.2% +0.2% -0.0% +0.2% 'primes' is an outlier - the other slowdowns are below 3% What happens in 'primes' is that 'mod' no longer gets inlined; apparently it now looks bigger to the compiler than before. Using -funfolding-use-threshold=10 brings the benchmark back to its original speed, despite the extra comparison before doing the division. In 'wheel-sieve2', the first different optimization choice I see is again a 'mod' that is no longer inlined; this leads to a change in other inlining choices that result in a speedup for reasons that I have not investigated. Bertram