More good performance results for ghc 6.8

Trying out some of the great language shootout programs with ghc 6.8 is producing nice results. For example, our "classic" cache-hammering, bitwise sieve benchmark is out of the box 10% faster with the new compiler. The (already rather good) benchmark is here (the same speed as the OCaml version under ghc 6.6): http://shootout.alioth.debian.org/gp4/benchmark.php?test=nsievebits&lang=all And timing with old and new ghc: ghc 6.6.1 Primes up to 81920000 4774995 Primes up to 40960000 2488465 Primes up to 20480000 1299069 ./A66 13 4.50s user 0.02s system 100% cpu 4.515 total ghc 6.8.1 Primes up to 81920000 4774995 Primes up to 40960000 2488465 Primes up to 20480000 1299069 ./A68 13 4.13s user 0.01s system 99% cpu 4.142 total Lovely work GHC HQ, when low level, highly tuned code like this gets magically faster! Once 6.8 is in Gentoo (or earlier...) we should see similar improvements across a range of shootout programs. -- Don

On Wednesday 14 November 2007 00:14, Don Stewart wrote:
Trying out some of the great language shootout programs with ghc 6.8 is producing nice results. For example, our "classic" cache-hammering, bitwise sieve benchmark is out of the box 10% faster with the new compiler. The (already rather good) benchmark is here (the same speed as the OCaml version under ghc 6.6):
http://shootout.alioth.debian.org/gp4/benchmark.php?test=nsievebits&lang=al l
And timing with old and new ghc:
ghc 6.6.1 Primes up to 81920000 4774995 Primes up to 40960000 2488465 Primes up to 20480000 1299069 ./A66 13 4.50s user 0.02s system 100% cpu 4.515 total
ghc 6.8.1 Primes up to 81920000 4774995 Primes up to 40960000 2488465 Primes up to 20480000 1299069 ./A68 13 4.13s user 0.01s system 99% cpu 4.142 total
Lovely work GHC HQ, when low level, highly tuned code like this gets magically faster! Once 6.8 is in Gentoo (or earlier...) we should see similar improvements across a range of shootout programs.
I get slightly different results on our 2.2GHz Athlon64 X2 machines but they are only up to 20% slower than C++ for OCaml and Haskell: F# 1.9.2.9: 2.125s on 32-bit Windows XP GHC 6.8: 1.414s on 64-bit Debian Lenny GHC 6.6: 1.387s on 64-bit Debian Lenny OCaml 3.10: 1.278s on 64-bit Debian Lenny g++ 4.2.3: 1.176s on 64-bit Debian Lenny Interestingly, GHC 6.8 does not improve over GHC 6.6 on this machine. Regardless, the performance in absolute terms is incredible (IMHO). Also, I took the liberty of optimizing the OCaml: open Printf open Bigarray let rec clear (a : (int, int8_unsigned_elt, c_layout) Bigarray.Array1.t) n i = let j = ref (2 * i) in while !j <= n do let ic = !j lsr 3 in let bit = a.{ic} land lnot(1 lsl (!j land 0x7)) in if a.{ic} <> bit then a.{ic} <- bit; j := !j + i done let nsieve n = let a = Array1.create int8_unsigned c_layout ((n lsr 3) + 2) in Array1.fill a 0xFF; let count = ref 0 in for i = 2 to n do if a.{i lsr 3} land (1 lsl (i land 0x7)) > 0 then begin incr count; if i*i <= n then clear a n i end done; !count let test n = printf "Primes up to %8i %8i\n%!" n (nsieve n) let () = match Sys.argv with | [|_; n|] -> let n = int_of_string n in List.iter (fun n -> if n>0 then test ((1 lsl n) * 10000)) [n; n-1; n-2] | _ -> printf "Usage: ./sieve <n>\n%!" -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/products/?e
participants (2)
-
Don Stewart
-
Jon Harrop