
Haskell is now ranked number 1 on the Great Language Shootout!
http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=all
Hooray :)
-- Don

On 2/1/06, Donald Bruce Stewart wrote:
Haskell is now ranked number 1 on the Great Language Shootout!
http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=all
Hooray :)
That is neat, mostly for dispelling the "pure lazy fp is inherently slow" argument. It's not likely to last, though: as soon as someone implements the three missing C programs we'll be bumped down again. But it's still pretty cool! /S -- Sebastian Sylvan +46(0)736-818655 UIN: 44640862

Sebastian Sylvan wrote:
Haskell is now ranked number 1 on the Great Language Shootout!
That is neat. Mostly for dispelling the "pure lazy fp is inherently slow" argument.
Ha! I don't think "pure lazy fp" would be the phrase I would choose to describe this code. An example (from fannkuch):

    t <- IO $ \s ->
        case readIntOffAddr# p1# 0# s of
          (# s, p1 #) -> case readIntOffAddr# p1# (n# -# 1#) s of
            (# s, pn #) -> (# s, not (p1 ==# 0# || pn ==# (n# -# 1#)) #)

Ben.

Ben.Lippmeier:
Sebastian Sylvan wrote:
Haskell is now ranked number 1 on the Great Language Shootout!
That is neat. Mostly for dispelling the "pure lazy fp is inherently slow" argument.
Ha! I don't think "pure lazy fp" would be the phrase I would choose to describe this code.
An example (from fannkuch):
    t <- IO $ \s ->
        case readIntOffAddr# p1# 0# s of
          (# s, p1 #) -> case readIntOffAddr# p1# (n# -# 1#) s of
            (# s, pn #) -> (# s, not (p1 ==# 0# || pn ==# (n# -# 1#)) #)
Ok, I'll bite ;) This is a very atypical "example", as fannkuch is the only entry written this way (and it could be rewritten, with careful attention to the Core). There are also many lovely pure, lazy entries. A great thing about pure, lazy Haskell is that you can write impure strict code if you have to. Cheers, Don
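For illustration, here is a sketch (not one of the shootout entries) of how that same first/last check could be written with ordinary Foreign.Storable calls instead of raw primops; the names p1 and n are assumptions standing in for the entry's permutation pointer and its length:

    import Foreign.Ptr (Ptr)
    import Foreign.Storable (peekElemOff)

    -- Hedged sketch: the same check as the primop version, written at
    -- the Foreign.Storable level. 'p1' points at the permutation and
    -- 'n' is its length; both names are assumed for illustration.
    firstLastMoved :: Ptr Int -> Int -> IO Bool
    firstLastMoved p1 n = do
      a <- peekElemOff p1 0          -- read element 0
      b <- peekElemOff p1 (n - 1)    -- read the last element
      return (not (a == 0 || b == n - 1))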

Donald Bruce Stewart wrote:
Ha! I don't think "pure lazy fp" would be the phrase I would choose to describe this code.
An example (from fannkuch):
    t <- IO $ \s ->
        case readIntOffAddr# p1# 0# s of
          (# s, p1 #) -> case readIntOffAddr# p1# (n# -# 1#) s of
            (# s, pn #) -> (# s, not (p1 ==# 0# || pn ==# (n# -# 1#)) #)
Ok, I'll bite ;) This is a very atypical "example", as fannkuch is the only entry written this way (and it could be rewritten, with careful attention to the Core). There are also many lovely pure, lazy entries.
Not so atypical. More examples (I didn't look /that/ hard.. :))

-----------------------------------
* From reverse-complement

    reverseit iubc strand i l s
     = if i >=# l
        then (# s, (I# i, I# l) #)
        else case readWord8OffAddr# strand i s           of { (# s, c #) ->
             case readWord8OffAddr# strand l s           of { (# s, x #) ->
             case readWord8OffAddr# iubc (word2Int# x) s of { (# s, y #) ->
             [snip]

* From k-nucleotide.

    eqmem i ptr1 ptr2 s
     = if i ==# 0#
        then (# s, True #)
        else case readInt8OffAddr# ptr1 0# s of { (# s, i8a #) ->
             case readInt8OffAddr# ptr2 0# s of { (# s, i8b #) ->
             if i8a ==# i8b
                then eqmem (i -# 1#) (plusAddr# ptr1 1#) (plusAddr# ptr2 1#) s
                else (# s, False #) } }

* From n-body.

    kineticE i
     = let i' = (.|. (shiftL i 3))
       in do m  <- mass i
             vx <- unsafeRead b (i' vx)
             vy <- unsafeRead b (i' vy)
             vz <- unsafeRead b (i' vz)
             return $! 0.5 * m * (vx*vx + vy*vy + vz*vz)

----------------------

*I* am certainly not implying that Haskell is anything less than the most wondrous language in the entire world. I'm saying that there's a stark difference in style between the programs submitted to the shootout and the ones I would show to people that I myself was trying to introduce to the wonders of purely lazy functional programming. :)

I think there's a big-fat-lesson about the tension between abstraction and implementation in these entries. On one hand we've got "This is what I want", on the other it's "What do I have to do to implement it".

Ben.

On Thu, Feb 02, 2006 at 02:01:28PM +1100, Ben Lippmeier wrote:
I think there's a big-fat-lesson about the tension between abstraction and implementation in these entries.
Though I think this is a great opportunity to improve GHC's optimizer: compare the Core output from the hand-optimized entries with the naturally written ones (that use the exact same algorithms, just not with all the crazy hand-unboxing action) and look for places where GHC is not optimizing something it could be. Or, at least, add some 'seq's until it generates the same Core and submit that program, as it is more idiomatic Haskell, if somewhat hand-optimized (but not so much as to use explicit unboxing). John -- John Meacham - ⑆repetae.net⑆john⑈
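As a minimal illustration of that suggestion (a sketch, not code from any entry): a little strictness is often enough for GHC to unbox an idiomatic accumulator loop.

    -- Hedged sketch: an idiomatic loop plus a single 'seq' on the
    -- accumulator, instead of hand-written Int# arithmetic. With
    -- optimisation on, GHC's strictness analysis should then unbox
    -- 'acc' in the generated Core.
    sumTo :: Int -> Int
    sumTo n = go 0 1
      where
        go acc i
          | i > n     = acc
          | otherwise = let acc' = acc + i
                        in acc' `seq` go acc' (i + 1)

    main :: IO ()
    main = print (sumTo 1000000)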

Ben.Lippmeier:
Donald Bruce Stewart wrote:
Ha! I don't think "pure lazy fp" would be the phrase I would choose to describe this code.
An example (from fannkuch):
    t <- IO $ \s ->
        case readIntOffAddr# p1# 0# s of
          (# s, p1 #) -> case readIntOffAddr# p1# (n# -# 1#) s of
            (# s, pn #) -> (# s, not (p1 ==# 0# || pn ==# (n# -# 1#)) #)
Ok, I'll bite ;) This is a very atypical "example", as fannkuch is the only entry written this way (and it could be rewritten, with careful attention to the Core). There are also many lovely pure, lazy entries.
Not so atypical.
More examples (I didn't look /that/ hard.. :))
-----------------------------------
* From reverse-complement
* From k-nucleotide.
* From n-body.
Yeah, these were the hard entries to improve, and we tended towards brute force. The problems were either IO-intensive or required mutable state by the spec, so we end up with tightly optimised inner loops, and this is how we do that in Haskell :) Still, the majority of the code is not written this way -- problems that don't do a lot of IO or need mutable arrays are written at a higher level. The plan is that most of these ugly entries disappear once packed strings are in the base library, which solves the IO issue. A good packed-string regex library would also be useful.
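As a rough sketch of what packed strings buy (assuming the API that later shipped as Data.ByteString, so not one of the current entries), an IO-heavy task can then stay free of pointer code:

    import qualified Data.ByteString.Char8 as B

    -- Hedged sketch: read stdin as one packed buffer and count a
    -- nucleotide. All the pointer-level work lives inside the library,
    -- not in the entry.
    main :: IO ()
    main = do
      s <- B.getContents
      print (B.count 'G' s)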
*I* am certainly not implying that Haskell is anything less than the most wondrous language in the entire world.
I'm saying that there's a stark difference in style between the programs submitted to the shootout, and the ones I would show to people that I myself was trying to introduce to the wonders of purely lazy functional programming. :).
Hehe, maybe I should write "Learning Haskell: The Shootout Way" :) But sometimes elegance and speed do coincide. You could show them:

* chameneos
* pidigits
* binary-trees
* cheap-concurrency
* partial-sums

rather than the nasty IO problems.
I think there's a big-fat-lesson about the tension between abstraction and implementation in these entries.
On one hand we've got "This is what I want", on the other it's "What do I have to do to implement it".
Fair enough. I think my point is that the examples you point to illustrate the main problem we've had: fast, heavy-duty IO and fast mutable unboxed arrays. We can expect improvements on both these issues in the next few months. -- Don

Everyone is welcome to write 'better' code for the shootout. It's a good way to learn useful techniques, and to share them. I will add notes about the benefits that were to be had from using this ugly code. Ben Lippmeier wrote:
Donald Bruce Stewart wrote:
Ha! I don't think "pure lazy fp" would be the phrase I would choose to describe this code.
An example (from fannkuch):
    t <- IO $ \s ->
        case readIntOffAddr# p1# 0# s of
          (# s, p1 #) -> case readIntOffAddr# p1# (n# -# 1#) s of
            (# s, pn #) -> (# s, not (p1 ==# 0# || pn ==# (n# -# 1#)) #)
Ok, I'll bite ;) This is a very atypical "example", as fannkuch is the only entry written this way (and it could be rewritten, with careful attention to the Core). There are also many lovely pure, lazy entries.
There are two Haskell entries for fannkuch. The "normal" one has this code:
    rotate 2  (x1:x2:xs) = x2:x1:xs
    rotate 3  (x1:x2:x3:xs) = x2:x3:x1:xs
    rotate 4  (x1:x2:x3:x4:xs) = x2:x3:x4:x1:xs
    rotate 5  (x1:x2:x3:x4:x5:xs) = x2:x3:x4:x5:x1:xs
    rotate 6  (x1:x2:x3:x4:x5:x6:xs) = x2:x3:x4:x5:x6:x1:xs
    rotate 7  (x1:x2:x3:x4:x5:x6:x7:xs) = x2:x3:x4:x5:x6:x7:x1:xs
    rotate 8  (x1:x2:x3:x4:x5:x6:x7:x8:xs) = x2:x3:x4:x5:x6:x7:x8:x1:xs
    rotate 9  (x1:x2:x3:x4:x5:x6:x7:x8:x9:xs) = x2:x3:x4:x5:x6:x7:x8:x9:x1:xs
    rotate 10 (x1:x2:x3:x4:x5:x6:x7:x8:x9:x10:xs) = x2:x3:x4:x5:x6:x7:x8:x9:x10:x1:xs
    rotate n (x:xs) = rotate' n xs
      where
        rotate' 1 xs     = x:xs
        rotate' n (x:xs) = x:rotate' (n-1) xs
Which does not scale particularly well. The "normal" entry is 9.0x slower than the leader, while the readIntOffAddr# entry is 3.1x slower. So the gain from explicit unboxing and state passing was substantial.
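A possible middle ground (a sketch for illustration, not a submitted entry) is the same rotate on a mutable unboxed array, which avoids both the list shuffling and the raw primops:

    import Data.Array.IO (IOUArray, readArray, writeArray)

    -- Hedged sketch: rotate the first n elements of a mutable unboxed,
    -- 0-indexed array left by one -- the same operation as the
    -- list-based 'rotate'. Assumes n >= 1 and n within bounds.
    rotateM :: IOUArray Int Int -> Int -> IO ()
    rotateM a n = do
      x0 <- readArray a 0
      let shift i
            | i >= n - 1 = writeArray a (n - 1) x0
            | otherwise  = do x <- readArray a (i + 1)
                              writeArray a i x
                              shift (i + 1)
      shift 0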
Not so atypical.
More examples (I didn't look /that/ hard.. :))
-----------------------------------
* From reverse-complement
    reverseit iubc strand i l s
     = if i >=# l
        then (# s, (I# i, I# l) #)
        else case readWord8OffAddr# strand i s           of { (# s, c #) ->
             case readWord8OffAddr# strand l s           of { (# s, x #) ->
             case readWord8OffAddr# iubc (word2Int# x) s of { (# s, y #) ->
             case readWord8OffAddr# iubc (word2Int# c) s of { (# s, z #) ->
             case writeWord8OffAddr# strand i y s of { s ->
             case writeWord8OffAddr# strand l z s of { s ->
             reverseit iubc strand (i +# 1#) (l -# 1#) s
             } } } } } }
Again, there is a "normal" entry, with a screen full of the definition of
    complement :: Base -> Base
    complement 'A' = 'T'
    complement 'a' = 'T'
    complement 'C' = 'G'
    complement 'c' = 'G'
    complement 'G' = 'C'
    ...
The "normal" entry was 32x slower than the leader and used 31x the memory. The optimized entry runs 6.3x slower and uses 1.1x the memory. The fast entry uses only the state & unboxing seen above, the rest of the program is normal Haskell, e.g. the one line that replaces a screen full:
pairs = map (c2w *** c2w) [('A','T'),('C','G'),('B','V'),('D','H'),('K','M'),('R','Y'),('\0','\0')]
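For the curious, here is a sketch (not the entry itself; c2w and the shortened pair list are just for illustration) of how such a pair list can become the lookup table that replaces the screenful of equations:

    import Data.Array.Unboxed (UArray, listArray, (//), (!))
    import Data.Char (ord)
    import Data.Word (Word8)

    -- Hedged sketch: a 256-entry unboxed table, identity by default,
    -- overwritten in both directions from a (shortened) pair list.
    c2w :: Char -> Word8
    c2w = fromIntegral . ord

    compTable :: UArray Word8 Word8
    compTable = listArray (0, 255) [0 .. 255]
                  // concat [ [(c2w x, c2w y), (c2w y, c2w x)]
                            | (x, y) <- [('A','T'), ('C','G')] ]

    complementW :: Word8 -> Word8
    complementW = (compTable !)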
* From k-nucleotide.
    eqmem i ptr1 ptr2 s
     = if i ==# 0#
        then (# s, True #)
        else case readInt8OffAddr# ptr1 0# s of { (# s, i8a #) ->
             case readInt8OffAddr# ptr2 0# s of { (# s, i8b #) ->
             if i8a ==# i8b
                then eqmem (i -# 1#) (plusAddr# ptr1 1#) (plusAddr# ptr2 1#) s
                else (# s, False #) } }
I wrote that, copying the technique from Don. It is what a fast packed-string library which compares ASCII bytes should do internally; in C it is exactly the 'memcmp' function. And yet the GHC 6.4.1 distribution does not have this function. This is an example of the library missing things that it should have. The "normal" Haskell entry was 26x slower and 29x bigger than the leader; the replacement code above brings that to 22x and 7.1x instead. Note that I did *not* get a big speed increase. This is because I must use the provided Data.HashTable, which is simply not competitive with other languages. This shootout benchmark has highlighted the deficiency.
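To make that concrete, this is roughly what such a library call looks like via the FFI (a sketch, not code that GHC ships):

    {-# LANGUAGE ForeignFunctionInterface #-}
    import Data.Word (Word8)
    import Foreign.C.Types (CInt, CSize)
    import Foreign.Ptr (Ptr)

    -- Hedged sketch: bind C's memcmp directly instead of the
    -- hand-rolled eqmem loop; a packed-string library would do this
    -- internally.
    foreign import ccall unsafe "string.h memcmp"
      c_memcmp :: Ptr Word8 -> Ptr Word8 -> CSize -> IO CInt

    -- Compare n bytes at two pointers for equality.
    eqMem :: Ptr Word8 -> Ptr Word8 -> Int -> IO Bool
    eqMem p q n = do
      r <- c_memcmp p q (fromIntegral n)
      return (r == 0)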
* From n-body.
    kineticE i
     = let i' = (.|. (shiftL i 3))
       in do m  <- mass i
             vx <- unsafeRead b (i' vx)
             vy <- unsafeRead b (i' vy)
             vz <- unsafeRead b (i' vz)
             return $! 0.5 * m * (vx*vx + vy*vy + vz*vz)
I wrote that out of desperation. Einar has a previous entry that is beautiful Haskell, creating an "open mutable records" system just for this benchmark. It ran in 18x time and 5.8x space. Many of us put entries on the wiki which were different ways to approach it (arrays of mutable data; mutable arrays of constant data). None were able to beat Einar's entry. Eventually Don and I wrote a flat ptr-to-array-of-doubles version, doing its own offset computation using (shiftL) and (.|.) as you see above. This runs in 6.6x time and 4.1x space. So without resorting to explicit state passing & unboxing we got nearly triple the speed, though we are still half the speed of OCaml here.
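Spelled out (a reconstruction for illustration; the slot layout below is an assumption, not the entry's exact one), the indexing trick is simply:

    import Data.Array.IO (IOUArray, readArray)
    import Data.Bits (shiftL, (.|.))

    -- Hedged sketch: one flat unboxed array of Doubles, eight slots per
    -- body; a field's index is (body `shiftL` 3) .|. offset. The slot
    -- numbers below are illustrative, not the entry's actual layout.
    mOff, vxOff, vyOff, vzOff :: Int
    mOff = 0; vxOff = 4; vyOff = 5; vzOff = 6

    kineticE' :: IOUArray Int Double -> Int -> IO Double
    kineticE' b i = do
      let slot f = (i `shiftL` 3) .|. f
      m  <- readArray b (slot mOff)
      vx <- readArray b (slot vxOff)
      vy <- readArray b (slot vyOff)
      vz <- readArray b (slot vzOff)
      return $! 0.5 * m * (vx*vx + vy*vy + vz*vz)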
----------------------
*I* am certainly not implying that Haskell is anything less than the most wondrous language in the entire world.
The neat thing is that, for speed, we do not drop out into C code or assembly. We just use different parts of Haskell. The mutable state for the n-body code is simple, using the IO features to act on a pointer to machine doubles.
I'm saying that there's a stark difference in style between the programs submitted to the shootout, and the ones I would show to people that I myself was trying to introduce to the wonders of purely lazy functional programming. :).
True. The ugly code above is not wondrous. I will assert, though, that the use of explicit IO-state passing and unboxed code is good for teaching. If you want to explain the IO monad, you will end up talking about the state of the "real world" and how nice it is that you don't have to pass it around so that pure functions can peek/poke at it. By showing someone the examples above, they get to read what their IO do-notation means once it has been de-sugared to a lower level than (>>=). Another way to look at it is that "readIntOffAddr# p1# 0# s" is making use of a nominally pure function, whereas peekPtr / pokePtr are impure.
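As a self-contained example of that de-sugaring (a sketch using GHC's representation of IO as a state-passing function; module names as in current GHC/base), the following reads two Ints behind a Ptr in exactly the style of the snippets above:

    {-# LANGUAGE MagicHash, UnboxedTuples #-}
    import GHC.Exts (Int (I#), Ptr (Ptr), readIntOffAddr#, (+#))
    import GHC.IO (IO (IO))
    import Foreign.Marshal.Array (withArray)

    -- Hedged sketch: what "a <- peek p; b <- peek p'; return (a + b)"
    -- looks like once IO's newtype (State# RealWorld ->
    -- (# State# RealWorld, a #)) is unwrapped and the state is threaded
    -- by hand, as in the shootout entries quoted above.
    sumFirstTwo :: Ptr Int -> IO Int
    sumFirstTwo (Ptr addr#) = IO $ \s0 ->
      case readIntOffAddr# addr# 0# s0 of
        (# s1, a# #) ->
          case readIntOffAddr# addr# 1# s1 of
            (# s2, b# #) -> (# s2, I# (a# +# b#) #)

    main :: IO ()
    main = withArray [3, 4 :: Int] (\p -> sumFirstTwo p >>= print)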
I think there's a big-fat-lesson about the tension between abstraction and implementation in these entries.
On one hand we've got "This is what I want", on the other it's "What do I have to do to implement it".
Ben.
You have the truth of it. Haskell wins the lines-of-code metric by a large margin thanks to the abstraction, but entries that run into big performance problems get parts rewritten in lower-level Haskell.

On Wednesday, 1 February 2006 08:22, Donald Bruce Stewart wrote:
Haskell is now ranked number 1 on the Great Language Shootout!
http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=all
Hooray :)
-- Don
It seems to be number 2 at the moment. Best wishes, Wolfgang

On 2/2/06, Wolfgang Jeltsch wrote:
On Wednesday, 1 February 2006 08:22, Donald Bruce Stewart wrote:
Haskell is now ranked number 1 on the Great Language Shootout!
http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=all
Hooray :)
-- Don
It seems to be number 2 at the moment.
It looks like it, all of a sudden, has one missing benchmark. Did something break? /S -- Sebastian Sylvan +46(0)736-818655 UIN: 44640862

--- Sebastian Sylvan wrote:
It seems to be number 2 at the moment.
It looks like it, all of a sudden, has one missing benchmark. Did something break?
Previously the GHC program was shown incorrectly as completing regex-dna within the timeout - now it's shown correctly.
participants (8):
- Ben Lippmeier
- Chris Kuklewicz
- dons@cse.unsw.edu.au
- Gour
- Isaac Gouy
- John Meacham
- Sebastian Sylvan
- Wolfgang Jeltsch