Re: [Haskell-cafe] Great language shootout: reloaded

11 Nov 2006

      Sebastian Sylvan wrote:
...
On 11/10/06, Henk-Jan van Tuyl <hjgtuyl@chello.nl> wrote:
...
On Fri, 10 Nov 2006 01:44:15 +0100, Donald Bruce Stewart
<dons@cse.unsw.edu.au> wrote:
...
So back in January we had lots of fun tuning up Haskell code for the
Great Language Shootout[1]. We did quite well at the time, at one point
ranking overall first[2]. [...]
Haskell suddenly dropped several places in the overall score when the
size measurement changed from line count to number of bytes after
gzipping. Maybe it's worth studying why this is; Haskell programs are
often much more compact than programs in other languages, but after
gzipping, other languages do much better. One reason I can think of is
that for very short programs, the import statements weigh heavily.
I think the main factor is that languages with large syntactic
redundancy get that compressed away. I.e., if you write:
MyVeryLongAndConvolutedClassName MyVeryLargeAndConvolutedObject = new
MyVeryLongAndConvolutedClassName( someOtherLongVariableName );
or something like that, it makes the code clumsy and difficult to
read, but it won't affect the gzipped byte count very much.
Their current way of measuring is pretty much pointless, since the
main thing the gzipping does is remove the impact of clunky syntax.
Measuring lines of code is certainly not perfect, but IMO it's a lot
more useful as a metric than gzipped bytes.
It may not be useful on its own, but it is not entirely meaningless. By 
using a lossless compression algorithm, you might infer some meaning 
about the code. Where it fails, though, is that if the algorithm were ideal 
(preferring low space at the expense of time), then the resulting bytes 
should be exactly the same. If they are not, then the samples did not do 
the exact same thing in the first place and so are not comparable! So, 
assuming gzip is ideal, having a larger compressed output would count 
as a win!
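
For what it's worth, the redundancy point quoted above is easy to poke at. Here is a rough sketch in Python (not anything the shootout itself runs) that compares raw and gzipped sizes for two made-up samples. The sample strings, the `sizes` helper and the repeat count of 20 are all assumptions picked for illustration, so treat the numbers as indicative only.

```python
import gzip


def sizes(source):
    """Return (raw byte count, gzipped byte count) for a source string."""
    raw = source.encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Hypothetical samples, invented for this sketch: 20 verbose declarations
# in the style of the quoted example, and 20 terse lines doing a similar job.
verbose = "\n".join(
    f"MyVeryLongAndConvolutedClassName obj{i} = "
    f"new MyVeryLongAndConvolutedClassName(someOtherLongVariableName{i});"
    for i in range(20)
)
terse = "\n".join(f"let o{i} = mk v{i}" for i in range(20))

for name, src in (("verbose", verbose), ("terse", terse)):
    raw, packed = sizes(src)
    print(f"{name}: {raw} bytes raw, {packed} bytes gzipped, "
          f"ratio {raw / packed:.1f}")
```

The verbose sample is several times larger in raw bytes, but most of that gap should disappear after gzipping, because the long repeated identifiers cost very little once they have been seen. Real shootout entries are not this repetitive, so the effect there would be weaker.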

It is not that the method is pointless, it is the extrapolation and 
interpretation of the results. You could argue that the gzipped output 
is just the same thing written in a new programming language - of 
course, it is not very readable (at least not to me since I do not have 
gunzip installed in my brain, but I do have a Haskell interpreter of 
some sort). Achieving minimum expressiveness at the source code level is 
entirely subjective and is based on an interpretation by the observer. 
Using gzip attempts to minimise this subjectivity - whether or not it is 
successful is not entirely decidable, but it is at least better. 
Unfortunately, the results have been misinterpreted.

Just smile and nod, I do :)

Tony Morris