[0/16] SBM: Simple Bytestring Microbenchmarks, Overview and Introduction

Table of contents: 0/16 This email 1/16 The Haskell and C benchmarks 2/16 Inner loops of the hand-tweaked assembly benchmarks 3/16 The Makefile 4/16 How to use the Makefile (how to run benchmarks etc.) 5/16 Support scripts and scriptlets 6/16 6.9.20071124 Athlon Duron 7/16 6.9.20071208 Core Duo 8/16 6.9.20071119 Pentium III 9/16 6.9.20071119 Athlon64 10/16 Graphs for 6.9.x across four cpus 11/16 Graphs for hand-tweaked assembly benchmarks 12/16 Graphs for 7 ghc/bytestring combinations on a 2GHz Athlon64 300+ 13/16 Graphs that show the infidelity of -sstderr 14/16 Behind the measurements (rationale) 15/16 Predictions compared to the measurements 16/16 Discussion and Conclusion Simple Bytestring Microbenchmarks --------------------------------- Introduction ------------ I love parsers. I have been writing parsers for fun for over twenty years. The nicest way to construct a parser used to be to write a recursive descent parser by hand. If you had to work with people who'd had the misfortune of a university education, you would resort to lex and yacc (flex and bison), despite their many shortcomings. Combinator parsers is the only real improvement over hand-written recursive descent parsers that I know of. They do tend to require features that not all languages provide. I don't know how to write a good one in C, for example. They do work very well in Haskell, though. So, I've started writing a parser in Haskell (ghc, really) for the programming language X++. X++ is not a nice language but that's beside the point. The challenge for me is to write an efficient compiler + provide good analysis tools for X++. I think I stand a better chance of doing that in Haskell (ghc) than in practically any other language. There are a few drawbacks, though. I love speed. And efficiency. String handling in Haskell -------------------------- Native strings are simple and generally work well, but they are slow, take up too much memory and there's the whole encoding mess that still needs to be sorted out. People have worked on other string representations and libraries for quite a while. I think packedstrings (as used in Darcs) was one of the first ones. Bytestrings is the current incarnation. It seems to be just the right thing, especially when combined with improved automatic fusion in the compiler so higher-order functions don't have to be expensive. I use Parsec as my parser combinator library at the moment, which uses native strings. I would dearly love Parsec to be faster and use less memory. I think bytestrings will be part of any substantial improvement in Parsec's resource consumption. Other performance concerns -------------------------- File I/O is also interesting for a compiler writer. I would like to have a program that is as fast as possible both when the source files are already cached by the operating system and when they are not. The former situation is best handled with mmap() and the latter is best handled by read(), preferably in combination with multi-threading so the compiler doesn't have to waste too much time waiting for disk seeks. Haskell seems to be very close to ideal for me because it has very good threading support and very accessible raw access to the operating system. File I/O is not my current bottleneck, though. I'll probably take a closer look at file I/O when the other performance problems have been solved. Then there's the general quality of the generated code. Having read just about every paper on ghc that was available back in the late nineties (when I first looked at Haskell), I'd thought that the quality was good and that the compiler also had extremely good high-level optimizations, in other words, that abstraction was free. I also read the C-- papers and thought that it was a very interesting and promising approach. I'd expected the C-- path to have matured and be well- optimized by now. Unfortunately, the backend is /the/ weak spot in ghc. The frontend is heroic, the typesystems are (too) abundant and rich, the language itself is nice -- but the backend is not. Looking at the generated code I'd say that it is slightly better than Turbo Pascal 3.x and about on par with Turbo Pascal 4.0, a compiler that didn't use any intermediate code at all, compiled each statement in isolation, was single-pass, and had a compilation speed of about 27000 lines per minute on an 8 MHz IBM PC AT. Ecosystem and culture --------------------- Haskell has a very good ecosystem. Probably the second best one amongst the modern functional languages. Ten years ago, I'd thought that Standard ML would win but the only MLish language with a good ecosystem and culture is OCaml, which unfortunately isn't really Standard ML. By ecosystem I mean things like access to raw operating system calls, access to libraries written in other languages, readily available libraries for graphical user interfaces, databases, XML processing, network I/O. Parsing is nice, too, but practically all functional programming languages have that -- and not much else. The culture is also good. People actually use this stuff. They care about it. And they don't hang around waiting for somebody to tell them what to do, they start on their own. And they actually seem to be interested in good performance :) Hackage and cabal are very promising already and may be what finally makes ghc real-world useful for more people, because most people are not interested in working with raw source packages and fiddling with compiler flags and weird error messages. They don't like chasing dependencies, either. So, what's the problem? ----------------------- The major problem with Haskell (ghc) is that its performance (in terms of both speed and memory use) is unpredictable. The second-worst problem is that the actual performance is not good enough. These benchmarks ---------------- I have written a bunch of microbenchmarks that either count all the bytes in stdin or all the spaces in stdin. And some supporting benchmarks in C. And I've also handtweaked the assembly of one byte-counting and one space-counting microbenchmark to illustrate what difference it would make if the backend could use registers in a less stupid^W^W more efficient manner. Homepage + source code ---------------------- I have put up a homepage for the benchmarks at: http://vax64.dk/ghc-bs-tests The raw measurements are in tarballs on that page. The source code for the benchmarks (+ support code) is in a mercurial repository at: http://vax64.dyndns.org/repo/hg/ghc-bs-tests I used scripts to install the various versions of ghc and bytestring both to avoid operator error and so you could me look over my shoulder. The scripts are in a mercurial repository at: http://vax64.dyndns.org/repo/hg/ghc-installations You can either follow the link and download any version you like as a tarball or you can (preferably) clone the repositories with: hg http://vax64.dyndns.org/repo/hg/ghc-bs-tests hg http://vax64.dyndns.org/repo/hg/ghc-installations All my code in those repositories is GPLv2. The text file 'text.txt' in the ghc-bs-tests repository is unfortunately partly in Danish and partly in very terse English. Acknowledgements ---------------- Daniel Fischer, for running the benchmarks on his SuSE 8.2 Athlon Duron 1200 MHz machine and for being helpful and patient while I made the scripts work on a 2.4 kernel and with unhelpful versions of GNU Make and strace. Erik van der Meer, for letting me run the benchmarks (and install ghc) on his Core Duo laptop. And for discussions on measurements over the years. Don Stewart, for playing along and for fixing a bytestring problem. And for contributing three benchmarks (one of which I had to change a bit before it would compile). Duncan Coutts, for playing along. Jules Bean (quicksilver), for contributing a benchmark. Bertram Felgenhauer (int-e), for contributing three benchmarks (in the form of a single file, which I untangled to three files). Spencer Jannsen (sjannsen), for contributing a benchmark. William Lee Irwin III (wli), for inspiring me to add the getwchar benchmarks. -Peter
participants (1)
-
Peter Firefly Brodersen Lund