
On Sat, 2007-12-15 at 11:59 -0800, Don Stewart wrote:
firefly:
What do you think the relative speeds are of the six small Haskell programs at the end of this email?
All they do is read from stdin and count the number of spaces they see. There are two that use strict bytestrings, two that use lazy bytestrings, and two that use the standard Haskell strings. Three use a recursive function with an accumulator parameter and three use a foldl with a lambda function.
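
For concreteness, a minimal sketch of what the strict-ByteString pair might look like (the actual six programs were attached at the end of the original mail; the names and details below are illustrative only):

    {-# LANGUAGE BangPatterns #-}
    module Main (main) where

    import qualified Data.ByteString.Char8 as B

    -- foldl-style variant: strict left fold with a lambda
    countFold :: B.ByteString -> Int
    countFold = B.foldl' (\n c -> if c == ' ' then n + 1 else n) 0

    -- recursive variant with an explicit accumulator parameter
    countRec :: B.ByteString -> Int
    countRec = go 0
      where
        go !n s = case B.uncons s of
                    Nothing      -> n
                    Just (c, s') -> go (if c == ' ' then n + 1 else n) s'

    main :: IO ()
    main = B.getContents >>= print . countFold   -- or countRec

The lazy-ByteString and plain String versions would presumably differ mainly in the imports and I/O functions used, not in their overall structure.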
Say the fastest one takes time 1. How much time will the others take?
And how about memory? How much memory do you think they require? Let's say we feed a 150MB(*) file into each of them: how many megabytes do you think they end up using (as seen from the OS, not in terms of how big the live heap is)?
I'm going to post full benchmarks + analysis on Wednesday.
How are you compiling these programs, by the way? ghc-6.8.2 -O2 ? (-O2 is required for bytestrings :)
With -O2. I have measured with 6.8.1, 6.9.20071119, 6.9.20071208 (approx), and 6.9.200712xx (as of yesterday or today). The picture changes very little with the compiler, if it changes at all. I have run them on three very different microarchitectures (2GHz Athlon64 3000+, 1667MHz Core Duo, 600MHz Pentium III).

All the measurements are scripted. 'make phase1' compiles the benchmarks, creates input files of various sizes, and runs each benchmark once to gather information about memory use (peak RSS + the GHC RTS's own information about allocations and GCs), page faults, and an strace. This phase is not timing-sensitive, so I can browse the web and listen to music etc. while running it.

'make phase2' runs each benchmark a number of times and calculates the average time for each + the relative size of the standard deviation + how much user+sys differs from real (as reported by bash's built-in time command). A report with bar charts indicating relative time and relative peak RSS is generated in either pure ASCII or in UTF-8 (with fractional-width block chars so the charts look nice and have high resolution). If the measurements are deemed to be bad (too high a standard deviation, or user+sys doesn't add up to real) then the bar chart is drawn with '%' characters. The "quality indicators" for each timing test are always printed out next to each bar, so we know how close we are to being perfect or, conversely, how bad the measurements are.

There is a script that analyzes the I/O pattern and sums it up (as "4375 x read(3, 4096, ...) = 4096 followed by 1 x read(3, 4096, ...) = 1242 followed by 1 x read(3, 4096, ...) = 0" and similar). There is a set of simple I/O programs in C so we can compare ghc's performance with the "speed of light", and so different I/O strategies can be compared in a cleaner, purer form.

There are also 'make cache' (runs cachegrind), 'make hs/xxxx.doc' (creates a file with source code + core + stg + c-- + assembly + times for a given benchmark), etc. 'make sysinfo' creates a file with information about the Linux distribution used (/etc/lsb-release), kernel version (uname -a), CPU (/proc/cpuinfo), and compilers used (ghc --version, gcc --version). 'make zipdata' creates a zip file of about 20K with all the raw time measurements + the sysinfo.

I also have a set of scripts that installs each ghc version so 1) it is easy for me to repeat the tests on various machines and 2) you can see exactly which ghc versions I use and repeat the experiments yourself.

-Peter
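
PS: As a rough illustration of the kind of run-length summary the I/O-pattern script produces (the real script's language and implementation are not shown above, so this is only a sketch):

    module Main (main) where

    import Data.List (group, intercalate)

    -- Collapse runs of consecutive identical strace lines into
    -- "<count> x <line>" descriptions joined by "followed by",
    -- roughly like the "4375 x read(3, 4096, ...) = 4096" example above.
    summarize :: [String] -> String
    summarize = intercalate " followed by " . map describe . group
      where
        describe run = show (length run) ++ " x " ++ head run

    main :: IO ()
    main = interact ((++ "\n") . summarize . lines)

This sketch only groups lines that match exactly; the real script presumably normalizes the syscall arguments and return values before grouping.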