New subject: File I/O benchmark help (conduit, io-streams and Handle)

8 Mar 2013

Hi all,

I'm turning to the community for help understanding some benchmark
results[1]. I was curious to see how the new io-streams library would work
with conduit, since it looks like a far saner low-level approach than
Handles. In fact, the API is so simple that the entire wrapper is just a
few lines of code[2].

I then added in some basic file copy benchmarks, comparing conduit+Handle
(with ResourceT or bracket), conduit+io-streams, straight io-streams, and
lazy I/O. All approaches fell into the same ballpark, with conduit+bracket
and conduit+io-streams taking a slight lead. (I haven't analyzed that
enough to know if it means anything, however.)
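
For readers who want a concrete picture of the Handle and lazy I/O
baselines, here is a minimal sketch of a chunked file copy. It is not the
benchmark code linked above and it leaves conduit out entirely. The names
copyHandle and copyLazy, the file names, and the 4096-byte chunk size are
all assumptions made for illustration.

```haskell
import Control.Exception (bracket)
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import System.IO (IOMode (ReadMode, WriteMode), hClose, openBinaryFile)

-- Chunk size is an arbitrary choice for this sketch.
chunkSize :: Int
chunkSize = 4096

-- Copy a file through two Handles, using bracket so that both Handles
-- are closed even if an exception is thrown mid-copy.
copyHandle :: FilePath -> FilePath -> IO ()
copyHandle src dst =
    bracket (openBinaryFile src ReadMode) hClose $ \hIn ->
    bracket (openBinaryFile dst WriteMode) hClose $ \hOut ->
        let loop = do
                chunk <- S.hGetSome hIn chunkSize
                -- An empty chunk signals end of file.
                if S.null chunk
                    then return ()
                    else S.hPut hOut chunk >> loop
        in loop

-- The lazy I/O variant: the file is read on demand as it is written.
copyLazy :: FilePath -> FilePath -> IO ()
copyLazy src dst = L.readFile src >>= L.writeFile dst

main :: IO ()
main = do
    -- Hypothetical file names, created here so the sketch is self-contained.
    S.writeFile "copy-src.bin" (S.replicate 10000 97)
    copyHandle "copy-src.bin" "copy-dst-handle.bin"
    copyLazy "copy-src.bin" "copy-dst-lazy.bin"
    original <- S.readFile "copy-src.bin"
    viaHandle <- S.readFile "copy-dst-handle.bin"
    viaLazy <- S.readFile "copy-dst-lazy.bin"
    print (original == viaHandle && original == viaLazy)
```

Comparing the copies against the original, as main does here, is the same
kind of sanity check needed to rule out a benchmark that skips the work.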

Then I decided to pull up the NoHandle code I wrote a while ago for
conduit. This code was written initially for Windows only, to work around
the fact that System.IO.openFile does some file locking. To avoid using
Handles, I wrote a simple FFI wrapper exposing open, read, and close system
calls, ported it to POSIX, and hid it behind a Cabal flag. Out of
curiosity, I decided to expose it and include it in the benchmark.
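
To make the idea concrete, here is a minimal POSIX-only sketch of reading
a file through direct FFI bindings to open, read, and close, with no
Handle involved. It is an illustration under stated assumptions, not the
NoHandle code itself (see [3] for that). The names RawFd, openRead,
readChunk, closeRaw, and countBytes are hypothetical, the 4096-byte chunk
size is arbitrary, and the capi calling convention is used here only
because open is variadic in C.

```haskell
{-# LANGUAGE CApiFFI #-}
import Control.Exception (bracket)
import qualified Data.ByteString as S
import qualified Data.ByteString.Internal as SI
import Data.Word (Word8)
import Foreign.C.Error (throwErrnoIfMinus1Retry, throwErrnoIfMinus1_)
import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.Ptr (Ptr)
import System.Posix.Types (CSsize (..))

-- Direct bindings to the C library; nothing here goes through Handle.
foreign import capi "fcntl.h value O_RDONLY" oRDONLY :: CInt
foreign import capi "fcntl.h open" c_open :: CString -> CInt -> IO CInt
foreign import capi "unistd.h read" c_read :: CInt -> Ptr Word8 -> CSize -> IO CSsize
foreign import capi "unistd.h close" c_close :: CInt -> IO CInt

-- Hypothetical wrapper type around a raw file descriptor.
newtype RawFd = RawFd CInt

-- Chunk size is an arbitrary choice for this sketch.
chunkSize :: Int
chunkSize = 4096

-- Open a file read-only, throwing an IOError based on errno on failure.
openRead :: FilePath -> IO RawFd
openRead path = withCString path $ \cpath ->
    RawFd <$> throwErrnoIfMinus1Retry "openRead" (c_open cpath oRDONLY)

-- Read up to chunkSize bytes; an empty ByteString means end of file.
readChunk :: RawFd -> IO S.ByteString
readChunk (RawFd fd) =
    SI.createAndTrim chunkSize $ \buf ->
        fromIntegral <$>
            throwErrnoIfMinus1Retry "readChunk"
                (c_read fd buf (fromIntegral chunkSize))

closeRaw :: RawFd -> IO ()
closeRaw (RawFd fd) = throwErrnoIfMinus1_ "closeRaw" (c_close fd)

-- Read the whole file chunk by chunk and return the number of bytes seen.
countBytes :: FilePath -> IO Int
countBytes path = bracket (openRead path) closeRaw (loop 0)
  where
    loop total fd = do
        chunk <- readChunk fd
        if S.null chunk
            then return total
            else let total' = total + S.length chunk
                 in total' `seq` loop total' fd

main :: IO ()
main = do
    -- Hypothetical file name, created here so the sketch is self-contained.
    S.writeFile "nohandle-demo.bin" (S.replicate 10000 97)
    countBytes "nohandle-demo.bin" >>= print
```

Note that these are ordinary blocking foreign calls; the sketch makes no
attempt to cooperate with GHC's IO manager.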

The results are extreme: the NoHandle code comes out far ahead of every
other approach. I've confirmed multiple times that the copy algorithm is
in fact copying the file, so I don't think the test itself is cheating
somehow. But I don't know how to explain the massive gap. I've run this on
two different systems. The results you see linked are from my local
machine. On an EC2 instance, the gap was a bit smaller, but the NoHandle
code was still 75% faster than the others.

My initial guess is that I'm not properly tying into the IO manager, but I
wanted to see if the community had any thoughts. The relevant pieces of
code are [3][4][5].

Michael

[1] http://static.snoyman.com/streams.html
[2] https://github.com/snoyberg/conduit/blob/streams/io-streams-conduit/Data/Con...
[3] https://github.com/snoyberg/conduit/blob/streams/conduit/System/PosixFile.hs...
[4] https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary...
[5] https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary...
