
I'd like to point out that it's entirely possible to get good performance
out of a Handle. The iteratee package has had both FD- and Handle-based
IO for a while, and I've never observed any serious performance differences
between the two. Also, if I may be so bold, Michael's supercharged copy
speeds are on par with iteratee's performance using Handles:
http://www.tiresiaspress.us/io-benchmarks.html
So while there's definitely something interesting going on here, I think it
needs a bit more investigation before suggesting that Handles should be
avoided.
For comparison, on my system I get
$ time cp input.dat output.dat
real 0m0.004s
user 0m0.000s
sys 0m0.000s
so the throughput observed on the faster times is entirely reasonable.
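As a rough check of that arithmetic, here is a small sketch. It assumes the input is the 1 MB file mentioned in the quoted message below (the file size is not stated alongside the cp timing), and the helper name throughputMBps is made up for illustration.

```haskell
module Main (main) where

-- | Throughput in MB/s for copying 'sizeMB' megabytes in 'seconds' seconds.
-- (Hypothetical helper, only for this back-of-the-envelope check.)
throughputMBps :: Double -> Double -> Double
throughputMBps sizeMB seconds = sizeMB / seconds

main :: IO ()
main = do
  -- Assumption: a 1 MB input file, as in the quoted message.
  putStrLn ("cp, 4 ms wall clock:  " ++ show (throughputMBps 1 0.004) ++ " MB/s")
  putStrLn ("fast copy, 1 ms:      " ++ show (throughputMBps 1 0.001) ++ " MB/s")
```

Under that assumption cp itself works out to roughly 250 MB/s, and a 1 ms copy to roughly 1 GB/s, which is plausible when the file is served from the page cache. Note that a 4 ms reading from time is close to its resolution, so this is only an order-of-magnitude comparison.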
John L.
On Fri, Mar 8, 2013 at 4:36 PM, Gregory Collins wrote:
+Simon Marlow
A couple of comments:
- Maybe we shouldn't back the file by a Handle. io-streams does this by default out of the box; I had a POSIX file interface for Unix (guarded by CPP) for a while but decided to ditch it for simplicity. If your results are correct, then given how slow going through a Handle seems to be, I may revisit this; I had figured it would be "good enough".
- io-streams turns Handle buffering off in withFileAsOutput, so the difference shouldn't be a result of buffering. Simon: is this an expected result? I presume you did some Handle debugging?
- The IO manager should not have any bearing here, because file code never actually uses it (epoll() doesn't work for files).
- Does the difference persist when the file size gets bigger?
- Your file descriptor code doesn't handle EINTR properly, although you said you checked that the file copy is being done?
- Copying a 1MB file in 1ms gives a throughput of ~1GB/s. The other methods have a more believable ~70MB/s throughput.
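
On the EINTR point, a minimal sketch of what retrying an interrupted call through the FFI can look like. It assumes raw bindings to the POSIX read, write, pipe and close calls; the names readRetry, writeAll and roundTrip are hypothetical and are not taken from conduit's System.PosixFile. The retry itself is done by throwErrnoIfMinus1Retry from Foreign.C.Error, which re-runs the action when it fails with EINTR.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Data.Word (Word8)
import Foreign.C.Error (throwErrnoIfMinus1Retry, throwErrnoIfMinus1_)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (allocaArray, peekArray, pokeArray)
import Foreign.Ptr (Ptr, plusPtr)
import Foreign.Storable (peekElemOff)
import System.Posix.Types (CSsize (..))

-- Raw POSIX calls. Illustrative bindings only (an assumption of this sketch).
foreign import ccall safe "read"
  c_read :: CInt -> Ptr Word8 -> CSize -> IO CSsize
foreign import ccall safe "write"
  c_write :: CInt -> Ptr Word8 -> CSize -> IO CSsize
foreign import ccall unsafe "pipe"
  c_pipe :: Ptr CInt -> IO CInt
foreign import ccall unsafe "close"
  c_close :: CInt -> IO CInt

-- | Read up to n bytes, re-issuing the call whenever it fails with EINTR.
-- Other failures are raised as an IOError; a result of 0 means end of file.
readRetry :: CInt -> Ptr Word8 -> Int -> IO Int
readRetry fd buf n =
  fromIntegral <$> throwErrnoIfMinus1Retry "read" (c_read fd buf (fromIntegral n))

-- | Write exactly n bytes, retrying on EINTR and looping over short writes.
writeAll :: CInt -> Ptr Word8 -> Int -> IO ()
writeAll fd buf n
  | n <= 0 = return ()
  | otherwise = do
      k <- fromIntegral <$>
             throwErrnoIfMinus1Retry "write" (c_write fd buf (fromIntegral n))
      writeAll fd (buf `plusPtr` k) (n - k)

-- | Push a small payload through a pipe and read it back.
-- The payload must fit in the pipe buffer, otherwise writeAll would block.
roundTrip :: [Word8] -> IO [Word8]
roundTrip payload =
  allocaArray 2 $ \fds -> do
    throwErrnoIfMinus1_ "pipe" (c_pipe fds)
    rfd <- peekElemOff fds 0
    wfd <- peekElemOff fds 1
    let len = length payload
    allocaBytes (max 1 len) $ \buf -> do
      pokeArray buf payload
      writeAll wfd buf len
      _ <- c_close wfd
      -- A real copy loop would keep calling readRetry until it returns 0.
      n <- readRetry rfd buf len
      _ <- c_close rfd
      peekArray n buf

main :: IO ()
main = roundTrip [104, 101, 108, 108, 111] >>= print
```

The pipe is only there to make the sketch self-contained; in a file copy the same readRetry and writeAll would be applied to the source and destination descriptors.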
G
On Fri, Mar 8, 2013 at 7:30 AM, Michael Snoyman wrote:
Hi all,
I'm turning to the community for some help understanding some benchmark results[1]. I was curious to see how the new io-streams would work with conduit, as it looks like a far saner low-level approach than Handles. In fact, the API is so simple that the entire wrapper is just a few lines of code[2].
I then added in some basic file copy benchmarks, comparing conduit+Handle (with ResourceT or bracket), conduit+io-streams, straight io-streams, and lazy I/O. All approaches fell into the same ballpark, with conduit+bracket and conduit+io-streams taking a slight lead. (I haven't analyzed that enough to know if it means anything, however.)
Then I decided to pull up the NoHandle code I wrote a while ago for conduit. This code was written initially for Windows only, to work around the fact that System.IO.openFile does some file locking. To avoid using Handles, I wrote a simple FFI wrapper exposing open, read, and close system calls, ported it to POSIX, and hid it behind a Cabal flag. Out of curiosity, I decided to expose it and include it in the benchmark.
The results are extreme. I've confirmed multiple times that the copy algorithm is in fact copying the file, so I don't think the test itself is cheating somehow. But I don't know how to explain the massive gap. I've run this on two different systems. The results you see linked are from my local machine. On an EC2 instance, the gap was a bit smaller, but the NoHandle code was still 75% faster than the others.
My initial guess is that I'm not properly tying into the IO manager, but I wanted to see if the community had any thoughts. The relevant pieces of code are [3][4][5].
Michael
[1] http://static.snoyman.com/streams.html
[2] https://github.com/snoyberg/conduit/blob/streams/io-streams-conduit/Data/Con...
[3] https://github.com/snoyberg/conduit/blob/streams/conduit/System/PosixFile.hs...
[4] https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary...
[5] https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary...
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Gregory Collins