File I/O benchmark help (conduit, io-streams and Handle)

Hi all,

I'm turning to the community for some help understanding some benchmark results[1]. I was curious to see how the new io-streams would work with conduit, as it looks like a far saner low-level approach than Handles. In fact, the API is so simple that the entire wrapper is just a few lines of code[2].

I then added in some basic file copy benchmarks, comparing conduit+Handle (with ResourceT or bracket), conduit+io-streams, straight io-streams, and lazy I/O. All approaches fell into the same ballpark, with conduit+bracket and conduit+io-streams taking a slight lead. (I haven't analyzed that enough to know if it means anything, however.)

Then I decided to pull up the NoHandle code I wrote a while ago for conduit. This code was written initially for Windows only, to work around the fact that System.IO.openFile does some file locking. To avoid using Handles, I wrote a simple FFI wrapper exposing open, read, and close system calls, ported it to POSIX, and hid it behind a Cabal flag. Out of curiosity, I decided to expose it and include it in the benchmark.

The results are extreme. I've confirmed multiple times that the copy algorithm is in fact copying the file, so I don't think the test itself is cheating somehow. But I don't know how to explain the massive gap. I've run this on two different systems. The results you see linked are from my local machine. On an EC2 instance, the gap was a bit smaller, but the NoHandle code was still 75% faster than the others.

My initial guess is that I'm not properly tying into the IO manager, but I wanted to see if the community had any thoughts. The relevant pieces of code are [3][4][5].

Michael

[1] http://static.snoyman.com/streams.html
[2] https://github.com/snoyberg/conduit/blob/streams/io-streams-conduit/Data/Con...
[3] https://github.com/snoyberg/conduit/blob/streams/conduit/System/PosixFile.hs...
[4] https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary...
[5] https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary...

One clarification: it seems that sourceFile and sourceFileNoHandle have
virtually no difference in speed. The gap comes exclusively from sinkFile
vs sinkFileNoHandle. This makes me think that it might be a buffer copy
that's causing the slowdown, in which case the benchmark may in fact be
accurate.
On Mar 8, 2013 8:30 AM, "Michael Snoyman" wrote:
[...]

I would have expected sinkFileNoHandle to make the most difference, since
that's one location (write) where you've obviously removed a copy. Does
sourceFileNoHandle allocate less?
Incidentally, I've recently been making similar changes to IO code
(removing buffer copies) and getting similar speedups. Although the
results tend to be less pronounced in code that isn't strictly IO-bound.
On Fri, Mar 8, 2013 at 2:50 PM, Michael Snoyman wrote:
[...]
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

+Simon Marlow
A couple of comments:
- maybe we shouldn't back the file by a Handle. io-streams does this by
default out of the box; I had a posix file interface for unix (guarded by
CPP) for a while but decided to ditch it for simplicity. If your results
are correct, then given how slow going through a Handle seems to be, I may
revisit this; I had figured it would be "good enough".
- io-streams turns Handle buffering off in withFileAsOutput. So the
difference shouldn't be as a result of buffering. Simon: is this an
expected result? I presume you did some Handle debugging?
- the IO manager should not have any bearing here because file code
doesn't actually ever use it (epoll() doesn't work for files)
- does the difference persist when the file size gets bigger?
- your file descriptor code doesn't handle EINTR properly, although you
said you checked that the file copy is being done?
- Copying a 1MB file in 1ms gives a throughput of ~1GB/s. The other
methods have a more believable ~70MB/s throughput.
G
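
The EINTR point above can be illustrated with a minimal C sketch. This is not the code from the conduit branch; the names write_all and read_retry are hypothetical helpers invented for this example. The idea is only that a raw file-descriptor wrapper should retry a call that was interrupted by a signal and should keep going after a short write.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all `len` bytes of `buf` to `fd`.
   Retries when write() is interrupted by a signal (EINTR) and
   continues after short writes. Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before any data was written: retry */
            return -1;      /* real error; errno is left as set by write() */
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Same idea for the read side: retry read() only when it fails with EINTR.
   Otherwise returns what read() returned (0 at end of file, -1 on error). */
ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;
    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);
    return n;
}
```

A caller would use write_all wherever a bare write() was used before, and treat a return of -1 as a failed copy.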
On Fri, Mar 8, 2013 at 7:30 AM, Michael Snoyman wrote:
[...]
--
Gregory Collins

I'd like to point out that it's entirely possible to get good performance
out of a handle. The iteratee package has had both FD and Handle-based
IO for a while, and I've never observed any serious performance differences
between the two. Also, if I may be so bold, Michael's supercharged copy
speeds are on par with iteratee's performance using Handles:
http://www.tiresiaspress.us/io-benchmarks.html
So while there's definitely something interesting going on here, I think it
needs a bit more investigation before suggesting that Handles should be
avoided.
For comparison, on my system I get
$ time cp input.dat output.dat
real 0m0.004s
user 0m0.000s
sys 0m0.000s
so the throughput observed on the faster times is entirely reasonable.
John L.
On Fri, Mar 8, 2013 at 4:36 PM, Gregory Collins wrote:
[...]

On Fri, Mar 8, 2013 at 9:53 AM, Gregory Collins wrote:
> On Fri, Mar 8, 2013 at 9:48 AM, John Lato wrote:
>> For comparison, on my system I get
>> $ time cp input.dat output.dat
>> real 0m0.004s
>> user 0m0.000s
>> sys 0m0.000s
>
> Does your workstation have an SSD? Michael's using a spinning disk.

If you're only copying a GB or so, it should only be memory traffic.

Alexander

Something must be wrong with the conduit "NoHandle" code. I increased the filesize to 60MB and implemented the copy loop in pure C, the code and results are here:

https://gist.github.com/gregorycollins/5115491

Everything but the conduit NoHandle code runs in roughly 600-620ms, including the pure C version.

G

On Fri, Mar 8, 2013 at 10:13 AM, Alexander Kjeldaas <alexander.kjeldaas@gmail.com> wrote:
[...]
--
Gregory Collins

That demonstrated the issue: I'd forgotten to pass O_TRUNC to the open
system call. Adding that back makes the numbers much more comparable.
Thanks for the input everyone, and Gregory for finding the actual problem
(as well as pointing out a few other improvements).
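
To make the effect of the missing flag concrete, here is a minimal C sketch (not the code from the conduit branch; write_and_measure is a hypothetical helper invented for this example). Without O_TRUNC, open() keeps the existing contents and length of the destination file, and the writes overwrite bytes in place. So every benchmark run after the first was overwriting an already existing output file rather than recreating it, which presumably is why it did less work; the thread does not spell out the exact mechanism.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper for illustration: open `path` for writing, write
   `len` bytes from `buf`, close it, and return the resulting file size
   (or -1 on error). `truncate_first` selects whether O_TRUNC is passed. */
off_t write_and_measure(const char *path, const void *buf, size_t len,
                        int truncate_first)
{
    int flags = O_WRONLY | O_CREAT;
    if (truncate_first)
        flags |= O_TRUNC;             /* discard any existing contents */

    int fd = open(path, flags, 0644);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, buf, len);  /* short writes are not handled in this sketch */
    struct stat st;
    int ok = (n == (ssize_t)len) && (fstat(fd, &st) == 0);
    if (close(fd) != 0 || !ok)
        return -1;
    return st.st_size;
}
```

Writing 5 bytes over an existing 100-byte file leaves it at 100 bytes when truncate_first is 0, and at 5 bytes when it is 1.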
On Fri, Mar 8, 2013 at 12:13 PM, Gregory Collins wrote:
[...]

1GB/s for copying a file is reasonable - it's around half the memory bandwidth, so copying the data twice would give that result (assuming no actual I/O is taking place, which is what you want because actual I/O will swamp any differences at the software level).

The Handle overhead should be negligible if you're only using hGetBufSome and hPutBuf, because those functions basically just call read() and write() when the amount of data is larger than the buffer size.

There's clearly something suspicious going on here, unfortunately I don't have time right now to investigate, but I'll keep an eye on the thread.

Cheers,
Simon

On 08/03/13 08:36, Gregory Collins wrote:
[...]
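
The "copying the data twice" remark can be pictured with a plain read()/write() loop in C. This is only a sketch under assumptions: copy_file and the 64KB CHUNK size are made up for this example and are not taken from GHC's Handle implementation or from the gist linked earlier in the thread. Each chunk is copied once from the kernel into the user buffer by read() and once back into the kernel by write().

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

enum { CHUNK = 64 * 1024 };  /* arbitrary chunk size chosen for this sketch */

/* Hypothetical helper: copy `src` to `dst` through one user-space buffer.
   Returns the number of bytes copied, or -1 on error. */
long long copy_file(const char *src, const char *dst)
{
    char buf[CHUNK];

    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }

    long long total = 0;
    int failed = 0;
    for (;;) {
        ssize_t n = read(in, buf, sizeof buf);   /* copy 1: kernel -> buf */
        if (n < 0 && errno == EINTR)
            continue;                            /* interrupted: retry the read */
        if (n < 0) { failed = 1; break; }
        if (n == 0)
            break;                               /* end of file */

        ssize_t off = 0;
        while (off < n) {                        /* copy 2: buf -> kernel */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0 && errno == EINTR)
                continue;                        /* interrupted: retry the write */
            if (w < 0) { failed = 1; break; }
            off += w;                            /* short write: send the rest */
        }
        if (failed)
            break;
        total += n;
    }

    close(in);
    if (close(out) != 0)
        failed = 1;
    return failed ? -1 : total;
}
```

Note that the destination is opened with O_TRUNC here, so repeated runs always start from an empty output file.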

On Fri, Mar 8, 2013 at 6:36 PM, Simon Marlow wrote:
> There's clearly something suspicious going on here, unfortunately I don't have time right now to investigate, but I'll keep an eye on the thread.

Possibly disk caching/syncing issues? If some of the tests are able to either read entirely from cache (on the 1MB test), or don't completely sync after the write, they could happen much faster than others that have to actually hit the disk. For the 60MB test, it's almost guaranteed that actual IO would take place and dominate the timings.

John L.

Just to clarify: the problem was in fact with my code, I was not passing
O_TRUNC to the open system call. Gregory's C code showed me the problem.
Once I add in that option, all the different benchmarks complete in roughly
the same amount of time. So given that our Haskell implementations based on
Handle are just about as fast as a raw C implementation, I'd say Handle is
performing very well.
Apologies if I got anyone overly concerned.
On Fri, Mar 8, 2013 at 12:36 PM, Simon Marlow wrote:
[...]

participants (5)
- Alexander Kjeldaas
- Gregory Collins
- John Lato
- Michael Snoyman
- Simon Marlow