
Purely as an experiment, I've written a function that uses ByteString to simply elemIndex it's way across a string here. Look for <, then look for
. Repeat until done.
https://github.com/chrisdone/xeno (under src/Xeno.hs) But if you scroll down the README to the 182kb file example, you see that hexml takes 33us and xeno takes 111us. That's surprising to me because I'm doing just a walk across a string and hexml is doing a full parse. It's written in C, but still, 3x faster AND doing allocations and more work. I tried replacing the ByteString with a raw Ptr Word8 and it didn't make a difference, actually increased time a little bit. My weigh results indicate that it's not doing any allocations during the process, at least nothing linear or above. So, I haven't looked at the core or asm yet, but I'm guessing it's simply doing more instructions and/or indirections than necessary. You can reproduce this with stack build --bench xeno Can anyone make an improvement to the speed? I already nerd sniped myself enough with this, so I'm spreading the bug elsewhere. I think it's a pretty good "raw" performance exercise and possibly something that could serve as a tutorial on haskell performance. Ciao!

Christopher Done wrote:
But if you scroll down the README to the 182kb file example, you see that hexml takes 33us and xeno takes 111us. That's surprising to me because I'm doing just a walk across a string and hexml is doing a full parse. It's written in C, but still, 3x faster AND doing allocations and more work.
I tried replacing the ByteString with a raw Ptr Word8 and it didn't make a difference, actually increased time a little bit.
The code you have written still looks like Haskell code. When I write Haskell code that needs to compete speedwise with C, it usually ends up looking like C as well. My suggestion is to drop `Data.ByteString.elemIndex` in favour of direct unsafe array accesses. If I find a bit of time over the next couple of days I might have a crack at this. Erik -- ---------------------------------------------------------------------- Erik de Castro Lopo http://www.mega-nerd.com/

On 23 December 2016 at 03:23, Christopher Done
But if you scroll down the README to the 182kb file example, you see that hexml takes 33us and xeno takes 111us. That's surprising to me because I'm doing just a walk across a string and hexml is doing a full parse. It's written in C, but still, 3x faster AND doing allocations and more work.
hexml being a full parser might fail, on the other hand your program unconditionally walks the bytestring. Are you sure hexml is actually completing and not aborting or short-circuiting because of a parse error or some other error? In all other data points xeno is taking much less time than hexml except this one. So I am suspecting it could be a problem with the input, making hexml fail silently. I see that the file used on this data point has japanese characters, maybe hexml is not able to handle those? -harendra

Oh, you're correct! It's unable to parse that file! The files is a test
suite file from the XML spec, I guess hexml is unable to parse this one.
I'll remove it from my benchmark suite in favor of something that does
parse.
Cheers!
On 23 December 2016 at 14:55, Harendra Kumar
On 23 December 2016 at 03:23, Christopher Done
wrote: But if you scroll down the README to the 182kb file example, you see that hexml takes 33us and xeno takes 111us. That's surprising to me because I'm doing just a walk across a string and hexml is doing a full parse. It's written in C, but still, 3x faster AND doing allocations and more work.
hexml being a full parser might fail, on the other hand your program unconditionally walks the bytestring. Are you sure hexml is actually completing and not aborting or short-circuiting because of a parse error or some other error? In all other data points xeno is taking much less time than hexml except this one. So I am suspecting it could be a problem with the input, making hexml fail silently. I see that the file used on this data point has japanese characters, maybe hexml is not able to handle those?
-harendra
participants (3)
-
Christopher Done
-
Erik de Castro Lopo
-
Harendra Kumar