Lack of inlining -> slow parsing with Data.Binary

Hi,
I'm parsing Java classfiles with Data.Binary; the code is here: http://paste.org/index.php?id=4625
The problem is that the resulting code parses rt.jar from JDK6 (about 15K classes, 47 MB zipped) in 15 seconds (run the program with main -mclose rt.jar, for instance), which is 10 times slower than my Java version of the same code.
I compile the program with -O2. I tried -ddump-inlinings, and it turns out that my readByte/readWord16/readWord32 functions don't get inlined, despite being simply aliases for 'get :: Get WordXX'. So in places where my Java version does a pointer access (after being JIT-compiled), the Haskell version does two function calls.
What could be the reason for this lack of inlining? And how do I read the output of -ddump-inlinings?
-- Eugene Kirpichov
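For readers without access to the paste, here is a minimal sketch of the kind of alias the message describes. It is not the pasted code: the Header record, readHeader, and parseHeader are hypothetical names added for illustration, and the explicit INLINE pragmas are only one thing one might try (whether they help depends on the GHC and binary versions in use).

```haskell
import Data.Binary (get)
import Data.Binary.Get (Get, runGet)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8, Word16, Word32)

-- Thin aliases over the Binary instances, which read big-endian values
-- (the byte order used by Java classfiles).
readByte :: Get Word8
readByte = get
{-# INLINE readByte #-}

readWord16 :: Get Word16
readWord16 = get
{-# INLINE readWord16 #-}

readWord32 :: Get Word32
readWord32 = get
{-# INLINE readWord32 #-}

-- Hypothetical record for the first three classfile fields:
-- magic number, minor version, major version.
data Header = Header
  { magic        :: !Word32
  , minorVersion :: !Word16
  , majorVersion :: !Word16
  } deriving (Eq, Show)

-- Hypothetical parser built only from the aliases above.
readHeader :: Get Header
readHeader = do
  m     <- readWord32
  minor <- readWord16
  major <- readWord16
  return (Header m minor major)

-- Run the parser over a lazy ByteString.
parseHeader :: BL.ByteString -> Header
parseHeader = runGet readHeader
```

Compiling a module like this with -O2 -ddump-inlinings is one way to check whether the aliases disappear at their call sites in readHeader.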

Which version of binary did you use? There were similar problems a
while ago, but, IIRC, they're supposed to be fixed (apparently too
*many* INLINE pragmas were the problem).
2008/12/26 Eugene Kirpichov
[...]
-- Push the envelope. Watch it bend.

ekirpichov:
[...]
Which version of GHC and Data.Binary are you using? If using GHC 6.8.x, use the previous Data.Binary release. If using 6.10.x, use the latest.
-- Don

Thanks; I'm using GHC 6.10.1 and the latest binary now, and things get inlined perfectly well.
Anyway, the main bottleneck turned out to be the performance of zip-archive, which is now (since 1-2 days ago) ~25x faster. The Haskell version is now only about 2.5x slower than the Java one, and I'm quite satisfied with this result and with the process that led to it.
(Surprisingly, the bottleneck is now a conversion from a linked list to an STArray.)
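As an illustration of the list-to-array step mentioned above, here is a minimal sketch using the array package. It is an assumption about the general shape of such a conversion, not the jarfind code: listToArray and lookupAt are hypothetical names, and an unboxed STUArray is swapped in for the STArray the message refers to.

```haskell
import Data.Array.ST (newListArray, runSTUArray)
import Data.Array.Unboxed (UArray, bounds, (!))
import Data.Word (Word16)

-- Build an immutable unboxed array from a list whose length n is already
-- known (for instance from a count field read earlier by the parser).
-- Assumes n == length xs; passing n avoids a separate traversal to count.
listToArray :: Int -> [Word16] -> UArray Int Word16
listToArray n xs = runSTUArray (newListArray (0, n - 1) xs)

-- Bounds-checked lookup into the resulting array.
lookupAt :: UArray Int Word16 -> Int -> Maybe Word16
lookupAt arr i
  | i >= lo && i <= hi = Just (arr ! i)
  | otherwise          = Nothing
  where
    (lo, hi) = bounds arr
```

The array is filled inside ST and frozen by runSTUArray without copying, so the list is walked once.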
In case anyone is interested, here are the results of my hacking:
- http://hackage.haskell.org/cgi-bin/hackage-scripts/package/digest - bindings to crc32 and adler32 from zlib
- http://hackage.haskell.org/cgi-bin/hackage-scripts/package/zip-archive - updated version of zip-archive that uses digest and doesn't suffer from a crc32 bottleneck
- http://hackage.haskell.org/cgi-bin/hackage-scripts/package/jarfind - the very utility in question (the classfile searcher)
All in all, Haskell rocks :)
2009/1/6 Don Stewart
[...]
-- Eugene Kirpichov, Developer at Yandex.Market
Participants (3): Don Stewart, Eugene Kirpichov, Thomas Schilling