
Hi, I like to develop on Hugs, because it's a nice platform to work with, and provides WinHugs, auto-reloading, sub-second compilation etc. Unfortunately some of the newer libraries (ByteString/Binary in particular) have been optimised to within an inch of their lives on GHC, at the cost of being really, really slow on Hugs.

Take the example of Yhc Core files, which are stored in binary: using a very basic hPutChar sequence is miles faster (10x at least) than all the fancy ByteString/Binary trickery.

Take the example of nobench: Malcolm told me he reimplemented ByteString in terms of [Char] and gained a massive performance increase (6000x springs to mind, but that seems way too high to be true) using nhc.

Could we have a collective thought, and decide whether we wish to either kill off all compilers that don't start with a G, or could people at least do minimal benchmarking on Hugs? I'm not quite sure what the solution is, but it probably needs some discussion.

Thanks, Neil
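A minimal sketch of the plain hPutChar approach Neil describes for writing binary data (the file name and byte list are illustrative, not Yhc's actual format):

```haskell
import System.IO (withBinaryFile, hPutChar, IOMode (WriteMode))
import Data.Char (chr)
import Data.Word (Word8)

-- Write a list of bytes one at a time with hPutChar.
-- Nothing clever: no packed buffers, no fusion -- which is
-- exactly why a simple interpreter like Hugs runs it fast.
writeBytes :: FilePath -> [Word8] -> IO ()
writeBytes path ws =
  withBinaryFile path WriteMode $ \h ->
    mapM_ (hPutChar h . chr . fromIntegral) ws
```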

On Tue, 2007-05-01 at 20:37 +0100, Neil Mitchell wrote:
Hi,
I like to develop on Hugs, because it's a nice platform to work with, and provides WinHugs, auto-reloading, sub-second compilation etc. Unfortunately some of the newer libraries (ByteString/Binary in particular) have been optimised to within an inch of their lives on GHC, at the cost of being really, really slow on Hugs.
Take the example of Yhc Core files, which are stored in binary: using a very basic hPutChar sequence is miles faster (10x at least) than all the fancy ByteString/Binary trickery.
Take the example of nobench: Malcolm told me he reimplemented ByteString in terms of [Char] and gained a massive performance increase (6000x springs to mind, but that seems way too high to be true) using nhc.
That does not surprise me.
Could we have a collective thought, and decide whether we wish to either kill off all compilers that don't start with a G, or could people at least do minimal benchmarking on Hugs? I'm not quite sure what the solution is, but it probably needs some discussion.
I don't think doing minimal benchmarking on Hugs will help at all unless we are prepared to act on it, and I'm pretty sure anything we do to improve Hugs performance will be detrimental to the GHC performance. We're optimising for totally different sets of primitives. With GHC we're optimising for machine code and thinking about branch prediction and cache misses. We're also writing high-level combinators that are quite inefficient to execute directly, but we rely on inlining and rewrite rules to combine and then expand them into efficient low-level code.

With Hugs/yhc/nhc I assume the optimisation technique is simply to minimise the number of primitive reduction steps. This is really totally different. I don't see any obvious way of reconciling the two in a single implementation of an interface. Having totally different implementations of an interface for different Haskell systems is an option, though it has obvious disadvantages.

So I don't know what to do. We're not stopping our quest for high-performance idiomatic code because it doesn't play nicely with interpreters.

Duncan
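A toy illustration of the combinator style Duncan describes (the function and rule names are made up, not the real ByteString internals): under GHC -O the pragmas let two traversals collapse into one, while Hugs simply ignores both pragmas and executes the naive definition, paying for the intermediate list every time.

```haskell
module Fuse where

-- A toy fusible map. The phased INLINE keeps the definition
-- around long enough for the rewrite rule below to fire;
-- the rule merges back-to-back traversals into one pass.
mapToy :: (a -> b) -> [a] -> [b]
mapToy f = foldr (\x xs -> f x : xs) []
{-# INLINE [1] mapToy #-}

{-# RULES
"mapToy/mapToy" forall f g xs.
    mapToy f (mapToy g xs) = mapToy (f . g) xs
  #-}
```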

Hi
Could we have a collective thought, and decide whether we wish to either kill off all compilers that don't start with a G, or could people at least do minimal benchmarking on Hugs? I'm not quite sure what the solution is, but it probably needs some discussion.
I don't think doing minimal benchmarking on Hugs will help at all unless we are prepared to act on it, and I'm pretty sure anything we do to improve Hugs performance will be detrimental to the GHC performance.
#ifdef? Malcolm has a ByteString implementation that runs much faster under nhc, and I suspect would also run faster under Hugs. Why not have a big #ifdef around the difference?
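A sketch of the #ifdef arrangement Neil suggests (the __GLASGOW_HASKELL__ macro is real; the module name and exported subset are assumptions for illustration):

```haskell
{-# LANGUAGE CPP #-}

-- One interface, two implementations: the packed ByteString
-- under GHC, and a Malcolm-style [Char] fallback elsewhere.
module CompatByteString (ByteString, pack, unpack) where

#ifdef __GLASGOW_HASKELL__
import qualified Data.ByteString.Char8 as B

type ByteString = B.ByteString

pack :: String -> ByteString
pack = B.pack

unpack :: ByteString -> String
unpack = B.unpack
#else
-- Hugs/nhc/yhc: a ByteString is just a String.
type ByteString = String

pack :: String -> ByteString
pack = id

unpack :: ByteString -> String
unpack = id
#endif
```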
With Hugs/yhc/nhc I assume the optimisation technique is simply to minimise the number of primitive reduction steps. This is really totally different. I don't see any obvious way of reconciling the two in a single implementation of an interface. Having totally different implementations of an interface for different Haskell systems is an option, though it has obvious disadvantages.
I can see that two implementations are undesirable, but at the moment people have a choice: to write fast GHC and slow Hugs (ByteString), or to write slow GHC and fast Hugs (String). If we could make ByteString the "right answer" always, then I think it's a much nicer choice. For the particular case of ByteString, type ByteString = String means you roughly import Data.List - not that much additional work or maintenance.
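The "roughly import Data.List" point, sketched (a hypothetical module with an illustrative handful of functions; the real API is larger):

```haskell
-- [Char]-backed stand-ins: most of the ByteString API is
-- Data.List under another name, so the fallback is cheap
-- to write and to maintain.
module StringByteString where

import Prelude hiding (length, take, drop)
import qualified Data.List as L

type ByteString = String

length :: ByteString -> Int
length = L.length

take, drop :: Int -> ByteString -> ByteString
take = L.take
drop = L.drop

append :: ByteString -> ByteString -> ByteString
append = (++)
```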
So I don't know what to do. We're not stopping our quest for high-performance idiomatic code because it doesn't play nicely with interpreters.
Indeed, and you shouldn't! Your quest for nice idiomatic code has saved countless programmers from low-level IO prodding, and for that we salute you! However, if you could at least give a nod in the direction of Hugs, even if you get to 50% slower than before, it keeps Hugs at least usable with the new APIs. Thanks, Neil

Hello Neil, Wednesday, May 2, 2007, 2:48:16 PM, you wrote:
the "right answer" always, then I think it's a much nicer choice. For the particular case of ByteString, type ByteString = String means you roughly import Data.List - not that much additional work or maintenance.
Then the Binary library wants to access ByteString at a low level, imports Data.ByteString.Base, and discovers that all the great low-level functions defined there can't work with lists :) Btw, I had the same problem with my Streams/AltBinary lib: once I missed one INLINE pragma and got a 200x slower computation, even with "ghc -O2"! -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
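A toy illustration of the kind of small, hot serialisation helper where Bulat's missed-INLINE problem bites (the helper name and big-endian layout are assumptions, not AltBinary's code):

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word8, Word32)

-- Split a Word32 into big-endian bytes. Without the INLINE
-- pragma, GHC may not inline a helper like this at its call
-- sites, which blocks further optimisation of the tight
-- serialisation loop around it -- a large constant-factor
-- cost even under -O2.
word32be :: Word32 -> [Word8]
word32be w =
  [ fromIntegral (w `shiftR` 24)
  , fromIntegral ((w `shiftR` 16) .&. 0xff)
  , fromIntegral ((w `shiftR` 8) .&. 0xff)
  , fromIntegral (w .&. 0xff)
  ]
{-# INLINE word32be #-}
```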

ndmitchell:
Hi,
I like to develop on Hugs, because it's a nice platform to work with, and provides WinHugs, auto-reloading, sub-second compilation etc. Unfortunately some of the newer libraries (ByteString/Binary in particular) have been optimised to within an inch of their lives on GHC, at the cost of being really, really slow on Hugs.
Take the example of Yhc Core files, which are stored in binary: using a very basic hPutChar sequence is miles faster (10x at least) than all the fancy ByteString/Binary trickery.
Take the example of nobench: Malcolm told me he reimplemented ByteString in terms of [Char] and gained a massive performance increase (6000x springs to mind, but that seems way too high to be true) using nhc.
Could we have a collective thought, and decide whether we wish to either kill off all compilers that don't start with a G, or could people at least do minimal benchmarking on Hugs? I'm not quite sure what the solution is, but it probably needs some discussion.
I'm not sure how we can optimise for both interpreters and compilers. The Binary and ByteString stuff pays close attention to the hardware: cache misses, branch prediction. And there's no option to abandon high-performance compiled Haskell to help out the interpreters.

Interestingly, the techniques we use for, say, Data.ByteString seem to also produce very good results in MLton. So it really is a matter of optimising for compiled native code versus bytecode interpreters. I don't know if there's anything that can be done here.

-- Don
participants (4)
- Bulat Ziganshin
- dons@cse.unsw.edu.au
- Duncan Coutts
- Neil Mitchell