
I was wittering on about stream fusion and how great it is, and I got a message from Mr C++. (Mr C++ develops commercial games, and is obsessed with performance. For him, the only way to achieve the best performance is to have total control over every minute detail of the implementation. He sees Haskell as a stupid language that can never be fast. It seems he's not alone...) He said two things. The first was "it really perplexes me that Haskell insists that everything must be read-only". Naturally, I have a good answer to that one. (It's like being "perplexed" that relational databases refuse to keep table data in the same order. It's not to be awkward, it is a fundamental and important property of the underlying theoretical foundation - the relational algebra, et al.)

He also said what basically boils down to "being able to swap elements around in O(1) time and O(0) space is the whole thing that makes linked lists useful in the first place; take that away and it's rather pointless". I don't really have an answer to that one. (Laziness and GC will probably kill most of the space cost. There's still a time cost - particularly the extra GC time...)

I've asked this before and nobody answered, so I take it that nobody knows the answer... Does GHC *ever* do an in-place update on anything? Does the STG even have a command for that? I always assumed that the compiler tried to find instances where a new structure is created and the old one is no longer referenced, and make that be in-place. But now I'm not so sure...
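For reference, O(1) in-place swaps are available in Haskell when you ask for mutability explicitly. A minimal sketch using a mutable unboxed array in the ST monad (plain Data.Array.ST, nothing exotic):

import Control.Monad.ST (ST, runST)
import Data.Array.ST (STUArray, getElems, newListArray, readArray, writeArray)

-- Swap two elements of a mutable unboxed array in place:
-- O(1) time, no fresh allocation per swap.
swap :: STUArray s Int Int -> Int -> Int -> ST s ()
swap arr i j = do
  a <- readArray arr i
  b <- readArray arr j
  writeArray arr i b
  writeArray arr j a

demo :: [Int]
demo = runST (do
  arr <- newListArray (0, 4) [10, 20, 30, 40, 50]
  swap arr 0 4
  getElems arr)         -- [50,20,30,40,10]

The mutation is safely encapsulated: runST guarantees the array cannot escape, so demo is still an ordinary pure value.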

Hello Andrew, Sunday, July 8, 2007, 9:40:15 PM, you wrote:
I've asked this before and nobody answered, so I take it that nobody knows the answer... Does GHC *ever* do an in-place update on anything?
no. this will break GC's heart :) and it really breaks it as soon as you start to work with updateable arrays. look for details at http://haskell.org/haskellwiki/Modern_array_libraries
Does the STG even have a command for that?
hm. there are ghc primitives that work with mutable arrays. look for primops.txt.pp in the ghc sources. btw, you don't need to use unix in order to play with ghc HEAD - you can download a compiled windows binary -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Bulat Ziganshin wrote:
Hello Andrew,
Sunday, July 8, 2007, 9:40:15 PM, you wrote:
I've asked this before and nobody answered, so I take it that nobody knows the answer... Does GHC *ever* do an in-place update on anything?
no. this will break GC's heart :)
Yeah, having only immutable objects to collect must simplify a whole bunch of stuff...
and it really breaks it as soon as you start to work with updateable arrays. look for details at http://haskell.org/haskellwiki/Modern_array_libraries
"GHC 6.6 (currently in beta testing) will add..." Oh, that's cute. ;-)
Does the STG even have a command for that?
hm. there are ghc primitives that work with mutable arrays. look for primops.txt.pp in ghc sources.
The GHC sources. Scary place. ;-) I did think about compiling GHC myself once. But then I found out that it's not actually written in Haskell - it's written in Haskell + C + asm + Perl (?!) and it's utterly *huge*...
btw, you don't need to use unix in order to play with ghc HEAD - you can download a compiled windows binary
Seriously? I'm pretty sure I tried to do that and couldn't...

On Sun, Jul 08, 2007 at 07:52:19PM +0100, Andrew Coppin wrote:
Bulat Ziganshin wrote:
Hello Andrew, Sunday, July 8, 2007, 9:40:15 PM, you wrote:
I've asked this before and nobody answered, so I take it that nobody knows the answer... Does GHC *ever* do an in-place update on anything?
no. this will break GC's heart :)
Yeah, having only immutable objects to collect must simplify a whole bunch of stuff...
Yes, GHC does in-place updates on almost everything. But probably not in the way you wanted.

The crucial feature of laziness, that distinguishes it from call-by-name as found in old ALGOL, is that when you finish evaluating an expression you overwrite the expression with its value. This guarantees that expressions are only evaluated once, and is crucial for the performance of idioms such as circular programming and array-memoization.

This does mean that GHC needs a *very* simple write barrier in the GC - it just tacks the current node onto a linked list. No overflow checks, no occurrence checks, just a couple of fast instructions. But this does mean that everything overwritable (with pointers) (MutVar#, MutArray#, SynchVar#, TVar#, indirections) needs an extra link pointer.
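The update Stefan describes is easy to watch from ordinary code. A small sketch using Debug.Trace to make the single evaluation visible:

import Debug.Trace (trace)

main :: IO ()
main = do
  let x = trace "evaluating!" (sum [1 .. 1000000 :: Integer])
  print (x + x)  -- "evaluating!" is printed once, not twice:
                 -- after the first demand, the thunk behind x has
                 -- been overwritten in place with its value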
Does the STG even have a command for that?
hm. there are ghc primitives that work with mutable arrays. look for primops.txt.pp in ghc sources.
The GHC sources. Scary place. ;-)
You could also look at the HTMLized primop documentation: http://haskell.org/ghc/dist/current/docs/libraries/base/GHC-Prim.html writeArray#, write<type>Array#, write<type>OffAddr#, writeMutVar#, writeTVar#, takeMVar#, tryTakeMVar#, putMVar#, tryPutMVar# do in-place writes.
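Those primops are what the everyday library types sit on top of - IORef wraps MutVar#/writeMutVar#, MVar wraps takeMVar#/putMVar#, and so on. A small sketch using the safe wrappers:

import Control.Concurrent.MVar (newMVar, putMVar, takeMVar)
import Data.IORef (modifyIORef, newIORef, readIORef, writeIORef)

main :: IO ()
main = do
  -- IORef: the safe face of MutVar# / writeMutVar#
  r <- newIORef (0 :: Int)
  writeIORef r 42            -- an in-place write to a heap cell
  modifyIORef r (+ 1)
  readIORef r >>= print      -- 43

  -- MVar: the safe face of takeMVar# / putMVar#
  m <- newMVar "hello"
  s <- takeMVar m
  putMVar m (s ++ ", world")
  takeMVar m >>= putStrLn    -- "hello, world"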
I did think about compiling GHC myself once. But then I found out that it's not actually written in Haskell - it's written in Haskell + C + asm + Perl (?!) and it's utterly *huge*...
Perl may die soon - GHC HQ is considering retiring the registerised -fvia-C path, leaving only native code for normal use and ANSI C for porting. This is because mangled C is only about 3% faster according to the GHC benchmark suite, and carries a high maintenance cost. Stefan

Stefan O'Rear wrote:
Yes, GHC does in-place updates on almost everything. But probably not in the way you wanted.
Ah yes, true... (Try implementing that in Java. Tricky...)
I did think about compiling GHC myself once. But then I found out that it's not actually written in Haskell - it's written in Haskell + C + asm + Perl (?!) and it's utterly *huge*...
Perl may die soon
Yay! Oh wait, you meant Perl in GHC? :-}
GHC HQ is considering retiring the registerised -fvia-C path, leaving only native code for normal use and ANSI C for porting. This is because mangled C is only about 3% faster according to the GHC benchmark suite, and carries a high maintenance cost.
It would mean that on Windoze, you don't have to supply gcc and an emulator for running it... (Are you keeping an external assembler and linker, or doing the whole shooting match in GHC?)

On Sun, Jul 08, 2007 at 09:35:10PM +0100, Andrew Coppin wrote:
GHC HQ is considering retiring the registerised -fvia-C path, leaving only native code for normal use and ANSI C for porting. This is because mangled C is only about 3% faster according to the GHC benchmark suite, and carries a high maintenance cost.
It would mean that on Windoze, you don't have to supply gcc and an emulator for running it...
That was also part of the justification.
(Are you keeping an external assembler and linker, or doing the whole shooting match in GHC?)
Currently, we use gas and ld, yes... Stefan

Hi
btw, you don't need to use unix in order to play with ghc HEAD - you can download a compiled windows binary

Seriously? I'm pretty sure I tried to do that and couldn't...
Seriously. Thanks to Igloo, you can even download a GHC nightly complete with an installer! It doesn't get any easier than that :) http://www.haskell.org/ghc/dist/current/dist/ghc-6.7.20070706-i386-unknown-m...
I did think about compiling GHC myself once. But then I found out that it's not actually written in Haskell - it's written in Haskell + C + asm + Perl (?!) and it's utterly *huge*...
Perl may die soon - GHC HQ is considering retiring the registerised -fvia-C path, leaving only native code for normal use and ANSI C for porting. This is because mangled C is only about 3% faster according to the GHC benchmark suite, and carries a high maintenance cost.
I still wouldn't want to compile GHC on Windows... For the only GHC patch I've ever contributed I borrowed a friend's Linux box. The build instructions alone for Windows scared me off. Thanks Neil

Neil Mitchell wrote:
Hi
btw, you don't need to use unix in order to play with ghc HEAD - you can download a compiled windows binary

Seriously? I'm pretty sure I tried to do that and couldn't...
Seriously. Thanks to Igloo, you can even download a GHC nightly complete with an installer! It doesn't get any easier than that :)
http://www.haskell.org/ghc/dist/current/dist/ghc-6.7.20070706-i386-unknown-m...
OK. So that's just the GHC binary itself, right?
Perl may die soon - GHC HQ is considering retiring the registerised -fvia-C path, leaving only native code for normal use and ANSI C for porting. This is because mangled C is only about 3% faster according to the GHC benchmark suite, and carries a high maintenance cost.
I still wouldn't want to compile GHC on Windows... For the only GHC patch I've ever contributed I borrowed a friend's Linux box. The build instructions alone for Windows scared me off.
Hmm... Mind you, not sure I fancy trying it on Linux either. Just because Linux is scary... :-P

On Sun, Jul 08, 2007 at 09:36:25PM +0100, Andrew Coppin wrote:
Neil Mitchell wrote:
Seriously. Thanks to Igloo,
And Neil, who wrote the actual installer-generator!
you can even download a GHC nightly complete with an installer! It doesn't get any easier than that :)
http://www.haskell.org/ghc/dist/current/dist/ghc-6.7.20070706-i386-unknown-m...
OK. So that's just the GHC binary itself, right?
It includes everything the installer for released versions includes. (The release installer for 6.8.1 will be one of these nightly-built installers.) Thanks Ian

andrewcoppin:
I was wittering on about stream fusion and how great it is, and I got a message from Mr C++.
(Mr C++ develops commercial games, and is obsessed with performance. For him, the only way to achieve the best performance is to have total control over every minute detail of the implementation. He sees Haskell as a stupid language that can never be fast. It seems he's not alone...)
Many libraries use in-place update and mutable data. Like, hmm, Data.ByteString. -- Don
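ByteString, for example, builds its buffers by mutating freshly allocated memory and only then exposing the result behind a pure interface. A sketch using the internal create function (its module has moved around between bytestring versions, so treat the exact import as an assumption):

import qualified Data.ByteString as B
import qualified Data.ByteString.Internal as BI  -- location varies by version
import Data.Word (Word8)
import Foreign.Ptr (Ptr)
import Foreign.Storable (pokeByteOff)

-- Fill a 26-byte buffer in place with 'A'..'Z'; the mutation happens
-- before the ByteString escapes, so callers only ever see a pure value.
alphabet :: IO B.ByteString
alphabet = BI.create 26 fill
  where
    fill :: Ptr Word8 -> IO ()
    fill p = mapM_ (\i -> pokeByteOff p i (fromIntegral (65 + i) :: Word8))
                   [0 .. 25]

main :: IO ()
main = alphabet >>= B.putStr   -- prints "ABCDEFGHIJKLMNOPQRSTUVWXYZ"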

On 7/8/07, Andrew Coppin
I was wittering on about stream fusion and how great it is, and I got a message from Mr C++.
(Mr C++ develops commercial games, and is obsessed with performance. For him, the only way to achieve the best performance is to have total control over every minute detail of the implementation. He sees Haskell as a stupid language that can never be fast. It seems he's not alone...)
Just a random observation: the competition for Haskell is not really C or C++. C is basically dead; C++ is only really useful when you don't want people to be able to read your source code. Java comes close to being competition, but it's slow and eats memory (except in benchmarks, where it somehow does quite well). The competition for Haskell is C#, which runs at 80% of C++ speed, has a really decent interface into legacy C libraries, and has a garbage collector, bounds checking, and decent debug messages.

Hallo,
On 7/10/07, Hugh Perkins
On 7/8/07, Andrew Coppin
wrote: I was wittering on about stream fusion and how great it is, and I got a message from Mr C++.
(Mr C++ develops commercial games, and is obsessed with performance. For him, the only way to achieve the best performance is to have total control over every minute detail of the implementation. He sees Haskell as a stupid language that can never be fast. It seems he's not alone...)
Just a random observation: the competition for Haskell is not really C or C++. C is basically dead;
20 years from now people will still be saying this... -- -alex http://www.ventonegro.org/

You're talking to someone who spent his teenage years doing assembler because that's what l33t games developers did, and I wanted to be like them. Sure, you can still find assembler around the place, but even l33t games developers don't use it any more.
On 7/10/07, Alex Queiroz
Just a random observation: the competition for Haskell is not really C or C++. C is basically dead;
20 years from now people will still be saying this...

On 10/07/07, Alex Queiroz
Hallo,
On 7/10/07, Hugh Perkins
wrote: On 7/8/07, Andrew Coppin
wrote: I was wittering on about stream fusion and how great it is, and I got a message from Mr C++.
(Mr C++ develops commercial games, and is obsessed with performance. For him, the only way to achieve the best performance is to have total control over every minute detail of the implementation. He sees Haskell as a stupid language that can never be fast. It seems he's not alone...)
Just a random observation: the competition for Haskell is not really C or C++. C is basically dead;
20 years from now people will still be saying this...
I highly doubt that. For two reasons:

1. People can only cling to unproductive and clumsy tools for so long (we don't write much assembly any more...). Capitalism works to ensure this; people who are willing to switch to more efficient tools will put the rest out of business (if they really are more efficient).

2. The many-core revolution that's on the horizon.

While I personally think that the productivity argument should be enough to "make the switch", the killer-app (the app that will kill C, that is :-)) is concurrency. C is just not a tractable tool to program highly concurrent programs, unless the problem happens to be highly amenable to concurrency (web servers etc.). We need *something* else. It may not be Haskell, but it will be something (and it will probably be closer to Haskell than C!). -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

Hallo,
On 7/10/07, Sebastian Sylvan
I highly doubt that. For two reasons: 1. People can only cling to unproductive and clumsy tools for so long (we don't write much assembly any more...). Capitalism works to ensure this; people who are willing to switch to more efficient tools will put the rest out of business (if they really are more efficient).
As I replied to Hugh, the Universe of computers is not restricted to PCs. We, embedded developers, will be using C for a lot of time still. Cheers, -- -alex http://www.ventonegro.org/

On 10/07/07, Alex Queiroz
Hallo,
On 7/10/07, Sebastian Sylvan
wrote: I highly doubt that. For two reasons: 1. People can only cling to unproductive and clumsy tools for so long (we don't write much assembly any more...). Capitalism works to ensure this; people who are willing to switch to more efficient tools will put the rest out of business (if they really are more efficient).
As I replied to Hugh, the Universe of computers is not restricted to PCs. We, embedded developers, will be using C for a lot of time still.
That might eliminate the concurrency imperative (for a while!), but it doesn't address the productivity point. My hypothesis is this: People don't like using unproductive tools, and if they don't have to, they won't. When "the next mainstream language" comes along to "solve" the concurrency problem (to some extent), it would seem highly likely that there will relatively soon be compilers for it for most embedded devices too, so why would you make your life miserable with C in that case (and cost your company X dollars due to inefficiency in the process)? -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

Hallo,
On 7/10/07, Sebastian Sylvan
That might eliminate the concurrency imperative (for a while!), but it doesn't address the productivity point. My hypothesis is this: People don't like using unproductive tools, and if they don't have to, they won't.
So you think we use C because we like it? :-) When this revolutionary tool of yours arrives that compiles Haskell to PIC devices, I'm gonna be the first to use it. Cheers, -- -alex http://www.ventonegro.org/

On 10/07/07, Alex Queiroz
Hallo,
On 7/10/07, Sebastian Sylvan
wrote: That might eliminate the concurrency imperative (for a while!), but it doesn't address the productivity point. My hypothesis is this: People don't like using unproductive tools, and if they don't have to, they won't.
So you think we use C because we like it? :-) When this revolutionary tool of yours arrives that compiles Haskell to PIC devices, I'm gonna be the first to use it.
No, you use it because you have to, there is very little choice. Which is exactly my point. I don't think it's unreasonable to expect that when nobody uses C for desktop applications, games etc. anymore because there's a better language available and widely supported, that some version of this "next mainstream language" will make it onto embedded devices too. The revolution (tm) won't come at the same time for all domains. C is probably used/supported in embedded devices mostly because it's popular for non-embedded devices (not because C is somehow uniquely suited for embedded devices). So what happens when something else is popular, when most industries have stopped using C and almost nobody coming from university knows it very well or at all? Isn't it likely that a lot of vendors will write compilers targeting embedded devices for this new popular language? -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

Sebastian Sylvan wrote:
On 10/07/07, Alex Queiroz
wrote: So you think we use C because we like it? :-) When this revolutionary tool of yours arrives that compiles Haskell to PIC devices, I'm gonna be the first to use it.
No, you use it because you have to, there is very little choice. Which is exactly my point.
I don't think it's unreasonable to expect that when nobody uses C for desktop applications, games etc. anymore because there's a better language available and widely supported, that some version of this "next mainstream language" will make it onto embedded devices too.
The revolution (tm) won't come at the same time for all domains. C is probably used/supported in embedded devices mostly because it's popular for non-embedded devices (not because C is somehow uniquely suited for embedded devices). So what happens when something else is popular, when most industries have stopped using C and almost nobody coming from university knows it very well or at all? Isn't it likely that a lot of vendors will write compilers targeting embedded devices for this new popular language?
Mmm... a garbage-collected language on a PIC with single-digit RAM capacity? That's going to be fun! :-D OTOH, isn't somebody out there using Haskell to design logic? (As in, computer ICs.) I doubt you'll even run "Haskell" on a PIC, but you might well use it to *construct* a program that works on a PIC...

On 10/07/07, Andrew Coppin
Sebastian Sylvan wrote:
On 10/07/07, Alex Queiroz
wrote: So you think we use C because we like it? :-) When this revolutionary tool of yours arrives that compiles Haskell to PIC devices, I'm gonna be the first to use it.
No, you use it because you have to, there is very little choice. Which is exactly my point.
I don't think it's unreasonable to expect that when nobody uses C for desktop applications, games etc. anymore because there's a better language available and widely supported, that some version of this "next mainstream language" will make it onto embedded devices too.
The revolution (tm) won't come at the same time for all domains. C is probably used/supported in embedded devices mostly because it's popular for non-embedded devices (not because C is somehow uniquely suited for embedded devices). So what happens when something else is popular, when most industries have stopped using C and almost nobody coming from university knows it very well or at all? Isn't it likely that a lot of vendors will write compilers targeting embedded devices for this new popular language?
Mmm... a garbage-collected language on a PIC with single-digit RAM capacity? That's going to be fun! :-D
OTOH, isn't somebody out there using Haskell to design logic? (As in, computer ICs.) I doubt you'll even run "Haskell" on a PIC, but you might well use it to *construct* a program that works on a PIC...
Yeah, and 640K should be enough for everybody... Again, the original statement was about 20 years down the line. Go back 20 years and people would say similar things about C (comparing it to assembly). -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

On 2007-07-10, Sebastian Sylvan
On 10/07/07, Andrew Coppin
wrote: Sebastian Sylvan wrote:
On 10/07/07, Alex Queiroz
wrote: So you think we use C because we like it? :-) When this revolutionary tool of yours arrives that compiles Haskell to PIC devices, I'm gonna be the first to use it.
No, you use it because you have to, there is very little choice. Which is exactly my point.
I don't think it's unreasonable to expect that when nobody uses C for desktop applications, games etc. anymore because there's a better language available and widely supported, that some version of this "next mainstream language" will make it onto embedded devices too.
The revolution (tm) won't come at the same time for all domains. C is probably used/supported in embedded devices mostly because it's popular for non-embedded devices (not because C is somehow uniquely suited for embedded devices). So what happens when something else is popular, when most industries have stopped using C and almost nobody coming from university knows it very well or at all? Isn't it likely that a lot of vendors will write compilers targeting embedded devices for this new popular language?
Mmm... a garbage-collected language on a PIC with single-digit RAM capacity? That's going to be fun! :-D
OTOH, isn't somebody out there using Haskell to design logic? (As in, computer ICs.) I doubt you'll even run "Haskell" on a PIC, but you might well use it to *construct* a program that works on a PIC...
Yeah, and 640K should be enough for everybody... Again, the original statement was about 20 years down the line. Go back 20 years and people would say similar things about C (comparing it to assembly).
And assembly is still widely used. Moore's law as applied to the embedded domain has a lot of the transistors going to more, cheaper devices, not bigger ones. -- Aaron Denney -><-

On 10/07/07, Aaron Denney
On 2007-07-10, Sebastian Sylvan
wrote: On 10/07/07, Andrew Coppin
wrote: Sebastian Sylvan wrote:
On 10/07/07, Alex Queiroz
wrote: So you think we use C because we like it? :-) When this revolutionary tool of yours arrives that compiles Haskell to PIC devices, I'm gonna be the first to use it.
No, you use it because you have to, there is very little choice. Which is exactly my point.
I don't think it's unreasonable to expect that when nobody uses C for desktop applications, games etc. anymore because there's a better language available and widely supported, that some version of this "next mainstream language" will make it onto embedded devices too.
The revolution (tm) won't come at the same time for all domains. C is probably used/supported in embedded devices mostly because it's popular for non-embedded devices (not because C is somehow uniquely suited for embedded devices). So what happens when something else is popular, when most industries have stopped using C and almost nobody coming from university knows it very well or at all? Isn't it likely that a lot of vendors will write compilers targeting embedded devices for this new popular language?
Mmm... a garbage-collected language on a PIC with single-digit RAM capacity? That's going to be fun! :-D
OTOH, isn't somebody out there using Haskell to design logic? (As in, computer ICs.) I doubt you'll even run "Haskell" on a PIC, but you might well use it to *construct* a program that works on a PIC...
Yeah, and 640K should be enough for everybody... Again, the original statement was about 20 years down the line. Go back 20 years and people would say similar things about C (comparing it to assembly).
And assembly is still widely used. Moore's law as applied to the embedded domain has a lot of the transistors going to more, cheaper devices, not bigger ones.
Depends on your definition of "widely used". You'll always need some low-level stuff at the bottom (e.g. for the page manager in an OS), and if your device is nothing but "the bottom", well then that's what you get. Doesn't mean that assembly isn't "dead" in the most reasonable sense of the word for the purposes of a discussion like this (i.e. nobody chooses to use assembly when they don't need to). And that's what I predict will happen (and already has in very many domains) with C. -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

G'day all.
Quoting Sebastian Sylvan
Depends on your definition of "widely used".
There are more digital watches, freebie four-function calculators, irritating-music-playing doorbells and furby-like toys pumped out every year than there are PCs, servers, routers and mainframes. That's the sense of "widely used" here.
You'll always need some low-level stuff at the bottom (e.g. for the page manager in an OS), and if your device is nothing but "the bottom", well then that's what you get.
Old platforms never die, they just get pushed down the food chain. Today's PC is tomorrow's peripheral, which is the day after's handheld PDA, which is the day after's elevator controller, which is the day after's smart card, which is the day after's Happy Meal(R) give-away toy. Me? I write firmware for nanotech devices. There's no point making the world's smallest strain gauge if it needs to be attached to an ATX-sized case. (I should point out, though, that embedded systems are not created equal. The issues in writing software for a heart pacemaker are very different from those of a stopwatch or a factory management/control system.)
Doesn't mean that assembly isn't "dead" in the most reasonable sense of the word for the purposes of a discussion like this (i.e. nobody chooses to use assembly when they don't need to).
Only foolish engineers choose ANYTHING if they don't need to. Heck, don't even write a program in the first place if you don't need to. Cheers, Andrew Bromage

On 11/07/07, ajb@spamcop.net
G'day all.
Quoting Sebastian Sylvan
: Depends on your definition of "widely used".
There are more digital watches, freebie four-function calculators, irritating-music-playing doorbells and furby-like toys pumped out every year than there are PCs, servers, routers and mainframes. That's the sense of "widely used" here.
You'll always need some low-level stuff at the bottom (e.g. for the page manager in an OS), and if your device is nothing but "the bottom", well then that's what you get.
Old platforms never die, they just get pushed down the food chain. Today's PC is tomorrow's peripheral, which is the day after's handheld PDA, which is the day after's elevator controller, which is the day after's smart card, which is the day after's Happy Meal(R) give-away toy.
Me? I write firmware for nanotech devices. There's no point making the world's smallest strain gauge if it needs to be attached to an ATX-sized case.
(I should point out, though, that embedded systems are not created equal. The issues in writing software for a heart pacemaker are very different from those of a stopwatch or a factory management/control system.)
Doesn't mean that assembly isn't "dead" in the most reasonable sense of the word for the purposes of a discussion like this (i.e. nobody chooses to use assembly when they don't need to).
Only foolish engineers choose ANYTHING if they don't need to. Heck, don't even write a program in the first place if you don't need to.
Clarification. If you need to write something, and you have a choice between C and almost any other language, then you probably won't choose C for many domains, and I think that trend will continue. Lots of people still choose C for games for example, but I think that's on its way out (there are Xbox 360 games being written in C# as we speak). So what I was trying to say is that asm and C are both tools which nobody chooses to use willingly, whereas e.g. C# is. -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

Yeah, and 640K should be enough for everybody... Again, the original statement was about 20 years down the line. Go back 20 years and people would say similar things about C (comparing it to assembly).
but one could argue that the democratization of programming will indeed marginalize "hair shirt" tech like haskell in favor of something accessible to the wider public. what is the fastest growing segment of the programming industry? web development. and look at the wonderfully advanced toolkit - a document retrieval protocol hacked up as a packet abstraction (http). a markup format intended to render simple text formatting, now coerced into a generalized viewport description mechanism (html). a scripting language with networking features from the 70s. and every day people are migrating existing applications to this stack. worse is better!?

On 11/07/07, brad clawsie
Yeah, and 640K should be enough for everybody... Again, the original statement was about 20 years down the line. Go back 20 years and people would say similar things about C (comparing it to assembly).
but one could argue that the democratization of programming will indeed marginalize "hair shirt" tech like haskell in favor of something accessible to the wider public
Maybe. Frankly I don't much care. Haskell may become more accessible to the wider public, or it may not. But it seems pretty clear that we need something *like* Haskell to be able to effectively program the devices of the future, and whatever language that turns out to be, I'll just be happy that I don't need to use C anymore! -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

Hallo,
On 7/10/07, Sebastian Sylvan
The revolution (tm) won't come at the same time for all domains. C is probably used/supported in embedded devices mostly because it's popular for non-embedded devices (not because C is somehow uniquely suited for embedded devices).
Wrong. Java and C# are far more popular languages, but C is just the right tool for the job. Cheers, -- -alex http://www.ventonegro.org/

Hello Alex, Wednesday, July 11, 2007, 12:25:00 AM, you wrote:
So you think we use C because we like it? :-) When this revolutionary tool of yours arrives that compiles Haskell to PIC devices, I'm gonna be the first to use it.
20 years ago it was hard to imagine that a $30 processor might be *too* fast for the average user. 20 years later, the same will probably apply to embedded devices - with multi-ghz cores in less than 1mm x 1mm -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Wed, 2007-07-11 at 14:01 +0400, Bulat Ziganshin wrote:
Hello Alex,
Wednesday, July 11, 2007, 12:25:00 AM, you wrote:
So you think we use C because we like it? :-) When this revolutionary tool of yours arrives that compiles Haskell to PIC devices, I'm gonna be the first to use it.
20 years ago it was hard to imagine that a $30 processor might be *too* fast for the average user. 20 years later, the same will probably apply to embedded devices - with multi-ghz cores in less than 1mm x 1mm
Amorphous computing.

Sebastian Sylvan wrote:
That might eliminate the concurrency imperative (for a while!), but it doesn't address the productivity point. My hypothesis is this: People don't like using unproductive tools, and if they don't have to, they won't.
When "the next mainstream language" comes along to "solve" the concurrency problem (to some extent), it would seem highly likely that there will relatively soon be compilers for it for most embedded devices too, so why would you make your life miserable with C in that case (and cost your company X dollars due to inefficiency in the process)?
...because only C works on bizarre and unusual hardware?

On 10/07/07, Andrew Coppin
Sebastian Sylvan wrote:
That might eliminate the concurrency imperative (for a while!), but it doesn't address the productivity point. My hypothesis is this: People don't like using unproductive tools, and if they don't have to, they won't.
When "the next mainstream language" comes along to "solve" the concurrency problem (to some extent), it would seem highly likely that there will relatively soon be compilers for it for most embedded devices too, so why would you make your life miserable with C in that case (and cost your company X dollars due to inefficiency in the process)?
...because only C works on bizarre and unusual hardware?
By what magic is this the case? Hardware automatically supports C without the efforts of compiler-writers? We're talking 20 years down the line here, when someone can choose to write a C compiler, or an X compiler (where X is the most popular systems programming language of the time). -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

On 2007-07-10, Sebastian Sylvan
On 10/07/07, Andrew Coppin
wrote: Sebastian Sylvan wrote:
That might eliminate the concurrency imperative (for a while!), but it doesn't address the productivity point. My hypothesis is this: People don't like using unproductive tools, and if they don't have to, they won't.
When "the next mainstream language" comes along to "solve" the concurrency problem (to some extent), it would seem highly likely that there will relatively soon be compilers for it for most embedded devices too, so why would you make your life miserable with C in that case (and cost your company X dollars due to inefficiency in the process)?
...because only C works on bizarre and unusual hardware?
By what magic is this the case? Hardware automatically supports C without the efforts of compiler-writers?
No, of course not. But the most popular architectures support C with /much smaller/ efforts of compiler writers. -- Aaron Denney -><-

On 10/07/07, Aaron Denney
On 2007-07-10, Sebastian Sylvan
wrote: On 10/07/07, Andrew Coppin
wrote: Sebastian Sylvan wrote:
That might eliminate the concurrency imperative (for a while!), but it doesn't address the productivity point. My hypothesis is this: People don't like using unproductive tools, and if they don't have to, they won't.
When "the next mainstream language" comes along to "solve" the concurrency problem (to some extent), it would seem highly likely that there will relatively soon be compilers for it for most embedded devices too, so why would you make your life miserable with C in that case (and cost your company X dollars due to inefficiency in the process)?
...because only C works on bizarre and unusual hardware?
By what magic is this the case? Hardware automatically supports C without the efforts of compiler-writers?
No, of course not. But the most popular architectures support C with /much smaller/ efforts of compiler writers.
Competition sorts that one out. If there's a clear alternative that's, let's say, 10x more productive, then the cost of developing a compiler (or a backend for an existing one) is offset by the benefits of being able to offer a more productive programming environment for your customers. My point is that C isn't a magical language that is immune to progress, and I would say it's likely that the main future competitor to C in the embedded domain will eventually be [a version of] whatever language wins out in the other domains (e.g. due to the many-core stuff). -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

On 7/10/07, Aaron Denney wrote:
On 2007-07-10, Sebastian Sylvan wrote:
On 10/07/07, Andrew Coppin wrote:
Sebastian Sylvan wrote:
That might eliminate the concurrency imperative (for a while!), but it doesn't address the productivity point. My hypothesis is this: People don't like using unproductive tools, and if they don't have to, they won't. When "the next mainstream language" comes along to "solve" the concurrency problem (to some extent), it would seem highly likely that there will relatively soon be compilers for it for most embedded devices too, so why would you make your life miserable with C in that case (and cost your company X dollars due to inefficiency in the process)?
...because only C works on bizarre and unusual hardware?
By what magic is this the case? Hardware automatically supports C without the efforts of compiler-writers?
No, of course not. But the most popular architectures support C with /much smaller/ efforts of compiler writers.
Now is this just because of the relative simplicity of C, because of a larger amount of collective experience in writing C compilers, or something else entirely?

On 2007-07-10, Creighton Hogg
No, of course not. But the most popular architectures support C with /much smaller/ efforts of compiler writers.
Now is this just because of the relative simplicity of C, because of a larger amount of collective experience in writing C compilers, or something else entirely?
Because of the (relatively) close match between C and currently popular styles of hardware. -- Aaron Denney -><-

That might eliminate the concurrency imperative (for a while!), but it doesn't address the productivity point. My hypothesis is this: People don't like using unproductive tools, and if they don't have to, they won't.
productivity is not the only metric for a tool. for some people, it is performance. for many employers, it is access to a vibrant labor pool. some other coders (database admins, web developers) don't even have a choice of tool. and how do we measure productivity? i would claim the tool that requires me to produce the least new code to get to a sufficient solution is the most productive. haskell's syntax and semantics can aid in reducing code, but do not address real problem domains. that's where hackage comes in. compare hackage to cpan - we've got a ways to go. i'm going to add something to hackage tonight to help!

On Jul 10, 2007, at 16:07 , Alex Queiroz wrote:
As I replied to Hugh, the Universe of computers is not restricted to PCs. We, embedded developers, will be using C for a lot of time still.
Doesn't nhc98 target embedded devices? -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

allbery:
On Jul 10, 2007, at 16:07 , Alex Queiroz wrote:
As I replied to Hugh, the Universe of computers is not restricted to PCs. We, embedded developers, will be using C for a lot of time still.
Doesn't nhc98 target embedded devices?
It's been used on embedded arm devices the size of a credit card. -- Don

On 7/10/07, Sebastian Sylvan
While I personally think that the productivity argument should be enough to "make the switch", the killer-app (the app that will kill C, that is :-)) is concurrency. C is just not a tractable tool to program highly concurrent programs, unless the problem happens to be highly amenable to concurrency (web servers etc.). We need *something* else. It may not be Haskell, but it will be something (and it will probably be closer to Haskell than C!).
Yeah I agree with this. C# totally rocks, but threading is an unsolved problem. If you can get to the stage where you can get a non-optimized, readable/maintainable Haskell program to run at more than say 30% of the speed of a non-optimized, readable/maintainable C# program, but automatically runs across 16 or 256 cores, then you're on to a winner.

Hugh Perkins wrote:
Yeah I agree with this. C# totally rocks, but threading is an unsolved problem.
I have repeatedly attempted to discover what C# actually is, all to no avail. Still, that probably means I don't need it...
If you can get to the stage where you can get a non-optimized, readable/maintainable Haskell program to run at more than say 30% of the speed of a non-optimized, readable/maintainable C# program, but automatically runs across 16 or 256 cores, then you're on to a winner.
Hint: If you can get readable/maintainable Haskell to run on more than one core "automatically", you're onto something pretty special. ;-)

On 10/07/07, Andrew Coppin
Hugh Perkins wrote:
Yeah I agree with this. C# totally rocks, but threading is an unsolved problem.
I have repeatedly attempted to discover what C# actually is, all to no avail. Still, that probably means I don't need it...
If you can get to the stage where you can get a non-optimized, readable/maintainable Haskell program to run at more than say 30% of the speed of a non-optimized, readable/maintainable C# program, but automatically runs across 16 or 256 cores, then you're on to a winner.
Hint: If you can get readable/maintainable Haskell to run on more than one core "automatically", you're onto something pretty special. ;-)
Soon, have a little patience :-) See for example: http://research.microsoft.com/~simonpj/papers/ndp/NdpSlides.pdf http://research.microsoft.com/~tharris/papers/2007-fdip.pdf -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

Sebastian Sylvan wrote:
On 10/07/07, Andrew Coppin
wrote: Hint: If you can get readable/maintainable Haskell to run on more than one core "automatically", you're onto something pretty special. ;-)
Soon, have a little patience :-)
See for example: http://research.microsoft.com/~simonpj/papers/ndp/NdpSlides.pdf http://research.microsoft.com/~tharris/papers/2007-fdip.pdf
Mmm... it'll be damn good if any of this ever actually happens. But until then... heh. ;-)

On Tuesday 10 July 2007 21:19:42 Andrew Coppin wrote:
Hugh Perkins wrote:
Yeah I agree with this. C# totally rocks, but threading is an unsolved problem.
I have repeatedly attempted to discover what C# actually is...
Take Java. Make it Windows only. Fix some mistakes. Tweak performance. Add a little functionality (e.g. operator overloading). That is C#. Both are designed for GUI and web programming, so they don't fare well for massive concurrency, high-performance numerics or allocation-intensive algorithms (e.g. idiomatic functional programming).
Hint: If you can get readable/maintainable Haskell to run on more than one core "automatically", you're onto something pretty special. ;-)
If you're using a Unix, just fork the process and pass messages via a pipe. -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. The OCaml Journal http://www.ffconsultancy.com/products/ocaml_journal/?h
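A minimal sketch of that suggestion using the unix package (this assumes the String-based fdRead/fdWrite of that era's unix library - later versions changed these signatures - and all error handling is elided):

import System.Posix.IO (closeFd, createPipe, fdRead, fdWrite)
import System.Posix.Process (forkProcess, getProcessStatus)

main :: IO ()
main = do
  (readEnd, writeEnd) <- createPipe
  -- Child: do the work, write the answer down the pipe.
  pid <- forkProcess $ do
    closeFd readEnd
    _ <- fdWrite writeEnd (show (sum [1 .. 1000000 :: Integer]))
    closeFd writeEnd
  -- Parent: read the answer back and reap the child.
  closeFd writeEnd
  (answer, _) <- fdRead readEnd 64
  closeFd readEnd
  print (read answer :: Integer)
  _ <- getProcessStatus True False pid
  return ()

forkProcess has caveats in a multithreaded runtime, so this is only a sketch of the idea, not a recipe.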

On 10/07/07, Jon Harrop
On Tuesday 10 July 2007 21:19:42 Andrew Coppin wrote:
Hugh Perkins wrote:
Yeah I agree with this. C# totally rocks, but threading is an unsolved problem.
I have repeatedly attempted to discover what C# actually is...
Take Java. Make it Windows only. Fix some mistakes. Tweak performance. Add a little functionality (e.g. operator overloading). That is C#.
Both are designed for GUI and web programming, so they don't fare well for massive concurrency, high-performance numerics or allocation-intensive algorithms (e.g. idiomatic functional programming).
C# 3.0 gets it a bit closer, though. I wonder what C# 4.0 will look like - I worry about the complexity of the language when they keep tacking stuff on like that. -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

Sebastian Sylvan wrote:
On 10/07/07, Jon Harrop
wrote: Both are designed for GUI and web programming, so they don't fare well for massive concurrency, high-performance numerics or allocation-intensive algorithms (e.g. idiomatic functional programming).
C# 3.0 gets it a bit closer, though. I wonder what C# 4.0 will look like, though I worry about the complexity of the language when they keep tacking stuff on like that.
The thing that I *like* about Haskell is that it manages to be very powerful, and yet elegantly simple. :-D (...and then they add MPTCs, GADTs, ATs, ETs, Rank-N polymorphism, FDs...)

Jon Harrop wrote:
On Tuesday 10 July 2007 21:19:42 Andrew Coppin wrote:
Hugh Perkins wrote:
Yeah I agree with this. C# totally rocks, but threading is an unsolved problem.
I have repeatedly attempted to discover what C# actually is...
Take Java. Make it Windows only. Fix some mistakes. Tweak performance. Add a little functionality (e.g. operator overloading). That is C#.
Both are designed for GUI and web programming, so they don't fare well for massive concurrency, high-performance numerics or allocation-intensive algorithms (e.g. idiomatic functional programming).
Really? I thought Java was designed to take over the world! LOL. There are certainly *a lot* of mistakes to fix. OTOH, this is Micro$oft we're talking about... how good can it possibly be? (Windoze-only, you say? Perhaps I misunderstood - I thought that's what Mono is all about...)
Hint: If you can get readable/maintainable Haskell to run on more than one core "automatically", you're onto something pretty special. ;-)
If you're using a Unix, just fork the process and pass messages via a pipe.
Yeah, but it's hardly trivial to decide which things are "big enough" to be worth forking for, and which aren't. (Take, for example, a quicksort. You could run the recursions in parallel - but once you get down to small enough lists, it isn't worth the overhead. But how to know when to stop? You end up adding more processing overhead analysing whether to fork than you actually gain by forking...)
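The usual answer to that granularity problem is an explicit depth cutoff. A sketch using par/pseq from the parallel package - note the caveat that par only evaluates to weak head normal form, so a serious version would also force the sublists (e.g. with a deepseq-style function) to push real work into the sparks:

import Control.Parallel (par, pseq)

-- Quicksort that sparks its recursive calls in parallel, but only
-- down to a fixed depth; below that, plain sequential code runs, so
-- tiny sublists never pay the sparking overhead.
psort :: Ord a => Int -> [a] -> [a]
psort _ [] = []
psort depth (p : xs)
  | depth <= 0 = seqSort (p : xs)
  | otherwise  = hi `par` (lo `pseq` (lo ++ p : hi))
  where
    lo = psort (depth - 1) [x | x <- xs, x < p]
    hi = psort (depth - 1) [x | x <- xs, x >= p]

-- The sequential fallback.
seqSort :: Ord a => [a] -> [a]
seqSort [] = []
seqSort (p : xs) =
  seqSort [x | x <- xs, x < p] ++ p : seqSort [x | x <- xs, x >= p]

Compiled with -threaded and run with +RTS -N2 (or more), a cutoff of a few levels is enough to keep a handful of cores busy without drowning in tiny sparks.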

On Jul 11, 2007, at 13:37 , Andrew Coppin wrote:
(Windoze-only, you say? Perhaps I misunderstood - I thought this is what Mono is all about...)
As someone else pointed out earlier, the real power is the libraries, which provide a complete and powerful GUI environment. Mono provides the VM and C# but duplicating the libraries is a much bigger and harder job. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

On Tue, 10 Jul 2007, Jon Harrop wrote:
On Tuesday 10 July 2007 21:19:42 Andrew Coppin wrote:
Hugh Perkins wrote:
Yeah I agree with this. C# totally rocks, but threading is an unsolved problem.
I have repeatedly attempted to discover what C# actually is...
Take Java. Make it Windows only. Fix some mistakes. Tweak performance. Add a little functionality (e.g. operator overloading). That is C#.
... still talking about "In-place modification" ?

On Jul 10, 2007, at 16:19 , Andrew Coppin wrote:
I have repeatedly attempted to discover what C# actually is, all to no avail. Still, that probably means I don't need it...
C# is Microsoft's answer to Java: a vaguely Java/C++-like language that runs in Microsoft's ".net" virtual machine. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

On Tuesday 10 July 2007 21:02:45 Sebastian Sylvan wrote:
While I personally think that the productivity argument should be enough to "make the switch", the killer-app (the app that will kill C, that is :-)) is concurrency. C is just not a tractable tool to program highly concurrent programs, unless the problem happens to be highly amenable to concurrency (web servers etc.). We need *something* else. It may not be Haskell, but it will be something (and it will probably be closer to Haskell than C!).
As long as your C-killer language has a run-time that is written in C, it won't be killing C. :-) -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. The OCaml Journal http://www.ffconsultancy.com/products/ocaml_journal/?e

On Tue, 2007-07-10 at 21:02 +0100, Sebastian Sylvan wrote:
On 10/07/07, Alex Queiroz
wrote: Hallo,
On 7/10/07, Hugh Perkins
wrote: On 7/8/07, Andrew Coppin
wrote: I was wittering on about stream fusion and how great it is, and I got a message from Mr C++.
(Mr C++ develops commercial games, and is obsessed with performance. For him, the only way to achieve the best performance is to have total control over every minute detail of the implementation. He sees Haskell as a stupid language that can never be fast. It seems he's not alone...)
Just a random observation: the competition for Haskell is not really C or C++. C is basically dead;
20 years from now people will still be saying this...
I highly doubt that. For two reasons: 1. People can only cling to unproductive and clumsy tools for so long (we don't write much assembly any more...). Capitalism works to ensure this; people who are willing to switch to more efficient tools will put the rest out of business (if they really are more efficient). 2. The many-core revolution that's on the horizon.
While I personally think that the productivity argument should be enough to "make the switch", the killer-app (the app that will kill C, that is :-)) is concurrency. C is just not a tractable tool to program highly concurrent programs, unless the problem happens to be highly amenable to concurrency (web servers etc.). We need *something* else. It may not be Haskell, but it will be something (and it will probably be closer to Haskell than C!).
Single Assignment C! (http://www.sac-home.org/)

On 11 Jul 2007, at 8:02 am, Sebastian Sylvan wrote:
On 10/07/07, Alex Queiroz
wrote: 20 years from now people will still be saying this...
I highly doubt that. For two reasons: 1. People can only cling to unproductive and clumsy tools for so long (we don't write much assembly any more...). Capitalism works to ensure this; people who are willing to switch to more efficient tools will put the rest out of business (if they really are more efficient).
We are still seeing terabytes of assembly code written; it is just being written by compilers. As for me, I still sometimes write Fortran. (Mind you, F95 is not your grandfather's Fortran. But it does still beat the pants off C.) People have been predicting the death of Fortran for a long time.
2. The many-core revolution that's on the horizon.
You cannot have played with Cilk if you think that will kill C. Actually, it's not on the horizon. The revolution has happened. Some people have seen the flash; others haven't heard the bang yet.

On 2007-07-11, ok
As for me, I still sometimes write Fortran. (Mind you, F95 is not your grandfather's Fortran. But it does still beat the pants off C.) People have been predicting the death of Fortran for a long time.
"I don't know what the programming language of the year 2000 will look like, but I know it will be called FORTRAN." -- Tony Hoare. -- Aaron Denney -><-

Hello Hugh, Tuesday, July 10, 2007, 11:26:43 PM, you wrote:
Just a random observation: the competition for Haskell is not really C or C++. C is basically dead; C++ is only really useful when you don't want people to be able to read your source code.
btw, in his HOPL paper http://www.research.att.com/~bs/hopl-almost-final.pdf Bjarne emphasizes that C++ is now a language for system programming rather than application programming
Java comes close to being competition, but it's slow and eats memory
never checked it myself, but people say that modern Java implementations are as fast as C# and close to C++ speeds -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On 2007-07-11, Bulat Ziganshin
Hello Hugh,
Tuesday, July 10, 2007, 11:26:43 PM, you wrote:
Just a random observation: the competition for Haskell is not really C or C++. C is basically dead; C++ is only really useful when you don't want people to be able to read your source code.
btw, in his HOPL paper http://www.research.att.com/~bs/hopl-almost-final.pdf Bjarne emphasizes that C++ is now a language for system programming rather than application programming
It's been tried -- see BeOS. Maintaining a stable ABI was difficult. Some of that can be chalked up to the language changing, and some to compiler-writers learning better how to deal with the language, but it's still an issue now. And just try mixing objects from different vendors... It just gets worse in (externally-visible compiled) languages with an actual runtime. The ABIs for C are (fairly) simple, because of the (fairly) close match between the language and architecture. And this does make a difference. -- Aaron Denney -><-

On 11 Jul 2007, at 9:56 pm, Bulat Ziganshin wrote:
Java comes close to being competition, but it's slow and eats memory
never checked it myself, but people say that modern Java implementations are as fast as C# and close to C++ speeds
People will say anything, but before believing this particular one you will have to do your own measurements. Here's mine:

I'm one of two lecturers in a 4th year information retrieval paper. The main programming exercise is one where the students write their own simple IR engine. It's really pretty simple. My model answer in C is two separate programs, an index builder and a query engine, and is 804 SLOC in 1075 total lines. Each year, despite our advice, some student does it in Java. I have one such from last year: 1611 SLOC in 2531 total lines. (Yes, this does mean programmer X writing C can be twice as productive as programmer Y writing Java.)

The real killer is performance: the Java program is 150 times slower than the C one. Let me say that slowly: one hundred and fifty times slower. What was that about "close to C++ speeds" again? The reason I have the program is that the student was distressed by this and wanted me to help. The problem was the Java libraries, one class in particular. By replacing that class with home brew code (at the cost of several days coding and experimenting) it was possible to speed the Java program up by a factor of about 15, at which point it was *still* slower than AWK (using 'mawk 1.3.3').

I mentioned this in another context, and got a reply from someone who has worked on large commercial projects in Java, who said that typically half of their code was application-specific and half of it was rewrites of Java library facilities to work around serious performance problems.

The lesson here is that productivity and performance are NOT solely a matter of language; they are also a matter of libraries. The Haskell language is wonderful and I enjoy experiencing the reality of "there is nothing as practical as a good theory". But if we just had the Haskell 98 report and libraries, it would NOT be a productive tool for many real problems. The growing collection of amazing stuff in the hackage collection is extremely important.

Hello ok, Thursday, July 12, 2007, 3:29:25 AM, you wrote:
own simple IR engine. It's really pretty simple. My model answer in C is two separate programs, an index builder and a query engine, and is 804 SLOC in 1075 total lines. Each year, despite our advice, some student does it in Java.
using a student's work, it's easy to prove that Basic is faster than assembler (and "haskell is as fast and memory-efficient as C", citing haskell-cafe) -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

I wrote [student code in Java twice the size of C code, 150 times slower]. On 12 Jul 2007, at 7:04 pm, Bulat Ziganshin wrote:
using a student's work, it's easy to prove that Basic is faster than assembler (and "haskell is as fast and memory-efficient as C", citing haskell-cafe)
This completely ignores everything else I wrote. The first point is that IT WAS NOT THE STUDENT'S FAULT. The performance bottleneck was ENTIRELY in code provided by Sun. And the second point of my message, which has also been ignored, is that languages are NOT the sole determiner of productivity &c, but libraries also. My post was not about Java-the-language, but about java.io the library, and about the fact that libraries can have far more effect than anything the compiler does. So no, using the form of my argument, it is NOT possible to prove anything about Haskell -vs- C. It is ONLY possible to make claims about Haskell *libraries* -vs- C *libraries*.

So no, using the form of my argument, it is NOT possible to prove anything about Haskell -vs- C. It is ONLY possible to make claims about Haskell *libraries* -vs- C *libraries*.
You can claim anything you like, but if you want people to believe it you'd be best providing the code used, so that others can reproduce the tests, comment, optimize ;-)

I've done some simple benchmarks between C#/Java/C++ before. Here are two: one for a prime number algorithm, one for OpenGl, which is more of a "real world" situation, and entirely relevant to the original thread, which was talking about performance for games programming. I haven't run the same benchmarks in Haskell, because I'm not experienced enough to be able to write an optimized Haskell solution to either problem, but I'm sure I've come to the right place to find someone who can do this ;-)

Algorithms
========

For algorithms, the results are so close that simple compiler switch changes can make the difference between C++ or Java or C# being the fastest. Here's a prime-number benchmark between C++ and C# (along with code): http://spring.clan-sy.com/phpbb/viewtopic.php?t=7634&postdays=0&postorder=asc&start=60 post Thu Nov 02, 2006 3:18 am

"C++ version:
H:\dev\test\CSharpAI>H:\dev\test\testperf\prime.exe
number of primes: 664579
elapsed time: 2.265

C# version:
H:\dev\test\CSharpAI>H:\dev\test\testperf\primecs.exe
number of primes: 664579
elapsed time: 2.03125

Admittedly, optimizations are turned off by default in cl, and turned on by default in C#. Nevertheless, these execution times are clearly not miles apart."

OpenGl
======

Here's an OpenGl benchmark between C# vertex3f and C# drawArrays, and between C# and Java. http://spring.clan-sy.com/phpbb/viewtopic.php?t=7634&postdays=0&postorder=asc&start=100 OpenGl C#: post of Sun Dec 03, 2006 5:54 am OpenGl Java: post of Mon Dec 18, 2006 12:35 pm

I used a simple OpenGl application to benchmark Java and C#, because my target was to write an OpenGl application. You have to benchmark the things that you will be doing with your final application ;-) OpenGl is very stressful for anything other than C/C++, because it involves a lot of calls across the C / <other-language> interface, along with associated marshalling costs and so on. By comparing the time to run using a brute-force Vertex3f call per vertex with the time to call DrawArrays after pushing everything to a vertex array, you can get a measure of the inter-language boundary overhead involved. (DrawArrays runs either in the graphics card, or in the graphics card driver, or both, but either way it's happening on the native side of the native/<other-language> boundary.)

I was going to run the same benchmark on Haskell, but I gave up on seeing that Haskell opengl doesn't support alpha-blending yet, so I'm guessing Haskell opengl has not been used very much just yet ;-)

If someone can provide an optimized version of the prime number algorithm in Haskell, I'll run it in both Haskell and C# and print the results. The algorithm is a simple sieve of Eratosthenes (I think). This is basically the kind of algorithm Haskell is built for ;-) so if it doesn't win this it's go-home time ;-) Well, until automatic threading arrives of course.

There was some discussion about prime number generators earlier this year:
http://www.haskell.org/pipermail/haskell-cafe/2007-February/022347.html
http://www.haskell.org/pipermail/haskell-cafe/2007-February/022699.html
You can find several sources there.
With kind regards,
Henk-Jan van Tuyl
--
http://functor.bamikanarie.com
http://Van.Tuyl.eu/
--
On Sat, 14 Jul 2007 08:07:34 +0200, Hugh Perkins
If someone can provide an optimized version of the prime number algorithm in Haskell, I'll run it in both Haskell and C# and print the results. The algorithm is a simple sieve of Eratosthenes (I think). This is basically the kind of algorithm Haskell is built for ;-) so if it doesn't win this it's go-home time ;-)
Well, until automatic threading arrives of course.
--

On 7/14/07, Henk-Jan van Tuyl
There was some discussion about prime number generators earlier this year: http://www.haskell.org/pipermail/haskell-cafe/2007-February/022347.html http://www.haskell.org/pipermail/haskell-cafe/2007-February/022699.html
Ok, so using the following code:
module Main
  where

import IO
import Char
import GHC.Float
import List
import Control.Monad
import System.Time
import System.Locale

sieve :: [Int] -> [Int]
sieve [] = []
sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p > 0]

calculateNumberOfPrimes :: Int -> Int
calculateNumberOfPrimes max = 1 + length( sieve [ 3,5.. (max -1) ] )

gettime :: IO ClockTime
gettime = getClockTime

main = do
    starttime <- gettime
    let numberOfPrimes = (calculateNumberOfPrimes 200000)
    putStrLn( "number of primes: " ++ show( numberOfPrimes ) )
    endtime <- gettime
    let timediff = diffClockTimes endtime starttime
    let secondsfloat = realToFrac( tdSec timediff ) + realToFrac(tdPicosec timediff) / 1000000000000
    putStrLn( show(secondsfloat) )
    return ()

With "200000" as the upper limit on the sieve, this gives:

O:\dev\haskell>ghc -fglasgow-exts -O2 -o prime.exe Prime.hs
O:\dev\haskell>prime
number of primes: 17984
8.734

For comparison, on the same machine, in C# this gives:

O:\dev\test\testperf>csc /nologo primecs.cs
O:\dev\test\testperf>primecs
number of primes: 17984
elapsed time: 0,015625

That's over 500 times faster ;-) Here's the code in C#:

using System;

class Primes
{
    public int CalculateNumberOfPrimes( int maxprime )
    {
        bool[] IsNotPrime = new bool[ maxprime ];
        int NumberOfPrimes = 1;
        for( int i = 3; i < maxprime; i += 2 )
        {
            if( !IsNotPrime[i] )
            {
                NumberOfPrimes++;
                for( int j = ( i << 1 ); j < maxprime; j += i )
                {
                    if( !IsNotPrime[j] )
                        IsNotPrime[j] = true;
                }
            }
        }
        return NumberOfPrimes;
    }
}

class EntryPoint
{
    public static void Main()
    {
        System.DateTime start = System.DateTime.Now;
        int NumberOfPrimes = new Primes().CalculateNumberOfPrimes( 200000 );
        System.DateTime finish = System.DateTime.Now;
        double time = finish.Subtract( start ).TotalMilliseconds;
        Console.WriteLine( "number of primes: " + NumberOfPrimes );
        Console.WriteLine( "elapsed time: " + ( time / 1000 ) );
    }
}

That's over 500 times faster ;-)
... Did you really read the Haskell code? You're comparing two completely unrelated algorithms, talk about a fair comparison! Maybe a reading of http://en.literateprograms.org/Sieve_of_Eratosthenes_(Haskell) would help you? Note that your C# code algorithm could be written in Haskell quite compactly, it would already be a better match!
-- Jedaï
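(To make the point concrete: the two-line Haskell "sieve" above tests every surviving candidate for divisibility by each earlier prime, so it is really trial division in disguise. Here is a sketch of my own, not from the thread, of that trial-division idea written out directly, with the usual square-root cutoff added:)

-- trial division made explicit: a candidate is prime if no earlier
-- prime up to its square root divides it
trialPrimes :: [Int]
trialPrimes = 2 : [ x | x <- [3,5..]
                      , all (\p -> x `mod` p > 0)
                            (takeWhile (\p -> p*p <= x) trialPrimes) ]

(The C# loop, by contrast, is a genuine array sieve: it never divides, it only strides through memory crossing off multiples. Different algorithms, different asymptotic costs.)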

As I say, I'm not a Haskell expert, so feel free to provide a better
implementation.
On 7/15/07, Chaddaï Fouché
... Did you really read the Haskell code ? You're comparing two completely unrelated algorithms, talk about a fair comparison !

On Sun, 2007-07-15 at 00:38 +0200, Hugh Perkins wrote:
As I say, I'm not a Haskell expert, so feel free to provide a better implementation.
On 7/15/07, Chaddaï Fouché wrote:
... Did you really read the Haskell code? You're comparing two completely unrelated algorithms, talk about a fair comparison!
Read http://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf

There's really a tendency in this newsgroup to point people to huge
documents, when a small copy and paste would make the answer so much more
accessible ;-)
Anyway... so reading through the paper, it looks like it's using a priority queue? Which basically is changing the algorithm somewhat compared to the C# version.
Anyway, if you can provide a working Haskell version, I'll be happy to run it. I sort of suspect that if it gave results within 30% of the C# version someone would already have done so ;-)
On 7/15/07, Derek Elkins

2007/7/15, Hugh Perkins
There's really a tendency in this newsgroup to point people to huge documents, when a small copy and paste would make the answer so much more accessible ;-)
I was pointing you to a document of quite honest size in my opinion, and not really hard to read... But well, if you want a piece of code, here it is:

merge xs@(x:xt) ys@(y:yt) = case compare x y of
    LT -> x : (merge xt ys)
    EQ -> x : (merge xt yt)
    GT -> y : (merge xs yt)

diff xs@(x:xt) ys@(y:yt) = case compare x y of
    LT -> x : (diff xt ys)
    EQ -> diff xt yt
    GT -> diff xs yt

primes, nonprimes :: [Int]
primes = [2,3,5] ++ (diff [7,9..] nonprimes)
nonprimes = foldr1 f . map g $ tail primes
    where f (x:xt) ys = x : (merge xt ys)
          g p = [ n*p | n <- [p,p+2..] ]

The HaskellWiki has it under "primes" and it's at least 170 times faster than the extra-naive sieve you used in your comparison on my computer... (I have some doubts about the accuracy of the benchmark and System.Time at this level of precision, especially on Windows.)

An implementation with DiffArray (not even DiffUArray, since there's no instance for Bool? Is there a bitvector implementation somewhere to make up for this?) and exactly your algorithm is already more than 20 times faster than the naive sieve.

Note that this kind of benchmark is pretty pointless anyway, since you're not really using C# in your program: you're using a subset of C# that's almost immediately translatable into the corresponding C code, and indeed the JIT compiler must compile it to the same code as the equivalent C code, so it's no surprise at all that the performances are similar! Due to the nature of Haskell, it's not so easy to do the same thing (write a C program in Haskell as you wrote a C program in C#), so the conclusion is obviously to Haskell's disadvantage.
-- Jedaï
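(For readers trying this at home: once the definitions above are loaded in GHCi, the list can be exercised directly. A quick sanity check, my own session sketch rather than anything from the thread:)

ghci> take 10 primes
[2,3,5,7,11,13,17,19,23,29]
ghci> length (takeWhile (< 200000) primes)
17984

(Laziness means only as much of the list is built as the consumer demands, which is what makes the same definition reusable for both queries.)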

Chaddai, Unfortunately, your program doesnt work ;-) The function needs to take a parameter, which is the upper limit on our sieve, and return a single value, which is the number of primes in that interval. Complex requirements I know ;-)

Well, I see, it is indeed a very complex requirement... Maybe you could do the very complex following operation to at least test the speed of this implementation:
let lastPrime = primes !! 17983
-- Jedaï

2007/7/15, Chaddaï Fouché
Well, I see, it is indeed very complex requirement... Maybe you could do the very complex following operation to at least test the speed of this implementation : let lastPrime = primes !! 17983
Or if you really want a function with your requirement, maybe you could take the painful steps needed to write:
let numberOfPrimes = length $ takeWhile (< 200000) primes
?
-- Jedaï

(Random observation: Hmmm, strange, in the Data.Map version of primes above,
we are missing 5 primes?)
Hi Chaddai,
Your algorithm does work significantly better than the others I've posted
here :-)
So much so, that we're going for a grid of 10000000 to get the timings in an
easy-to-measure range. Here are the results:
J:\dev\haskell>ghc -O2 -fglasgow-exts -o PrimeChaddai.exe PrimeChaddai.hs
J:\dev\haskell>primechaddai
number of primes: 664579
30.984
J:\dev\test\testperf>csc /nologo primecs.cs
J:\dev\test\testperf>primecs
number of primes: 664579
elapsed time: 0,859375
So, only 30 times faster now, which is quite a lot better :-D
Here's the full .hs code:
module Main
where
import IO
import Char
import GHC.Float
import List
import qualified Data.Map as Map
import Control.Monad
import System.Time
import System.Locale
merge xs@(x:xt) ys@(y:yt) = case compare x y of
    LT -> x : (merge xt ys)
    EQ -> x : (merge xt yt)
    GT -> y : (merge xs yt)

diff xs@(x:xt) ys@(y:yt) = case compare x y of
    LT -> x : (diff xt ys)
    EQ -> diff xt yt
    GT -> diff xs yt

primes, nonprimes :: [Int]
primes = [2,3,5] ++ (diff [7,9..] nonprimes)
nonprimes = foldr1 f . map g $ tail primes
    where f (x:xt) ys = x : (merge xt ys)
          g p = [ n*p | n <- [p,p+2..] ]
calculateNumberOfPrimes max = length $ takeWhile ( < max ) primes
gettime :: IO ClockTime
gettime = getClockTime
main = do
    starttime <- gettime
    let numberOfPrimes = (calculateNumberOfPrimes 10000000)
    putStrLn( "number of primes: " ++ show( numberOfPrimes ) )
    endtime <- gettime
    let timediff = diffClockTimes endtime starttime
    let secondsfloat = realToFrac( tdSec timediff ) + realToFrac(tdPicosec timediff) / 1000000000000
    putStrLn( show(secondsfloat) )
    return ()
On 7/15/07, Chaddaï Fouché
Or if you really want a function with your requirement, maybe you could take the painful steps needed to write : let numberOfPrimes = length $ takeWhile (< 200000) primes ?

On Sunday 15 July 2007 00:31:12 Chaddaï Fouché wrote:
The HaskellWiki repertory it under "primes" and it's at least 170 times faster than the extra-naive sieve you used in your comparison on my computer... (I have some doubts on the accuracy of the benchmark and System.Time at this level of precision, especially on Windows)
What is the URL for that page? I can only find this page:
http://www.haskell.org/hawiki/SieveOfEratosthenes
which seems to consist largely of misinformation from newbies...
-- Dr Jon D Harrop, Flying Frog Consultancy Ltd.
The OCaml Journal
http://www.ffconsultancy.com/products/ocaml_journal/?e

On Sun, Jul 15, 2007 at 03:39:27AM +0100, Jon Harrop wrote:
On Sunday 15 July 2007 00:31:12 Chaddaï Fouché wrote:
The HaskellWiki repertory it under "primes" and it's at least 170 times faster than the extra-naive sieve you used in your comparison on my computer... (I have some doubts on the accuracy of the benchmark and System.Time at this level of precision, especially on Windows)
What is the URL for that page?
I can only find this page:
http://www.haskell.org/hawiki/SieveOfEratosthenes
which seems to consist largely of misinformation from newbies...
That's not even on the current wiki :) (don't feel bad, Google still hasn't indexed the whole site yet)
http://haskell.org/haskellwiki/Prime_numbers
Stefan

Hello Chaddai, Sunday, July 15, 2007, 3:31:12 AM, you wrote:
Due to the nature of Haskell, it's not so easy to do the same thing (write a C program in Haskell as you wrote a C program in C#), so the conclusion is obviously to Haskell disadvantage.
it's possible to directly rewrite C code in Haskell. unboxed arrays are all that you need to do this
-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
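(A minimal illustration of Bulat's point, my own sketch rather than anything posted in the thread: an unboxed Bool array built purely with accumArray gives the same cross-off-the-multiples structure as the C# loop, without leaving pure code.)

import Data.Array.Unboxed (UArray, accumArray, assocs)

-- mark every multiple m of every p up to sqrt n as composite,
-- then count the indices that were never marked
primesUpTo :: Int -> Int
primesUpTo n = length [ i | (i, False) <- assocs composite ]
  where
    composite :: UArray Int Bool
    composite = accumArray (\_ new -> new) False (2, n)
                  [ (m, True) | p <- [2 .. isqrt n], m <- [p*p, p*p+p .. n] ]
    isqrt = floor . sqrt . fromIntegral

(For example, primesUpTo 200000 evaluates to 17984, matching the counts reported elsewhere in this thread.)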

On Sun, 2007-07-15 at 00:53 +0200, Hugh Perkins wrote:
There's really a tendency in this newsgroup to point people to huge documents, when a small copy and paste would make the answer so much more accessible ;-)
Anyway... so reading through the paper, it looks like its using a priority queue? Which basically is changing the algorithm somewhat compared to the C# version.
Anyway, if you can provide a working Haskell version, I'll be happy to run it. I sortof suspect that if it gave results within 30% of the C# version someone would already have done so ;-)
It's a six page document, and the point was in the first paragraph. Here's part of a giant thread about this that occurred in February: http://www.haskell.org/pipermail/haskell-cafe/2007-February/022854.html There is a zip file with 15 Haskell implementations and timing comparisons to a C version. The author of that email is, incidentally, the author of the paper I referred to.

derek.a.elkins:
On Sun, 2007-07-15 at 00:53 +0200, Hugh Perkins wrote:
There's really a tendency in this newsgroup to point people to huge documents, when a small copy and paste would make the answer so much more accessible ;-)
Anyway... so reading through the paper, it looks like its using a priority queue? Which basically is changing the algorithm somewhat compared to the C# version.
Anyway, if you can provide a working Haskell version, I'll be happy to run it. I sortof suspect that if it gave results within 30% of the C# version someone would already have done so ;-)
It's a six page document, and the point was in the first paragraph.
Here's part of a giant thread about this that occurred in February: http://www.haskell.org/pipermail/haskell-cafe/2007-February/022854.html
There is a zip file with 15 Haskell implementations and timing comparisons to a C version. The author of that email is, incidentally, the author of the paper I referred to.
And those implementations are also in darcs now, http://www.cse.unsw.edu.au/~dons/code/nobench/spectral/primes/ along with some others.

On 7/15/07, Derek Elkins
Ok, so switched to using the Data.Map version from this paper, which looks like a lazy, but real, version of the sieve of Eratosthenes. This does run quite a lot faster, so we're going to run on a sieve of 1000000 to increase the timings a bit (timings on 200000 in C# are a bit inaccurate...). Here are the results:

J:\dev\haskell>ghc -O2 -fglasgow-exts -o Prime2.exe Prime2.hs
J:\dev\haskell>prime2
number of primes: 78493
19.547

J:\dev\test\testperf>csc /nologo primecs.cs
J:\dev\test\testperf>primecs
number of primes: 78498
elapsed time: 0,0625

So, only 300 times faster this time ;-) Here's the Haskell code:

module Main
  where

import IO
import Char
import GHC.Float
import List
import qualified Data.Map as Map
import Control.Monad
import System.Time
import System.Locale

sieve xs = sieve' xs Map.empty
    where
        sieve' []     table = []
        sieve' (x:xs) table = case Map.lookup x table of
            Nothing    -> x : sieve' xs (Map.insert (x*x) [x] table)
            Just facts -> sieve' xs (foldl reinsert (Map.delete x table) facts)
          where
            reinsert table prime = Map.insertWith (++) (x+prime) [prime] table

calculateNumberOfPrimes :: Int -> Int
calculateNumberOfPrimes max = length (sieve [ 2 .. max ])

gettime :: IO ClockTime
gettime = getClockTime

main = do
    starttime <- gettime
    let numberOfPrimes = (calculateNumberOfPrimes 1000000)
    putStrLn( "number of primes: " ++ show( numberOfPrimes ) )
    endtime <- gettime
    let timediff = diffClockTimes endtime starttime
    let secondsfloat = realToFrac( tdSec timediff ) + realToFrac(tdPicosec timediff) / 1000000000000
    putStrLn( show(secondsfloat) )
    return ()

On Saturday 14 July 2007 23:45:57 Derek Elkins wrote:
Wow, that is a really enlightening paper. :-) -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. The OCaml Journal http://www.ffconsultancy.com/products/ocaml_journal/?e

On 14/07/07, Hugh Perkins
As I say, I'm not a Haskell expert, so feel free to provide a better implementation.
It's not really about providing a "better implementation", as that would imply that the algorithms are the same, which they are not. You're comparing two quite different algorithms (see the paper to which you've already been pointed). You need to specify what you actually want to compare. Maybe you need to write a C# version of the Haskell version?

using System;
using System.Collections.Generic;

namespace ConsoleApplication1
{
    class Program
    {
        static IEnumerable<int> FromTo(int from, int to)
        {
            for (int i = from; i <= to; ++i) { yield return i; }
        }

        static IEnumerable<int> Drop(int x, IEnumerable<int> list)
        {
            IEnumerator<int> it = list.GetEnumerator();
            for (int i = 0; i < x; ++i)
            {
                if (!it.MoveNext()) { yield break; }
            }
            while (it.MoveNext()) { yield return it.Current; }
        }

        static IEnumerable<int> FilterDivisible(int x, IEnumerable<int> list)
        {
            foreach (int i in list)
            {
                if (i % x != 0) { yield return i; }
            }
        }

        static IEnumerable<int> Sieve(IEnumerable<int> list)
        {
            IEnumerator<int> it = list.GetEnumerator();
            if (!it.MoveNext()) { yield break; }
            int p = it.Current;
            yield return p;
            foreach (int i in Sieve(FilterDivisible(p, Drop(1, list))))
            {
                yield return i;
            }
        }

        static IEnumerable<int> Primes(int max)
        {
            return Sieve(FromTo(2, max));
        }

        static int MaxPrime(int max)
        {
            int count = 0;
            foreach (int i in Primes(max)) { ++count; }
            return count;
        }

        static void Main(string[] args)
        {
            Console.WriteLine("Count {0}", MaxPrime(17984));
            Console.ReadLine();
        }
    }
}

And yes, this one uses lazy streams just like the Haskell version, and it also uses the same algorithm, so it should be a much more fair comparison. I gave up waiting for this one to terminate on large inputs, btw. :-)

So yeah, apples to apples, the difference is hardly orders of magnitude. But when you construct a micro-benchmark where you write one of the versions in a way which essentially maps directly to hardware, you're going to see that version be a lot faster. If you actually *use* the languages a bit more fairly (e.g. use lazy streams in C#, since you used them in Haskell -- or by all means, write a Haskell version using unboxed ST arrays) you'll see something a bit more useful. And that's if you manage to use the same algorithm for both languages, of course.

-- Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862

Sebastian,

Why would I write a slow, complicated algorithm in C#? I'm not making these comparisons for some academic paper, I'm trying to get a feel for how the languages run in practice. And really, in practice I'm never going to write a prime algorithm using merge and so on, I'd just use the original naive Haskell algorithm, which runs 500 times slower (at least) than my naive C# algo. I'm just allowing you guys to optimize to see how close you can get.

Note that the C# algo is not something created by C# experts, it's just something I hacked together in like 2 minutes.

hughperkins:
Sebastian, Why would I write a slow, complicated algorithm in C#? I'm not making these comparisons for some academic paper, I'm trying to get a feel for how the languages run in practice. And really in practice, I'm never going to write a prime algorithm using merge and so on, I'd just use the original naive Haskell algorithm, that runs 500 times slower (at least) than my naive C# algo. I'm just allowing you guys to optimize to see how close you can get. Note that the C# algo is not something created by C# experts, it's just something I hacked together in like 2 minutes.
For fast, mutable prime sieves, see the shootout:
http://shootout.alioth.debian.org/gp4/benchmark.php?test=nsievebits&lang=ghc&id=4
(a bit sieve) is pretty fast, 1.8x highly optimised C, and also readable, for what it does:

import Data.Array.IO
import Data.Array.Base
import System
import Text.Printf

main = do
    n <- getArgs >>= readIO . head :: IO Int
    mapM_ (sieve . (10000 *) . (2 ^)) [n, n-1, n-2]

sieve n = do
    a <- newArray (2,n) True :: IO (IOUArray Int Bool) -- an array of Bool
    r <- go a n 2 0
    printf "Primes up to %8d %8d\n" (n::Int) (r::Int) :: IO ()

go !a !m !n !c
    | n == m    = return c
    | otherwise = do
        e <- unsafeRead a n
        if e then let loop !j | j <= m    = unsafeWrite a j False >> loop (j+n)
                              | otherwise = go a m (n+1) (c+1)
                  in loop (n+n)
             else go a m (n+1) c

So perhaps just code up a mutable array version the same as for C#?
-- Don

On 7/15/07, Donald Bruce Stewart
[snip] unsafeWrite[snip] [snip]unsafeRead[snip]
Hi Donald, the idea is to use this for operational code, so avoiding unsafe operations is preferable ;-) You'll note that the C# version is not using unsafe operations, although to be fair that's because they worked out slower than the safe version ;-)

Also, the whole algorithm is bound to the IO Monad, which is something I'd like to avoid if possible, since my entire interest in Haskell stems from the possibilities of running programs easily on 1-megacore processors in the future.

Initial compilation gives an error:

PrimeDonald.hs:18:3: Illegal bang-pattern (use -fbang-patterns)

Ok, I'm ok with compiler patches, that's half the point of such a competition really, to encourage compiler optimization. Anyway, congrats, you got nearly as fast as C#!

J:\dev\haskell>ghc -fglasgow-exts -fbang-patterns -O2 -o PrimeDonald.exe PrimeDonald.hs
J:\dev\haskell>primedonald
number of primes: 664579
Elapsed time: 0.797

J:\dev\test\testperf>erase primecs.exe
J:\dev\test\testperf>gmcs primecs.cs
J:\dev\test\testperf>mono primecs.exe
number of primes: 664579
elapsed time: 0,719

J:\dev\test\testperf>erase primecs.exe
J:\dev\test\testperf>csc /nologo primecs.cs
J:\dev\test\testperf>primecs
number of primes: 664579
elapsed time: 0,6875

Here is the Haskell code:

module Main
  where

import Data.Array.IO
import Data.Array.Base
import System
import System.Time
import System.Locale

calculateNumberOfPrimes n = do
    a <- newArray (2,n) True :: IO (IOUArray Int Bool) -- an array of Bool
    go a n 2 0

go !a !m !n !c
    | n == m    = return c
    | otherwise = do
        e <- unsafeRead a n
        if e then let loop !j | j <= m    = unsafeWrite a j False >> loop (j+n)
                              | otherwise = go a m (n+1) (c+1)
                  in loop (n+n)
             else go a m (n+1) c

gettime :: IO ClockTime
gettime = getClockTime

main = do
    starttime <- gettime
    numberOfPrimes <- (calculateNumberOfPrimes 10000000)
    putStrLn( "number of primes: " ++ show( numberOfPrimes ) )
    endtime <- gettime
    let timediff = diffClockTimes endtime starttime
    let secondsfloat = realToFrac( tdSec timediff ) + realToFrac(tdPicosec timediff) / 1000000000000
    putStrLn( "Elapsed time: " ++ show(secondsfloat) )
    return ()

hughperkins:
On 7/15/07, Donald Bruce Stewart <[1]dons@cse.unsw.edu.au> wrote:
[snip] unsafeWrite [snip] [snip] unsafeRead [snip]
Hi Donald, the idea is to use this for operational code, so avoiding unsafe operations is preferable ;-) You'll note that the C# version is not using unsafe operations, although to be fair that's because they worked out slower than the safe version ;-)
"unsafe"' here just means direct array indexing. Same as the other languages. Haskell's 'unsafe' is a little more paranoid that other languages.
Also, the whole algorithm is bound to the IO Monad, which is something I'd like to avoid if possible, since my entire interest in Haskell stems from the possibilities of running programs easily on 1-megacore processors in the future.
You're deciding that on a cache-thrashing primes benchmark? Since the goal is to flip bits very quickly in the cache, you could localise this to the ST monad then, as its perfectly pure on the outside.
Anyway, congrats you got nearly as fast as C#!
Try the other version I just sent. This one trashes cache lines needlessly. What C# version are you using, by the way? (So I can check if it does any tricks.)
-- Don

On 15/07/07, Donald Bruce Stewart
hughperkins:
On 7/15/07, Donald Bruce Stewart <[1]dons@cse.unsw.edu.au> wrote:
[snip] unsafeWrite [snip] [snip] unsafeRead [snip]
Hi Donald, the idea is to use this for operational code, so avoiding unsafe operations is preferable ;-) You'll note that the C# version is not using unsafe operations, although to be fair that's because they worked out slower than the safe version ;-)
"unsafe"' here just means direct array indexing. Same as the other languages. Haskell's 'unsafe' is a little more paranoid that other languages.
Also, the whole algorithm is bound to the IO Monad, which is something I'd like to avoid if possible, since my entire interest in Haskell stems from the possibilities of running programs easily on 1-megacore processors in the future.
You're deciding that on a cache-thrashing primes benchmark?
Since the goal is to flip bits very quickly in the cache, you could localise this to the ST monad then, as its perfectly pure on the outside.
Yep:

{-# OPTIONS -O2 -fbang-patterns #-}

import Control.Monad.ST
import Data.Array.ST
import Data.Array.Base
import System
import Control.Monad
import Data.Bits

main = print ( pureSieve 17984 )

pureSieve :: Int -> Int
pureSieve n = runST( sieve n )

sieve n = do
    a <- newArray (2,n) True :: ST s (STUArray s Int Bool) -- an array of Bool
    go a n 2 0

go !a !m !n !c
    | n == m    = return c
    | otherwise = do
        e <- unsafeRead a n
        if e then let loop !j
                        | j < m = do
                            x <- unsafeRead a j
                            when x (unsafeWrite a j False)
                            loop (j+n)
                        | otherwise = go a m (n+1) (c+1)
                  in loop (n `shiftL` 1)
             else go a m (n+1) c

-- Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862

On 7/15/07, Sebastian Sylvan
"unsafe"' here just means direct array indexing. Same as the other languages. Haskell's 'unsafe' is a little more paranoid that other languages.
Yes, I was kindof hoping it was something like that. Cool :-)
Since the goal is to flip bits very quickly in the cache, you could localise this to the ST monad then, as its perfectly pure on the outside.
Ok, awesome!

J:\dev\haskell>ghc -fglasgow-exts -O2 -o PrimeDonald2.exe PrimeDonald2.hs
J:\dev\haskell>primedonald2
number of primes: 664579
Elapsed time: 0.7030000000000001

{-# OPTIONS -O2 -fbang-patterns #-}

import Control.Monad.ST
import Data.Array.ST
import Data.Array.Base
import System
import Control.Monad
import Data.Bits
import System.Time
import System.Locale

pureSieve :: Int -> Int
pureSieve n = runST( sieve n )

sieve n = do
    a <- newArray (2,n) True :: ST s (STUArray s Int Bool) -- an array of Bool
    go a n 2 0

go !a !m !n !c
    | n == m    = return c
    | otherwise = do
        e <- unsafeRead a n
        if e then let loop !j
                        | j < m = do
                            x <- unsafeRead a j
                            when x (unsafeWrite a j False)
                            loop (j+n)
                        | otherwise = go a m (n+1) (c+1)
                  in loop (n `shiftL` 1)
             else go a m (n+1) c

calculateNumberOfPrimes :: Int -> Int
calculateNumberOfPrimes = pureSieve

gettime :: IO ClockTime
gettime = getClockTime

main = do
    starttime <- gettime
    let numberOfPrimes = (calculateNumberOfPrimes 10000000)
    putStrLn( "number of primes: " ++ show( numberOfPrimes ) )
    endtime <- gettime
    let timediff = diffClockTimes endtime starttime
    let secondsfloat = realToFrac( tdSec timediff ) + realToFrac(tdPicosec timediff) / 1000000000000
    putStrLn( "Elapsed time: " ++ show(secondsfloat) )
    return ()

On 7/15/07, Hugh Perkins
On 7/15/07, Sebastian Sylvan
wrote: "unsafe"' here just means direct array indexing. Same as the other languages. Haskell's 'unsafe' is a little more paranoid that other languages.
Yes, I was kindof hoping it was something like that. Cool :-)
Errr ... wait... when you say "direct array indexing", do you mean that this does or doesn't continue to do bounds checking on the array access? I could imagine that it is "unsafe" simply because direct array indexing prevents mathematically proving that the program won't crash (?), or it could be unsafe in the C++ way, where going off the end of the array corrupts your stack/heap?

On 15/07/07, Hugh Perkins
On 7/15/07, Hugh Perkins
wrote: On 7/15/07, Sebastian Sylvan
wrote: "unsafe"' here just means direct array indexing. Same as the other languages. Haskell's 'unsafe' is a little more paranoid that other languages.
Yes, I was kindof hoping it was something like that. Cool :-)
Errr ... wait... when you say "direct array indexing", you mean that this does or doesnt continue to do bounds checking on the array access?
I could imagine that it is "unsafe" simply because direct array indexing prevents mathematically proving that the program wont crash (?), or it could be unsafe in the C++ way, where going off the end of the array corrupts your stack/heap?
Well, *I* didn't say it, but yes. Unsafe disables bounds checking (which in this case is safe). I think you can just stick an unsafe{} in the C# version to disable them.
-- Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862
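(For concreteness, the two flavours being discussed correspond to two read functions in GHC's array libraries; a small sketch of my own, not from the thread. readArray checks the index against the array bounds and raises an error on violation; unsafeRead skips the check.)

import Data.Array.IO (IOUArray, newArray, readArray)
import Data.Array.Base (unsafeRead)

main :: IO ()
main = do
    a <- newArray (0, 9) (0 :: Int) :: IO (IOUArray Int Int)
    x <- readArray a 5    -- bounds-checked: readArray a 100 would raise an error
    y <- unsafeRead a 5   -- unchecked: unsafeRead a 100 would read past the buffer
    print (x + y)

(Note that unsafeRead takes the zero-based offset into the array rather than the logical index; with bounds starting at 0, as here, the two coincide.)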

On 7/15/07, Sebastian Sylvan
Well, *I* didn't say it but yes. Unsafe disables bounds checking (which in this case is safe). I think you can just stick an unsafe{} in the C# version to disable them.
Oh, well, that's not good. Yes, you can use unsafe in C# too, but you know, there are reasons why we use C# instead of C++, and one of those reasons is precisely to avoid stack/heap corruption.

Hey, I just realized I can shave off another 30% in C# ;-) So now the timings become:

Safe Haskell
=========
J:\dev\haskell>ghc -O2 -o primechaddai.exe PrimeChaddai.hs
J:\dev\haskell>primechaddai
number of primes: 664579
Elapsed time: 26.234

Unsafe Haskell
===========
J:\dev\haskell>ghc -fglasgow-exts -O2 -o PrimeDonald2.exe PrimeDonald2.hs
J:\dev\haskell>primedonald2
number of primes: 664579
Elapsed time: 0.7030000000000001

mono
====
J:\dev\test\testperf>erase primecs.exe & gmcs primecs.cs
J:\dev\test\testperf>mono primecs.exe
number of primes: 664579
elapsed time: 0,453

Microsoft.Net
==========
J:\dev\test\testperf>erase primecs.exe & csc /nologo primecs.cs
J:\dev\test\testperf>primecs
number of primes: 664579
elapsed time: 0,421875

Here's the fabulously complicated ;-) new C# algorithm:

public int CalculateNumberOfPrimes( int maxprime )
{
    bool[] IsNotPrime = new bool[ maxprime ];
    int NumberOfPrimes = 1;
    int squarecutoff = (Int32)Math.Sqrt( maxprime ) + 1;
    for( int i = 3; i < maxprime; i += 2 )
    {
        if( !IsNotPrime[i] )
        {
            NumberOfPrimes++;
            if( i < squarecutoff )
            {
                for( int j = ( i << 1 ); j < maxprime; j += i )
                {
                    if( !IsNotPrime[j] )
                        IsNotPrime[j] = true;
                }
            }
        }
    }
    return NumberOfPrimes;
}

(basically, we only cross off multiples of primes up to the square root of the size of the grid. I think this is a standard part of the standard sieve of Eratosthenes algorithm, so like I say, the C# version is seriously unoptimized ;-) )

On 15/07/07, Hugh Perkins
Hey, I just realized I can shave off another 30% in C# ;-)
So now the timings become:
Safe Haskell =========
J:\dev\haskell>ghc -O2 -o primechaddai.exe PrimeChaddai.hs
J:\dev\haskell>primechaddai number of primes: 664579 Elapsed time: 26.234
Unsafe Haskell ===========
J:\dev\haskell>ghc -fglasgow-exts -O2 -o PrimeDonald2.exe PrimeDonald2.hs
J:\dev\haskell>primedonald2 number of primes: 664579 Elapsed time: 0.7030000000000001
mono ====
J:\dev\test\testperf>erase primecs.exe & gmcs primecs.cs
J:\dev\test\testperf>mono primecs.exe number of primes: 664579 elapsed time: 0,453
Microsoft.Net ==========
J:\dev\test\testperf>erase primecs.exe & csc /nologo primecs.cs
J:\dev\test\testperf>primecs number of primes: 664579 elapsed time: 0,421875
Here's the fabulously complicated ;-) new C# algorithm:
public int CalculateNumberOfPrimes( int maxprime )
{
    bool[] IsNotPrime = new bool[ maxprime ];
    int NumberOfPrimes = 1;
    int squarecutoff = (Int32)Math.Sqrt( maxprime ) + 1;
    for( int i = 3; i < maxprime; i += 2 )
    {
        if( !IsNotPrime[i] )
        {
            NumberOfPrimes++;
            if( i < squarecutoff )
            {
                for( int j = ( i << 1 ); j < maxprime; j += i )
                {
                    if( !IsNotPrime[j] )
                        IsNotPrime[j] = true;
                }
            }
        }
    }
    return NumberOfPrimes;
}
(basically, we only cross off multiples of primes up to the square root of the size of the grid. I think this is a standard part of the standard sieve of Eratosthenes algorithm, so like I say, the C# version is seriously unoptimized ;-) )
I don't see what the point of this is. Why do timings of different algorithms? Of course you could do the same optimization in any language, so why do you think it's relevant to change the algorithm in *one* of the languages and then make comparisons?

BTW, I think a better optimization is to start the inner loop at the square of the current prime, since any numbers smaller than that would've already been crossed off by the other prime factors (which must be smaller than the current prime). Nevertheless, there's no point doing optimizations to this unless you do them to all implementations!

-- Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862
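(Concretely, the change being suggested would look something like this as a drop-in replacement for the go loop of Don's earlier sieve; a sketch of my own, not code posted in the thread, and ignoring the Int-overflow caveat that comes up a few messages later:)

go !a !m !n !c
    | n == m    = return c
    | otherwise = do
        e <- unsafeRead a n
        if e then let loop !j | j <= m    = unsafeWrite a j False >> loop (j+n)
                              | otherwise = go a m (n+1) (c+1)
                  -- start crossing off at n*n rather than n+n: every composite
                  -- below n*n already has a prime factor smaller than n, so it
                  -- has been marked False on an earlier pass
                  in loop (n*n)
             else go a m (n+1) c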

On 7/15/07, Sebastian Sylvan
I don't see what the point of this is? Why do timings of different algorithms? Of course you could do the same optimization in any language, so why do you think it's relevant to change the algorithm in *one* of the languages and then make comparisons?
Sebastien, Well, as you yourself said, different languages work differently, so there's no point in trying to directly implement the C# algorithm in Haskell: it just wont work, or it will be slow. The same works from Haskell to C#. So, you guys are Haskell experts, show the world what Haskell is capable of. Come up with algorithms to calculate prime numbers in Haskell that are: - safe - easy to understand/read/maintain - fast I'll ditch the "sieve of arastophenes" rule if you like. Use any algorithm you like. Now that is fair I think? I in turn will do my part to keep the C# version a step ahead of the Haskell version. It seems this is pretty easy :-D

On 15/07/07, Hugh Perkins
On 7/15/07, Sebastian Sylvan
wrote: I don't see what the point of this is? Why do timings of different algorithms? Of course you could do the same optimization in any language, so why do you think it's relevant to change the algorithm in *one* of the languages and then make comparisons?
Sebastien,
Well, as you yourself said, different languages work differently, so there's no point in trying to directly implement the C# algorithm in Haskell: it just wont work, or it will be slow. The same works from Haskell to C#.
So, you guys are Haskell experts, show the world what Haskell is capable of. Come up with algorithms to calculate prime numbers in Haskell that are: - safe - easy to understand/read/maintain - fast
I'll ditch the "sieve of Eratosthenes" rule if you like. Use any algorithm you like. Now that is fair, I think?
I in turn will do my part to keep the C# version a step ahead of the Haskell version. It seems this is pretty easy :-D
Try this one then. I removed the unsafe reads... Still, I think your methodology sucks. If you want to compare languages you should implement the same algorithm. Dons implemented a Haskell version of your C++ algorithm, even though it wasn't optimal. He didn't go off and implement some state-of-the-art primes algorithm that was completely different, now did he? If this is about comparing languages, you should compare them fairly.

{-# OPTIONS -O2 -fbang-patterns #-}

import Control.Monad.ST
import Data.Array.ST
import Data.Array.Base
import System
import Control.Monad
import System.Time
import System.Locale

main = do
    starttime <- getClockTime
    let numberOfPrimes = (pureSieve 17984)
    putStrLn( "number of primes: " ++ show( numberOfPrimes ) )
    endtime <- getClockTime
    let timediff = diffClockTimes endtime starttime
    let secondsfloat = realToFrac( tdSec timediff ) + realToFrac(tdPicosec timediff) / 1000000000000
    putStrLn( "Elapsed time: " ++ show(secondsfloat) )
    return ()

pureSieve :: Int -> Int
pureSieve n = runST( sieve n )

sieve n = do
    a <- newArray (2,n) True :: ST s (STUArray s Int Bool) -- an array of Bool
    go a n 2 0

go !a !m !n !c
    | n == m    = return c
    | otherwise = do
        e <- readArray a n
        if e then let loop !j | j < m     = do
                                    writeArray a j False
                                    loop (j+n)
                              | otherwise = go a m (n+1) (c+1)
                  in loop (n * n)
             else go a m (n+1) c

-- Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862

On 15/07/07, Sebastian Sylvan
On 15/07/07, Hugh Perkins
wrote: On 7/15/07, Sebastian Sylvan
wrote: I don't see what the point of this is? Why do timings of different algorithms? Of course you could do the same optimization in any language, so why do you think it's relevant to change the algorithm in *one* of the languages and then make comparisons?
Sebastien,
Well, as you yourself said, different languages work differently, so there's no point in trying to directly implement the C# algorithm in Haskell: it just wont work, or it will be slow. The same works from Haskell to C#.
So, you guys are Haskell experts, show the world what Haskell is capable of. Come up with algorithms to calculate prime numbers in Haskell that are: - safe - easy to understand/read/maintain - fast
I'll ditch the "sieve of Eratosthenes" rule if you like. Use any algorithm you like. Now that is fair, I think?
I in turn will do my part to keep the C# version a step ahead of the Haskell version. It seems this is pretty easy :-D
Try this one then. I removed the unsafe reads... Still, I think youre methodology sucks. If you want to compare languages you should implement the same algorithm. Dons implemented a Haskell version of your C++ algorithm, even though it wasn't optimal. He didn't go off an implement some state-of-the-art primes algorithm that was completey different now did he? If this is about comparing languages, you should compare them fairly.
{-# OPTIONS -O2 -fbang-patterns #-}

import Control.Monad.ST
import Data.Array.ST
import Data.Array.Base
import System
import Control.Monad
import System.Time
import System.Locale

main = do
    starttime <- getClockTime
    let numberOfPrimes = (pureSieve 17984)
    putStrLn( "number of primes: " ++ show( numberOfPrimes ) )
    endtime <- getClockTime
    let timediff = diffClockTimes endtime starttime
    let secondsfloat = realToFrac( tdSec timediff ) + realToFrac(tdPicosec timediff) / 1000000000000
    putStrLn( "Elapsed time: " ++ show(secondsfloat) )
    return ()

pureSieve :: Int -> Int
pureSieve n = runST( sieve n )

sieve n = do
    a <- newArray (2,n) True :: ST s (STUArray s Int Bool) -- an array of Bool
    go a n 2 0

go !a !m !n !c
    | n == m    = return c
    | otherwise = do
        e <- readArray a n
        if e then let loop !j | j < m     = do
                                    writeArray a j False
                                    loop (j+n)
                              | otherwise = go a m (n+1) (c+1)
                  in loop (n * n)
             else go a m (n+1) c

-- Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862
This will fail for large inputs, btw, since there are only 32 bits in an Int. This version makes sure it doesn't try to square something which would cause an Int overflow:

{-# OPTIONS -O2 -fbang-patterns #-}

import Control.Monad.ST
import Data.Array.ST
import System
import Control.Monad
import Data.Bits
import System.Time
import System.Locale

main = do
    starttime <- getClockTime
    let numberOfPrimes = (pureSieve 17984)
    putStrLn( "number of primes: " ++ show( numberOfPrimes ) )
    endtime <- getClockTime
    let timediff = diffClockTimes endtime starttime
    let secondsfloat = realToFrac( tdSec timediff ) + realToFrac(tdPicosec timediff) / 1000000000000
    putStrLn( "Elapsed time: " ++ show(secondsfloat) )
    return ()

pureSieve :: Int -> Int
pureSieve n = runST( sieve n )

sieve n = do
    a <- newArray (2,n) True :: ST s (STUArray s Int Bool) -- an array of Bool
    go a n 2 0

go !a !m !n !c
    | n == m    = return c
    | otherwise = do
        e <- readArray a n
        if e then let loop !j | j < m     = do
                                    writeArray a j False
                                    loop (j+n)
                              | otherwise = go a m (n+1) (c+1)
                  -- 46340 is the largest Int whose square fits in 32 bits
                  -- (46340^2 = 2147395600 < 2^31 - 1)
                  in loop ( if n < 46340 then n * n else n `shiftL` 1)
             else go a m (n+1) c

My GHC compiler is broken, I only have GHCi, but this is about twice as fast for me as the previous version you benchmarked, btw.

-- Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862

On 7/15/07, Sebastian Sylvan
My GHC compiler is broken, I only have GHCi, but this is about twice for me as fast as the previous version you benchmarked, btw.
Hi Sebastian,

Here are the results:

Haskell (Safe Haskell, right?)
======
J:\dev\haskell>ghc -fglasgow-exts -O2 -o PrimeSebastian.exe PrimeSebastian.hs
J:\dev\haskell>primesebastian
number of primes: 664579
Elapsed time: 1.375

mono
====
J:\dev\test\testperf>erase primecs.exe & gmcs primecs.cs
J:\dev\test\testperf>mono primecs.exe
number of primes: 664579
elapsed time: 0,438

Microsoft .Net
==========
J:\dev\test\testperf>erase primecs.exe & csc /nologo primecs.cs
J:\dev\test\testperf>primecs
number of primes: 664579
elapsed time: 0,390625

(I incorporated your suggestion for the inner loop, which shaved off another 20% or so in the C# classes)

On Jul 15, 2007, at 7:53 , Sebastian Sylvan wrote:
Still, I think youre methodology sucks. If you want to compare languages you should implement the same algorithm. (...) If this is about comparing languages, you should compare them fairly.
But is it comparing them fairly if you use an algorithm which favors direct naive procedural implementation (i.e. favors C#), or one which favors a lazy mathematical formalism (i.e. Haskell)? Seems to me you get the best picture by picking two algorithms, one which favors C# and one which favors Haskell, and implementing both in both languages.
-- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH

hughperkins:
On 7/15/07, Sebastian Sylvan <[1]sebastian.sylvan@gmail.com> wrote:
I don't see what the point of this is? Why do timings of different algorithms? Of course you could do the same optimization in any language, so why do you think it's relevant to change the algorithm in *one* of the languages and then make comparisons?
Sebastien, Well, as you yourself said, different languages work differently, so there's no point in trying to directly implement the C# algorithm in Haskell: it just wont work, or
In this case it is fine. You're setting bits in the cache. Please use the same algorithm, or any conclusions are meaningless.
-- Don

On 7/15/07, Donald Bruce Stewart
In this case it is fine. You're setting bits in the cache. Please use the same algorithm, or any conclusions are meaningless.
No, I'm counting prime numbers. Somewhat faster, it seems ;-)

Let's put this into the real world a moment. You get a job, or you're at your job, you spend several weeks writing some super-dooper program, telling everyone how awesome it is because it's Haskell and it's l33t! You run it, and it takes 24 hours to run. It's ok, it's Haskell.

Some guy with long hair and no degree comes along, and rewrites your code in C#. Takes him like 30 minutes, because he doesn't have to optimize it, it's done automatically. His program runs in 25 minutes (60 times faster; see the benchmarks above to note that this is realistic).

Guess who gets fired?

On 15/07/07, Hugh Perkins
On 7/15/07, Donald Bruce Stewart
wrote: In this case it is fine. You're setting bits in the cache. Please use the same algorithm, or any conclusions are meaningless.
No, I'm counting prime numbers. Somewhat faster it seems ;-)
Let's put this into the real world a moment. You get a job, or you're at your job, you spend several weeks writing some super-dooper program, telling everyone how awesome it is because it's Haskell and it's l33t!
You run it, and it takes 24 hours to run. It's ok, it's Haskell.
Some guy with long hair and no degree comes along, and rewrites your code in C#. Takes him like 30 minutes because he doesnt have to optimize it, it's done automatically.
His program runs in 25 minutes (60 times faster, see the benchmarks above to note that this is realistic).
Guess who gets fired?
Oh come on! If you really want to calculate how many primes there are in a 32-bit int you would precalculate it in a table. This *isn't* a real-world application! Stop pretending it is! If you want to compare various features of the languages to see how they fare, you need to use fair comparisons!

So, two options if you are really interested in knowing how fast Haskell is compared to C# (and I think it's becoming clear that you're not, so why bother?):

* Write a real application. Not some micro-benchmark that is essentially assembly-level programming. Btw, if this sieve program is representative of how most of your C# programs look, I'd wager you are the one who will get fired! This program isn't C#. It doesn't use any of the neat C# features. It's essentially C, in C# syntax. How does that say anything about the /general/ speed of C# versus Haskell?

* Try micro-benchmarks of various different features. For example, write a sieve version using lazy streams to see how fast lazy streams are. As I've demonstrated, Haskell is quite a bit faster there (see my C# implementation of the original Haskell code). The important thing to note here is that you are comparing *specific* features of the language. Changing the algorithm in one language but not the other is cheating.

For the latter you can always consult the shootout:
http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=csharp&lang2=ghc
Looks like Haskell doesn't fare too poorly against C#.

-- Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862

On 7/15/07, Sebastian Sylvan

On 15/07/07, Hugh Perkins
On 7/15/07, Sebastian Sylvan
wrote: [Argh, no way can a Microsoft language be better than Haskell!!!!]
Well, if you scan higher in the thread, there are two benchmarks. The prime numbers benchmark was a simple 10-minute benchmark to compare the computational speed (something which Haskell ought to do well in?)
The other benchmark is OpenGl. I'll try that one soon.
Nice, so because I think your methodology is undisciplined and tells you nothing useful, I must be biased against Microsoft products, right? Btw, guess what company name is on my paycheck every month? That's right, Microsoft. I'm not biased in any way against Microsoft products (since I make them myself!).

Nobody is claiming that Haskell is the best language for writing tight inner assembly-like loops. So what's the point in making those comparisons? How about you write some programs that do algebraic manipulations of data structures? Something a bit more representative of real-world code, and a bit more high level? Or how about you just take a look at the shootout I linked you to? They have far more benchmarks than you'll have time to implement, I'm sure.

-- Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862

On Sun, 15 Jul 2007 14:15:03 +0200, you wrote:
...a simple 10 minute benchmark to compare the computational speed...
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." - Donald Knuth (paraphrasing Tony Hoare) Haskell is about improving software quality. A meaningful benchmark would be one that compares end-to-end software development lifecycles, including not only runtime performance, but also development costs, debugging and maintenance time, reliability, etc. Steve Schafer Fenestra Technologies Corp. http://www.fenestra.com/

hughperkins:
Brandon wrote:
Seems to me you get the best picture by picking two algorithms, one which favors C# and one which favors Haskell, and implementing both in both languages.
Sounds good to me. What is a good problem that favors Haskell?
NO. We just *did* this.

Firstly, to compare GHC Haskell and C#, look at the shootout:
http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=ghc&lang2=csharp
C# does better than I expected! 2.6x faster than one Haskell program, usually 2-4x slower. Really poor at lightweight concurrency. Don't do toy benchmarks here, fix the C# ones on the shootout!

Secondly, we just did this for prime sieves:

* imperative bit sieves on this list, in C# and Haskell, roughly identical runtimes, though Hugh didn't benchmark the fastest Haskell ones. This is to be expected; every compiled language runs into the cache on this benchmark anyway, see here:
http://shootout.alioth.debian.org/gp4/benchmark.php?test=nsieve&lang=all

* lazy sieves, where C# was 100x slower than the naive Haskell implementation. That's the real story here: laziness is just hard and painful in C#.

However, if you're keen, and agreeing to implement the same algorithm on both systems, I'd have a go in C# at 'chameneos', a concurrency benchmark,
http://shootout.alioth.debian.org/gp4/benchmark.php?test=chameneos&lang=all
or maybe 'pidigits', a lazy pi generator,
http://shootout.alioth.debian.org/gp4/benchmark.php?test=pidigits&lang=ghc&id=0
Should be a challenge in C#.

-- Don

Hello Donald, Sunday, July 15, 2007, 8:22:33 PM, you wrote:
usually 2-4x slower. Really poor at lightweight concurrency.
both systems, I'd have a go in C# at 'chameneos', a concurrency benchmark,
the shootout contains only two or three tests that compare the quality of code generated by the compiler, rather than bundled libraries (and its maintainer doesn't allow including 3rd-party libs in tests, but can't track inclusion of required libs into a language distribution). in particular, you have selected two tests that relate to the speed of ghc-bundled lightweight-concurrency support and nothing else. it would be foolish to make assumptions about the speed of number-crunching code based on these tests
-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On 7/15/07, Donald Bruce Stewart
However, if you're keen, and agreeing to implement the same algorithm on both systems,
Sorry, the rule is you use what's available in your chosen language, otherwise I have to restrict myself only to use things available in Haskell, which is kind of silly, don't you think?

The issue with the shootout is there's no room for creativity, it's like working for a manager who micro-manages. My ideal testing environment is something like http://topcoder.com , but topcoder doesn't support Haskell at this time (probably because it would lose all the time, so no one would use it). I'd quite like to see a version of topcoder for Haskell, but the shootout is not it.

I'd have a go in C# at 'chameneos', a concurrency benchmark,
http://shootout.alioth.debian.org/gp4/benchmark.php?test=chameneos&lang=all

I'll have a look. The GHC solution seems to be safe, I think, so it seems fair.

On 7/15/07, Hugh Perkins
I'd have a go in C# at 'chameneos', a concurrency
benchmark,
http://shootout.alioth.debian.org/gp4/benchmark.php?test=chameneos&lang=all
Errr, this is kind of a strange problem: the answer is always N * 2?

And there we see why I dislike the shootout ;-)

Just for laughs, here's my solution to Chameneos.

J:\dev\haskell>csc /nologo Chameneos2.cs
J:\dev\haskell>chameneos2
200
elapsed time: 0

Compares quite favorably to the Haskell solution:

J:\dev\haskell>ghc -fglasgow-exts -O2 -o Chameneos.exe Chameneos.hs
J:\dev\haskell>chameneos 2000000
number of primes: ()
Elapsed time: 1.2810000000000001

Think outside the box, people ;-)

using System;

class Chameneos
{
    public void Go(int N)
    {
        Console.WriteLine( N * 2 );
    }
}

class EntryPoint
{
    public static void Main()
    {
        System.DateTime start = System.DateTime.Now;
        new Chameneos().Go(100);
        System.DateTime finish = System.DateTime.Now;
        double time = finish.Subtract( start ).TotalMilliseconds;
        Console.WriteLine( "elapsed time: " + ( time / 1000 ) );
    }
}

Oh wait, hmmm, might have misread the question :-D

Nope, the answer really is 200...
http://shootout.alioth.debian.org/gp4/iofile.php?test=chameneos&lang=all&file=output
On 7/15/07, Hugh Perkins
Oh wait, hmmm, might have misread the question :-D

On 7/15/07, Hugh Perkins
On 7/15/07, Hugh Perkins
wrote: I'd have a go in C# at 'chameneos', a concurrency
benchmark,
http://shootout.alioth.debian.org/gp4/benchmark.php?test=chameneos&lang=all
Errr this is kindof a strange problem, the answer is always N * 2?
And there we see why I dislike shootout ;-)
By the way, I sort of felt guilty for not really creating imaginary chameneos, and there's also the race condition to think about.

Race condition
===========
(or: why is it N * 2)

There is an insane race condition in the question, which really renders the question rather uncool, but let's accept for now that it's ok to have insane race conditions in our programs. There are two possibilities:

1. The insane race condition means that each test run produces a different output, and the question is entirely invalid
2. The race condition has no effect on the output

If 1 is true, the question is entirely invalid. We're going to assume that the question is valid, which means the race condition has no effect on the result. That means that we can imagine the chameneos meeting in any order we like. So we send c1, then c2. That's one meeting, and each chameneos has met the other chameneos. We iterate N times, so c1 and c2 have both met another chameneos N times, giving a total of N * 2. The colors are entirely irrelevant, and do not have any effect on the meetings, because they do not affect whether meetings take place or not.

Imaginary Chameneos
================

The question asks us to imagine N chameneos of different colors meeting. Chameneos do not really exist, they're an imaginary creation. So we have to imagine N imaginary chameneos meeting imaginarily. Rest assured, as I pressed the "Enter" key I did duly imagine the 100 chameneos in many colors all meeting and changing colors and becoming faded.

Oh, you want the computer to imagine the imaginary chameneos meeting and changing colors? Well now, a computer doesn't really imagine anything, we ascribe meaning to arbitrary symbols in our code. So, just to put your mind at rest, the function "Chameneos.Go( int N )" in my code represents N chameneos in many colors meeting and changing colors, so to the extent that a computer can imagine things, the computer really did imagine imaginary chameneos imaginarily meeting :-D

On 15/07/07, Hugh Perkins
On 7/15/07, Donald Bruce Stewart
wrote: However, if you're keen, and agreeing to implement the same algorithm on both systems,
Sorry, the rule is you use what's available in your chosen language, otherwise I have to restrict myself only to use things available in Haskell, which is kindof silly dont you think?
Nobody is saying anything different. We're just saying that you have to implement the same thing in both languages. Don't compare insertion sort with merge sort, for example; compare insertion sort with insertion sort and merge sort with merge sort. If you don't, any differences you detect have nothing to do with the language itself, and are as such completely useless! If you have a neat way of doing something due to some language feature, then do use it (see lazy lists in Haskell, compared to the awkward lazy streams in C#, or imperative array updates in C# compared to the somewhat awkward readArray/writeArray, etc.)! But use the same algorithm if you want to draw any valid conclusions!

As we've demonstrated, there's nothing stopping you from writing imperative "C-like" algorithms in Haskell (just like C#), and there certainly wasn't any major performance difference (did you even benchmark Dons' latest entry? Not the Template Haskell one, but the one before that? If you want, you can replace the unsafeRead/unsafeWrite in the same way I did in mine -- I'd still like to see how it fares with the better cache performance and the other optimizations that you had in the C# version).

If you want to compare two languages, then implement the same algorithm in each. Don't implement a naive and elegant, or a flexible and general, algorithm in one of the languages, and then compare it with a completely different algorithm in the other language (written specifically for the benchmark itself, rather than as a general/flexible algorithm). It just doesn't give you any useful data, and you'd be wasting your time, and we wouldn't want that, would we?

I think dons summed it up nicely. We tried both approaches, lazy/general and imperative/fast. In the former version C# was horrible compared to Haskell, in the latter the performance was about the same (again, please benchmark Dons' latest entry if you haven't). Hardly a resounding victory for C#!
The issue with the shootout is there's no room for creativity, it's like working for a manager who micro-manages.
I thought the point wasn't to compare programmers' creativity, but to compare languages?
-- Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862

On 7/15/07, Sebastian Sylvan
I thought the point wasn't to compare programmer's creativity, but to compare languages?
Sebastian, you can't directly compare languages, you can only compare the results of a pairing between developers and those languages. There's no absolute way to make the comparison, you just have to average out, over scenarios that are relevant to whatever you're doing. If we assume that Haskell programmers are at least as creative as C# programmers - and most Haskell programmers seem to take this as written ;-) - then this should balance out ok.

On 15/07/07, Hugh Perkins
On 7/15/07, Sebastian Sylvan
wrote: I thought the point wasn't to compare programmer's creativity, but to compare languages?
Sebastian, you cant directly compare languages, you can only compare the results of a pairing between developers and those languages.
Sure you can. Keep the programmer and the algorithm constant, and swap out the languages. Or have multiple programmers collaborating/competing on implementing the same algorithm on both sides, converging on an "optimum" for each language, and compare the results (this is what the shootout does).

Kudos on managing to completely sidestep the two suggested benchmarks by complaining that one of them is imaginary. Because, you know, your primes program was such a great example of a real-world application. You don't think that multiple agents interacting in a concurrent setting is representative of real programs? It is, and by simplifying it down to the core problem, you can test the "difficult bit" far more easily than if you were to require everyone to write a full-on telecom operating system, or any other application that's concurrent in the same style, for each language you want to compare.

-- Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862

On 7/15/07, Sebastian Sylvan
You don't think that multiple agents interacting in a concurrent setting is representative for real programs? It is, and by simplifying it down to the core problem, you can test the "difficult bit" far more easily than if you were to require everyone to write a full-on telecom operatings sytem, or any other application that's concurrent in the same style, for each language you want to compare.
Yes, actually that's exactly the problem I have at work that I'm looking for a solution for.

On 15/07/07, Hugh Perkins
On 7/15/07, Sebastian Sylvan
wrote: Hi Sebastian,
There are literally thousands of problems at http://topcoder.com. I'm totally fine with using any of these as a benchmark.
Can you find one that shows off the strengths of Haskell?
Can you find me a list of the contests? I find that site impossible to navigate efficiently.
Roughly speaking, the strengths of Haskell are mainly increased productivity, safety, and reliability. Not something you can test for by measuring performance of finished solutions (who knows how long the C version took to create?). However, in terms of performance Haskell is pretty good at laziness, algebraic manipulations of (possibly infinite) data structures, and concurrency (especially if you require atomic operations on shared data - since you can use STM rather than locks, which also gives you exception safety for free). Also, I believe you already have two suggested benchmarks that you could try, so why not start with those since they already have implementations in lots of languages, including Haskell and C#?
You don't think that multiple agents interacting in a concurrent setting is representative for real programs? It is, and by simplifying it down to the core problem, you can test the "difficult bit" far more easily than if you were to require everyone to write a full-on telecom operating system, or any other application that's concurrent in the same style, for each language you want to compare.
Yes, actually that's exactly the problem I have at work that I'm looking for a solution for.
Okay, so what's the problem then? Why not implement this one very simple application that tests just this thing and see what happens? In fact, there already is a C# implementation for it, if you can't find anything wrong with it, then don't you have the information you need? If you can find something wrong with it, fix it (and submit the fix!), and see how the improved version fares (remember, you can't change the algorithm as it would void the comparison!). Point is, you have at least two suggested problems that you could try out, and you now say that one of them is even a great fit for your specific problem domain, so get cracking and you'll have your answer shortly! -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862
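As a taster, the STM style referred to above looks like this - a minimal sketch, not code from any of the benchmarks. The whole block runs atomically; a conflicting transaction causes a retry, and `check` blocks until the condition holds:

import Control.Concurrent.STM

-- Move an amount between two shared counters without explicit locks.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to amount = do
    b <- readTVar from
    check (b >= amount)            -- blocks (retries) until funds suffice
    writeTVar from (b - amount)
    t <- readTVar to
    writeTVar to (t + amount)

main :: IO ()
main = do
    a <- atomically (newTVar 100)
    b <- atomically (newTVar 0)
    atomically (transfer a b 40)
    x <- atomically (readTVar a)
    y <- atomically (readTVar b)
    print (x, y)                   -- (60,40)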

On 7/15/07, Sebastian Sylvan

On 15/07/07, Hugh Perkins
On 7/15/07, Sebastian Sylvan
wrote: [Lots of stuff] Ok, Sebastian, there's such a thing as analysing products along multiple orthogonal axes.
And yet you don't seem willing to do so? Why is that? You asked for problems where Haskell does well, performance-wise, and then you make up ridiculous excuses to avoid accepting the problems presented!
So... benchmarking comes into play to find out how fast a pure computational function actually runs (point 2), and how well OpenGL runs (point 3.4). I didn't try OpenGL yet, I'm not holding my breath, but I'll give it a shot and see what happens.
OpenGL is mostly written in C, so most of the code will likely run the exact same bits. It's just an interface to a C library.
For the pure computation, FWIW my personal conclusions at the moment: - Haskell can get up to C# speeds, by using imperative algorithms - what does this say about lazy algorithms???
It says that lazy algorithms are often slower than low-level imperative algorithms. This is true in both Haskell and C#, as shown. But what do we also know about lazy algorithms? That they are far more modular and flexible! An interesting point is that lazy algorithms were (in this benchmark) two orders of magnitude faster in Haskell than in C#! So maybe if you want to focus on high-level, maintainable, and /correct/ code, you can do so at far less performance cost using Haskell?
- intuitively written, maintainable Haskell algorithms run at far from C# speeds
Actually no, it was about 100x faster! Again, you make the error of comparing a lazy stream-based version with an imperative low-level version. If you want to make claims, then use either of the two apples-to-apples comparisons that we've done here: either compare the high-level lazy versions, or the low-level imperative versions. You're comparing a low-level, inflexible, highly specialized algorithm to a completely different high-level and flexible (and a bit naive!) algorithm. That's not a fair comparison. It may be that Haskell is very good at writing this flexible and high-level code, but that doesn't mean you get to compare it to inflexible and low-level code and then claim Haskell is slow!
C# was faster originally because you had to do far more work to implement the algorithm, and the end result is hardly reusable at all, and in larger applications the cost of debugging and maintenance also goes way up. But guess what? As shown in this thread, you can pay this cost in Haskell too /where needed/ and get similar speeds as C#! -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862
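For reference, the kind of high-level lazy version being talked about here is essentially this - a sketch of the style (the classic "naive sieve", really trial division), not necessarily the exact code benchmarked earlier in the thread:

-- An infinite, reusable stream of primes; laziness means consumers decide
-- how much of it to compute.
primes :: [Int]
primes = sieve [2 ..]
  where
    sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p /= 0]

main :: IO ()
main = print (takeWhile (< 100) primes)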

On Sunday 15 July 2007 21:57:40 Sebastian Sylvan wrote:
OpenGL is mostly written in C, so most of the code will likely run the exact same bits. It's just an interface to a C library.
Benchmarking OpenGL is certainly of little to no interest. However, showing how easily OpenGL can be interfaced to is of huge interest to me. Functional languages tend to have very poor FFIs, so I for one would like to see whether or not non-trivial graphical programs can be written in Haskell. I was amazed to find that Frag corrupts memory in 64-bit, for example. Why are there malloc calls in a Haskell program?! -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. The OCaml Journal http://www.ffconsultancy.com/products/ocaml_journal/?e

On Sun, 2007-07-15 at 23:05 +0100, Jon Harrop wrote:
On Sunday 15 July 2007 21:57:40 Sebastian Sylvan wrote:
OpenGL is mostly written in C, so most of the code will likely run the exact same bits. It's just an interface to a C library.
Benchmarking OpenGL is certainly of little to no interest. However, showing how easily OpenGL can be interfaced to is of huge interest to me. Functional languages tend to have very poor FFIs, so I for one would like to see whether or not non-trivial graphical programs can be written in Haskell.
I was amazed to find that Frag corrupts memory in 64-bit, for example. Why are there malloc calls in a Haskell program?!
Because Haskell has a very nice FFI and the Haskell philosophy is to try to do as much on the Haskell-side as possible.
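For what it's worth, the basic FFI boilerplate really is small - a sketch binding C's sin from math.h, nothing from Frag itself:

{-# OPTIONS -fffi #-}

import Foreign.C.Types (CDouble)

-- One declaration imports the C function at the given type.
foreign import ccall unsafe "math.h sin"
    c_sin :: CDouble -> CDouble

main :: IO ()
main = print (c_sin 1.0)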

If you are so sure that C# will be better than Haskell, why not prove it at the ICFP contest (http://www.icfpcontest.org/)?
That should be as fair as possible given your requests for comparison.
It is running next weekend (20th-23rd of July), so you can prove how superior your C# is.
If you win, probably everyone here will recognize it.
So go and register.
----- Original Message -----
From: Hugh Perkins
To: Sebastian Sylvan
Cc: haskell-cafe@haskell.org
Sent: Sunday, July 15, 2007 10:39 PM
Subject: Re: Re[4]: [Haskell-cafe] In-place modification
On 7/15/07, Sebastian Sylvan

Hugh Perkins wrote:
On 7/15/07, Sebastian Sylvan
wrote: [Lots of stuff]
<blah blah blah> Can we stop this rubbish before I unsubscribe? Seriously, get a clue before propagating nonsense across the mail clients of a large audience. "Just ignore the rubbish and the rubbish will go away" usually works for me, but it isn't in this case. How about we all try it at once? <annoyed> Tony Morris http://tmorris.net/

On Sunday 15 July 2007 21:26:49 Sebastian Sylvan wrote:
Can you find me a list of the contests?
No. I've been staring at that site for about 15 minutes and could only find a single challenge that was a trivial function from graph theory.

You might like this symbolic simplifier benchmark:

http://www.lambdassociates.org/studies/study10.htm

The OCaml program is simply:

let rec ( +: ) f g = match f, g with
  | `Int n, `Int m -> `Int (n +/ m)
  | `Int (Int 0), e | e, `Int (Int 0) -> e
  | f, `Add(g, h) -> f +: g +: h
  | f, g -> `Add(f, g)

let rec ( *: ) f g = match f, g with
  | `Int n, `Int m -> `Int (n */ m)
  | `Int (Int 0), e | e, `Int (Int 0) -> `Int (Int 0)
  | `Int (Int 1), e | e, `Int (Int 1) -> e
  | f, `Mul(g, h) -> f *: g *: h
  | f, g -> `Mul(f, g)

let rec simplify = function
  | `Int _ | `Var _ as f -> f
  | `Add (f, g) -> simplify f +: simplify g
  | `Mul (f, g) -> simplify f *: simplify g

-- Dr Jon D Harrop, Flying Frog Consultancy Ltd. The OCaml Journal http://www.ffconsultancy.com/products/ocaml_journal/?e

Hello Sebastian, Sunday, July 15, 2007, 9:05:14 PM, you wrote:
As we've demonstrated there's nothing stopping you from writing imperative "C-like" algorithms in Haskell (just like C#), and there certainly wasn't any major performance difference
as Donald mentioned, this test is just limited by cache speed, not by the speed of the generated code. -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On 16/07/07, Bulat Ziganshin
Hello Sebastian,
Sunday, July 15, 2007, 9:05:14 PM, you wrote:
As we've demonstrated there's nothing stopping you from writing imperative "C-like" algorithms in Haskell (just like C#), and there certainly wasn't any major performance difference
as Donald mentioned, this test is just limited by cache speed, not by the speed of the generated code.
But wouldn't you say that in general, if you spend the effort you can write low-level imperative algorithms in Haskell that perform reasonably well? Especially compared to e.g. C#? I think your own libraries demonstrate this! I'm not saying it's as convenient (see the recent thread about "monad splices") to write low-level imperative code in Haskell, but using laziness in C# was hardly a walk on the beach either! So my point is that Haskell isn't geared towards low-level optimizations and performance, but in the few places where you do need it, you *can* get it (IMO for only moderately more inconvenience than you pay for *everything* in a low-level imperative language). Whereas C# is a bit the other way around (easy to modify state, inconvenient to write high-level/lazy/concurrent/etc. code), though something like C is even more the other way around. -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

Hello Sebastian, Monday, July 16, 2007, 2:53:36 PM, you wrote:
But wouldn't you say that in general, if you spend the effort you can write low-level imperative algorithms in Haskell that perform reasonably well? Especially compared to e.g. C#? I think your own libraries demonstrate this!
I've said that:
1) low-level programming in Haskell is possible, although not as convenient as in C
2) low-level code will be much faster than high-level code, but not as fast as C code
Once I summed up my experience this way - you may either write:
1) high-level code, which is written 10x faster than in C but runs 100x slower, or
2) low-level code that is written 3x slower than in C and runs 3x slower too.
If you think that Haskell code may be as fast as C code, try to rewrite sha1 in any Haskell style and compare it to the highly optimized C versions.
That's all about small self-contained "number-crunching" algorithms. For larger ones, and especially for whole applications, I got very different results - code is, say, 3x slower while written 3x faster. It's probably because OS calls, C libraries and highly-optimized libraries written by other people are taken into account; also GHC imho has better global-level optimization than C++ compilers - for example, it has a better inlining policy.
I'm not saying it's as convenient (see the recent thread about "monad splices") to write low-level imperative code in Haskell, but using laziness in C# was hardly a walk on the beach either! So my point is that Haskell isn't geared towards low-level optimizations and performance, but in the few places where you do need it, you *can* get it (IMO for only moderately more inconvenience than you pay for *everything* in a low-level imperative language).
Did you really write such code, or do you just believe the Haskell advertising campaign? :D -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On 2007-07-16, Bulat Ziganshin
once i summed up my experience - you may either write 1) high-level code, which is written 10x faster than in C but works 100x slower 2) low-level code that is written 3x slower than in C and works 3x slower too
Well, replace "you may" with "Bulat may", and I'll agree. I'd call the low-level code about equal in writing time. People's mileage will vary enormously. -- Aaron Denney -><-

On Mon, 2007-07-16 at 11:53 +0100, Sebastian Sylvan wrote:
On 16/07/07, Bulat Ziganshin
wrote: Hello Sebastian,
Sunday, July 15, 2007, 9:05:14 PM, you wrote:
As we've demonstrated there's nothing stopping you from writing imperative "C-like" algorithms in Haskell (just like C#), and there certainly wasn't any major performance difference
as Donald mentioned, this test is just limited by cache speed, not by speed of code generated.
But wouldn't you say that in general, if you spend the effort you can write low-level imperative algorithms in Haskell that perform reasonably well? Especially compared to e.g. C#? I think your own libraries demonstrate this!
I'm not saying it's as convenient (see the recent thread about "monad splices") to write low-level imperative code in Haskell, but using laziness in C# was hardly a walk on the beach either! So my point is that Haskell isn't geared towards low-level optimizations and performance, but in the few places where you do need it, you *can* get it (IMO for only moderately more inconvenience than you pay for *everything* in a low-level imperative language). Whereas C# is a bit the other way around (easy to modify state, inconvenient to write high-level/lazy/concurrent/etc. code), though something like C is even more the other way around.
Ah, the secret of Haskell is to make low-level-looking code run slower than high level code so that people write high-level code.

Ah, the secret of Haskell is to make low-level-looking code run slower than high level code so that people write high-level code.
The secret of programming is to know which tools to use for which job. If you're writing device drivers in Visual Basic, you've made a strategic misstep and need to re-evaluate. Martin

On Mon, 2007-07-16 at 17:41 +0100, Martin Coxall wrote:
Ah, the secret of Haskell is to make low-level-looking code run slower than high level code so that people write high-level code.
The secret of programming is to know which tools to use for which job. If you're writing device drivers in Visual Basic, you've made a strategic misstep and need to re-evaluate.
That sounds like a challenge.

On 16/07/07, Derek Elkins
On Mon, 2007-07-16 at 17:41 +0100, Martin Coxall wrote:
Ah, the secret of Haskell is to make low-level-looking code run slower than high level code so that people write high-level code.
The secret of programming is to know which tools to use for which job. If you're writing device drivers in Visual Basic, you've made a strategic misstep and need to re-evaluate.
That sounds like a challenge.
Well they've been written in both Haskell[1], and C#[2], so VB might not be out of the realm of possibility (in fact, I think any language that compiles to CIL is fine for [2])! [1] http://programatica.cs.pdx.edu/House/ [2] http://programatica.cs.pdx.edu/House/ -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

On 16/07/07, Sebastian Sylvan
On 16/07/07, Derek Elkins
wrote: On Mon, 2007-07-16 at 17:41 +0100, Martin Coxall wrote:
Ah, the secret of Haskell is to make low-level-looking code run slower than high level code so that people write high-level code.
The secret of programming is to know which tools to use for which job. If you're writing device drivers in Visual Basic, you've made a strategic misstep and need to re-evaluate.
That sounds like a challenge.
Well they've been written in both Haskell[1], and C#[2], so VB might not be out of the realm of possibility (in fact, I think any language that compiles to CIL is fine for [2])!
[1] http://programatica.cs.pdx.edu/House/ [2] http://programatica.cs.pdx.edu/House/
[2] http://research.microsoft.com/os/singularity/ -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

Well they've been written in both Haskell[1], and C#[2], so VB might not be out of the realm of possibility (in fact, I think any language that compiles to CIL is fine for [2])!
Ah, but that's really VB.Net rather than proper Old School VB. VB.Net is just C# in a flowery frock. My point stands though: although you can write any program in any Turing-complete language, that doesn't mean you *should*. Martin

Hugh Perkins
My ideal testing environment is something like http://topcoder.com , but topcoder doesn't support Haskell at this time (probably because it would lose all the time, so no one would use it)
I'd quite like to see a version of topcoder for Haskell, but shootout is not it.
??? Topcoder certainly isn't about benchmarking. Undoubtedly, it would be absolutely awesome to be able to use Haskell in topcoder... but it wouldn't say anything about speed. My guess is that practically no topcoder submissions fail by exceeding the allowable time limit. The competition (the alg one, which is the only one anyone really cares about) is about solving problems quickly (in programmer time) and accurately. But wow, that would be great! I suppose there's not much chance of lobbying them for the change. :) -- Chris Smith

Hello Chris, Monday, July 16, 2007, 8:46:37 AM, you wrote:
Topcoder certainly isn't about benchmarking. Undoubtedly, it would be absolutely awesome to be able to use Haskell in topcoder... but it wouldn't say anything about speed. My guess is that practically no topcoder submissions fail by exceeding the allowable time limit. The competition (the alg one, which is the only one anyone really cares about) is about solving problems quickly (in programmer time) and accurately.
that's ideal for Haskell. Like ICFP - if they allow Haskell code, then all winner solutions will be written using it -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On 7/16/07, Bulat Ziganshin
Topcoder certainly isn't about benchmarking. Undoubtedly, it would be absolutely awesome to be able to use Haskell in topcoder... but it wouldn't say anything about speed. My guess is that practically no topcoder submissions fail by exceeding the allowable time limit. The competition (the alg one, which is the only one anyone really cares about) is about solving problems quickly (in programmer time) and accurately.
that's ideal for Haskell. Like ICFP - if they allow Haskell code, then all winner solutions will be written using it
Careful. Although the writer's solution (the "reference" solution) must run in under 1 second in Java, many problems run really close to the 2-second limit in practice. That means if your language is inherently 30% slower, you may fail the harder problems.

Hugh Perkins
Careful. Although writer's solution (the "reference" solution) must run in under 1 second in Java, many problems run really close to the 2-second limit in practice. That means if your language is inherently 30% slower, you may fail the harder problems.
As I said, in my experience with TopCoder (mainly a couple of years ago), I never ran into a problem with the execution time limit. I've run into plenty of problems I wasn't able to complete due to the contest's coding time limit, though. I suspect most others have similar experiences. If you've competed and your programs timed out a lot, you're probably better off with better algorithms than hunting for a faster language. -- Chris Smith

On 7/15/07, Donald Bruce Stewart
or maybe 'pidigits', a lazy pi generator,
http://shootout.alioth.debian.org/gp4/benchmark.php?test=pidigits&lang=ghc&id=0
This is I/O bound, which isn't interesting, unless you really want to benchmark I/O to the console? We can improve it by, instead of printing the digits, returning the number of occurrences of a specific digit - let's say 3.
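Sketching that variant: generate the digits lazily and fold over them, so console I/O drops out of the timing entirely. The digit stream below is Gibbons' unbounded spigot, which is not necessarily the algorithm the shootout entry uses:

-- An infinite stream of pi's decimal digits (Gibbons' unbounded spigot).
piDigits :: [Integer]
piDigits = g (1, 0, 1, 1, 3, 3)
  where
    g (q, r, t, k, n, l)
      | 4*q + r - t < n*t =
          n : g (10*q, 10*(r - n*t), t, k, (10*(3*q + r)) `div` t - 10*n, l)
      | otherwise =
          g (q*k, (2*q + r)*l, t*l, k+1, (q*(7*k + 2) + r*l) `div` (t*l), l+2)

-- Count occurrences of the digit 3 among the first 1000 digits.
main :: IO ()
main = print (length (filter (== 3) (take 1000 piDigits)))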

hughperkins:
Hey, I just realized I can shave off another 30% in C# ;-) So now the timings become:
Ok. So do the same thing to the Haskell program. The compilers should produce pretty much identical assembly.

{-# OPTIONS -O2 -optc-O -fbang-patterns #-}

import Control.Monad.ST
import Data.Array.ST
import Data.Array.Base
import System
import Control.Monad
import Data.Bits

main = print (pureSieve 10000000)

pureSieve :: Int -> Int
pureSieve n = runST( sieve n )

sieve n = do
    a <- newArray (0,n-1) True :: ST s (STUArray s Int Bool)
    let cutoff = truncate (sqrt (fromIntegral n)) + 1
    go a n cutoff 2 0

go !a !m cutoff !n !c
    | n == m    = return c
    | otherwise = do
        e <- unsafeRead a n
        if e
          then if n < cutoff
                 then let loop !j
                            | j < m     = do
                                x <- unsafeRead a j
                                when x $ unsafeWrite a j False
                                loop (j+n)
                            | otherwise = go a m cutoff (n+1) (c+1)
                      in loop (if n < 46340 then n * n else n `shiftL` 1)
                 else go a m cutoff (n+1) (c+1)
          else go a m cutoff (n+1) c

$ ghc -o primes primes.hs
$ time ./primes
664579
./primes  0.38s user 0.00s system 95% cpu 0.392 total

And indeed, it runs nearly 50% faster. All this benchmark does is thrash the cache, so every write that would dirty the cache is worth avoiding - hence you should always check whether you need to set a bit before writing. Given the same algorithm, any native code compiler should produce roughly the same result, since it's really a hardware benchmark. -- Don

On 15/07/07, Donald Bruce Stewart
hughperkins:
Hey, I just realized I can shave off another 30% in C# ;-) So now the timings become:
Ok. So do the same thing to the Haskell program. The compilers should produce pretty much identical assembly.
{-# OPTIONS -O2 -optc-O -fbang-patterns #-}

import Control.Monad.ST
import Data.Array.ST
import Data.Array.Base
import System
import Control.Monad
import Data.Bits

main = print (pureSieve 10000000)

pureSieve :: Int -> Int
pureSieve n = runST( sieve n )

sieve n = do
    a <- newArray (0,n-1) True :: ST s (STUArray s Int Bool)
    let cutoff = truncate (sqrt (fromIntegral n)) + 1
    go a n cutoff 2 0

go !a !m cutoff !n !c
    | n == m    = return c
    | otherwise = do
        e <- unsafeRead a n
        if e
          then if n < cutoff
                 then let loop !j
                            | j < m = do
                                x <- unsafeRead a j
                                when x $ unsafeWrite a j False
Surely you can remove the read here, and just always do the write? -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

On 7/15/07, Sebastian Sylvan
Surely you can remove the read here, and just always do the write?
Ah you'd think so, but if it's anything like the C# version, strangely that would be slower. In his last message Don explains that this is because the write dirties the cache, which is a Bad Move (tm).

On 15/07/07, Hugh Perkins
On 7/15/07, Sebastian Sylvan
wrote: Surely you can remove the read here, and just always do the write?
Ah you'd think so, but if it's anything like the C# version, strangely that would be slower. In his last message Don explains that this is because the write dirties the cache, which is a Bad Move (tm).
Ah, it was faster in GHCi, and I couldn't test it compiled because my GHC install is a bit messed up at the moment... -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

dons:
hughperkins:
Hey, I just realized I can shave off another 30% in C# ;-) So now the timings become:
Ok. So do the same thing to the Haskell program. The compilers should produce pretty much identical assembly.
Oh, and I forgot you count up by two now. Here's the Haskell transliteration (again).

{-# OPTIONS -O2 -optc-O -fbang-patterns #-}

import Control.Monad.ST
import Data.Array.ST
import Data.Array.Base
import System
import Control.Monad
import Data.Bits

main = print (pureSieve 10000000)

pureSieve :: Int -> Int
pureSieve n = runST( sieve n )

sieve n = do
    a <- newArray (3,n) True :: ST s (STUArray s Int Bool)
    let cutoff = truncate (sqrt (fromIntegral n)) + 1
    go a n cutoff 3 1

go !a !m cutoff !n !c
    | n >= m    = return c
    | otherwise = do
        e <- unsafeRead a n
        if e
          then if n < cutoff
                 then let loop !j
                            | j < m     = do
                                x <- unsafeRead a j
                                when x $ unsafeWrite a j False
                                loop (j+n)
                            | otherwise = go a m cutoff (n+2) (c+1)
                      in loop (if n < 46340 then n * n else n `shiftL` 1)
                 else go a m cutoff (n+2) (c+1)
          else go a m cutoff (n+2) c

Marginally faster:

$ time ./primes
664579
./primes  0.34s user 0.00s system 89% cpu 0.385 total

Very cache-dependent though, so widely varying runtimes could be expected. -- Don

dons:
dons:
hughperkins:
Hey, I just realized I can shave off another 30% in C# ;-) So now the timings become:
Ok. So do the same thing to the Haskell program. The compilers should produce pretty much identical assembly.
Oh, and I forgot you count up by two now. Here's the Haskell transliteration (again).
Oh, also, I was using the wrong brackets in the last program! Stick with me, because this makes the program go at least 100x faster.

First, we'll move pureSieve into a library module:

{-# OPTIONS -O2 -optc-O -fbang-patterns #-}

module Primes (pureSieve) where

import Control.Monad.ST
import Data.Array.ST
import Data.Array.Base
import System
import Control.Monad
import Data.Bits

pureSieve :: Int -> Int
pureSieve n = runST ( sieve n )

sieve n = do
    a <- newArray (3,n) True :: ST s (STUArray s Int Bool)
    let cutoff = truncate (sqrt (fromIntegral n)) + 1
    go a n cutoff 3 1

go !a !m cutoff !n !c
    | n >= m    = return c
    | otherwise = do
        e <- unsafeRead a n
        if e
          then if n < cutoff
                 then let loop !j
                            | j < m     = do
                                x <- unsafeRead a j
                                when x $ unsafeWrite a j False
                                loop (j+n)
                            | otherwise = go a m cutoff (n+2) (c+1)
                      in loop (if n < 46340 then n * n else n `shiftL` 1)
                 else go a m cutoff (n+2) (c+1)
          else go a m cutoff (n+2) c

And now just a module to call it:

{-# OPTIONS -fth #-}

import Primes

main = print $( let x = pureSieve 10000000 in [| x |] )

Pretty simple to compile and run this now:

$ ghc --make -o primes Main.hs
$ time ./primes
664579
./primes  0.00s user 0.01s system 228% cpu 0.003 total

Oh! Much faster. Looks like Haskell is 100x faster than C#. Who gets fired? :) -- Don

On 15/07/07, Donald Bruce Stewart
dons:
dons:
hughperkins:
Hey, I just realized I can shave off another 30% in C# ;-) So now the timings become:
Ok. So do the same thing to the Haskell program. The compilers should produce pretty much identical assembly.
Oh, and I forgot you count up by two now. Here's the Haskell transliteration (again).
Oh, also, I was using the wrong brackets in the last program! Stick with me, because this makes the program go at least 100x faster.
First, we'll move the pureSieve into a library module:
{-# OPTIONS -O2 -optc-O -fbang-patterns #-}
module Primes (pureSieve) where
import Control.Monad.ST import Data.Array.ST import Data.Array.Base import System import Control.Monad import Data.Bits
pureSieve :: Int -> Int pureSieve n = runST ( sieve n )
sieve n = do a <- newArray (3,n) True :: ST s (STUArray s Int Bool) let cutoff = truncate (sqrt (fromIntegral n)) + 1 go a n cutoff 3 1
go !a !m cutoff !n !c | n >= m = return c | otherwise = do e <- unsafeRead a n if e then if n < cutoff then let loop !j | j < m = do x <- unsafeRead a j when x $ unsafeWrite a j False loop (j+n)
| otherwise = go a m cutoff (n+2) (c+1)
in loop ( if n < 46340 then n * n else n `shiftL` 1) else go a m cutoff (n+2) (c+1)
else go a m cutoff (n+2) c
And now just a module to call it:
{-# OPTIONS -fth #-}
import Primes
main = print $( let x = pureSieve 10000000 in [| x |] )
Pretty simple to compile and run this now:
$ ghc --make -o primes Main.hs $ time ./primes 664579 ./primes 0.00s user 0.01s system 228% cpu 0.003 total
Oh! Much faster. Looks like Haskell is 100x faster than C#. Who gets fired? :)
Oooh, I love it! -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

On 15/07/07, Donald Bruce Stewart
Oh! Much faster. Looks like Haskell is 100x faster than C#. Who gets fired? :)
Well, you've switched back to using unsafe operations there, Donald ;-)
Anyway, before you guys get too narked at me ;-) I'd just like to say that I'm very impressed by the work that is going on in Haskell. I wouldn't be here otherwise. Haskell stands a very good chance of solving one of the great unsolved issues at this time, which is: managing threading. Just don't want you guys to get too complacent ;-)
Running benchmarks against Java and so on is much more motivational than running against g++, because you can no longer say "oh well Java is faster because it doesn't have a GC!", because clearly it does; and bounds-checking, and many other useful goodies besides. So keep up the good work. I'm quite happy for my tax dollars to be being spent on Haskell research :-) and I'm really looking forward to seeing what comes out of it.

On Jul 15, 2007, at 8:45 , Donald Bruce Stewart wrote:
main = print $( let x = pureSieve 10000000 in [| x |] )
I'm reminded of the C++ expert in CMU SCS who used to amuse himself by making template expansion do all the real work at compile time. (Yes, including a prime sieve.) -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

On 7/15/07, Donald Bruce Stewart
Oh, and I forgot you count up by two now. Here's the Haskell transliteration (again).
{-# OPTIONS -O2 -optc-O -fbang-patterns #-}
import Control.Monad.ST import Data.Array.ST import Data.Array.Base import System import Control.Monad import Data.Bits
main = print (pureSieve 10000000)
pureSieve :: Int -> Int pureSieve n = runST( sieve n )
sieve n = do a <- newArray (3,n) True :: ST s (STUArray s Int Bool) let cutoff = truncate (sqrt (fromIntegral n)) + 1 go a n cutoff 3 1
go !a !m cutoff !n !c | n >= m = return c | otherwise = do e <- unsafeRead a n if e then if n < cutoff then let loop !j | j < m = do x <- unsafeRead a j when x $ unsafeWrite a j False loop (j+n)
| otherwise = go a m cutoff (n+2) (c+1)
in loop ( if n < 46340 then n * n else n `shiftL` 1) else go a m cutoff (n+2) (c+1)
else go a m cutoff (n+2) c
Marginally faster:
$ time ./primes 664579 ./primes 0.34s user 0.00s system 89% cpu 0.385 total
Very cache-dependent though, so widely varying runtimes could be expected.
-- Don
Hi Donald, quick question. So, one of the things that is very interesting about Haskell is its potential for automatic threading, ie you write a trivial algorithm that looks like it runs in a single thread, and the runtime splits it across multiple cores automatically.
It's fairly safe to say that maps, foldrs, foldls, and their derivatives are safe to parallelize? (For example, hand-waving argument, a foldr of (/) on [1,5,7,435,46,2] can be split into a foldr on [1,5,7] and a foldr on [435,46,2], then their results combined.)
To what extent is the technology you are using in your algorithm parallelizable? (I actually can't tell, it's a genuine question.) In the case that it is parallelizable, to what extent is it trivial for a runtime to know this? (Again, I don't have enough information to tell.)
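For what it's worth, the hand-split version of that idea looks roughly like this with the par/pseq combinators - a sketch, assuming GHC's parallel package, and note that recombining two half-folds is only valid for an *associative* operator, which (/) is not:

import Control.Parallel (par, pseq)

-- Fold the two halves of the list in parallel and combine the results.
-- Only correct when f is associative and z is its identity.
parFold :: (a -> a -> a) -> a -> [a] -> a
parFold _ z []  = z
parFold _ _ [x] = x
parFold f z xs  = left `par` (right `pseq` f left right)
  where
    (ls, rs) = splitAt (length xs `div` 2) xs
    left     = parFold f z ls
    right    = parFold f z rs

-- Compile with `ghc -threaded` and run with `+RTS -N2` to use two cores.
main :: IO ()
main = print (parFold (+) 0 [1 .. 1000000])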

On 7/15/07, Donald Bruce Stewart
What C# version are you using, by the way? (So I can check if it does any tricks).
- csc is in the Microsoft.Net Framework 2.0 runtime, which you can download from microsoft.com (free download). - gmcs/mono are from Mono 1.2.2.1 , which you can download from http://www.mono-project.com/Main_Page

On Sunday 15 July 2007 11:39:32 Hugh Perkins wrote:
On 7/15/07, Donald Bruce Stewart
wrote: [snip] unsafeWrite[snip] [snip]unsafeRead[snip]
Hi Donald, the idea is to use this for operational code, so avoiding unsafe operations is preferable ;-)
Then you should use bounds checked access in C++. -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. The OCaml Journal http://www.ffconsultancy.com/products/ocaml_journal/?e

Hey, guys, I just realized this test is not really fair! I've been using the Microsoft .Net compiler, which is a proprietary closed-source compiler. To be fair to Haskell, we should probably compare it to other open source products, such as g++ and mono? Here are the timings ;-)

Haskell
=======
J:\dev\haskell>ghc -O2 -o primechaddai.exe PrimeChaddai.hs
J:\dev\haskell>primechaddai
number of primes: 664579
Elapsed time: 26.234

g++
===
J:\dev\test\testperf>g++ -O2 -o prime.exe prime.cpp
J:\dev\test\testperf>prime
number of primes: 664579
elapsed time: 0.984

mono
====
J:\dev\test\testperf>erase primecs.exe
J:\dev\test\testperf>gmcs primecs.cs
J:\dev\test\testperf>mono primecs.exe
number of primes: 664579
elapsed time: 0,719

Microsoft C#
============
J:\dev\test\testperf>csc /nologo primecs.cs
J:\dev\test\testperf>primecs
number of primes: 664579
elapsed time: 0,6875

Not only does mono come close to the Microsoft .Net time, both mono and Microsoft .Net are faster than g++ ;-) and whack Haskell.

Here's the C++ code for completeness:

#include <iostream>
#include <ctime>

using namespace std;

int CalculateNumberOfPrimes( int maxprime )
{
    bool *IsPrime = new bool[ maxprime ];
    for( int i = 0; i < maxprime; i++ ) {
        IsPrime[i] = true;
    }
    int NumberOfPrimes = 0;
    for( int i = 2; i < maxprime; i++ ) {
        if( IsPrime[i] ) {
            NumberOfPrimes++;
            for( int j = ( i << 1 ); j < maxprime; j += i ) {
                IsPrime[j] = false;
            }
        }
    }
    delete[] IsPrime;   // release the sieve
    return NumberOfPrimes;
}

int main( int argc, char *argv[] )
{
    clock_t start = clock();
    int NumberOfPrimes = CalculateNumberOfPrimes( 10000000 );
    cout << "number of primes: " << NumberOfPrimes << endl;
    clock_t finish = clock();
    double time = ( double(finish) - double(start) ) / CLOCKS_PER_SEC;
    cout << "elapsed time: " << time << endl;
    return 0;
}

hughperkins:
Hey, guys, I just realized this test is not really fair! I've been using the Microsoft .Net compiler, which is a proprietary closed-source compiler. To be fair to Haskell, we should probably compare it to other open source products, such as g++ and mono? Here are the timings ;-)

Haskell
=======
J:\dev\haskell>ghc -O2 -o primechaddai.exe PrimeChaddai.hs
J:\dev\haskell>primechaddai
number of primes: 664579
Elapsed time: 26.234
Oh, I think we can do a bit better than that.... See below.
g++
===
J:\dev\test\testperf>g++ -O2 -o prime.exe prime.cpp
J:\dev\test\testperf>prime
number of primes: 664579
elapsed time: 0.984

mono
====
J:\dev\test\testperf>erase primecs.exe
J:\dev\test\testperf>gmcs primecs.cs
J:\dev\test\testperf>mono primecs.exe
number of primes: 664579
elapsed time: 0,719

Microsoft C#
============
J:\dev\test\testperf>csc /nologo primecs.cs
J:\dev\test\testperf>primecs
number of primes: 664579
elapsed time: 0,6875

Not only does mono come close to the Microsoft .Net time, both mono and Microsoft .Net are faster than g++ ;-) and whack Haskell.
I reimplemented your C++ program, could you time this please?

{-# OPTIONS -O2 -fbang-patterns #-}

import Data.Array.IO
import Data.Array.Base
import System
import Control.Monad
import Data.Bits
import Text.Printf

main = sieve 10000000

sieve n = do
    a <- newArray (2,n) True :: IO (IOUArray Int Bool)   -- an array of Bool
    r <- go a n 2 0
    printf "Primes up to %8d %8d\n" (n::Int) (r::Int) :: IO ()

go !a !m !n !c
    | n == m    = return c
    | otherwise = do
        e <- unsafeRead a n
        if e
          then let loop !j
                      | j < m     = do
                          x <- unsafeRead a j
                          when x (unsafeWrite a j False)
                          loop (j+n)
                      | otherwise = go a m (n+1) (c+1)
               in loop (n `shiftL` 1)
          else go a m (n+1) c

On my machine:

$ ghc -o primes primes.hs
$ time ./primes
Primes up to 10000000   664579
./primes  0.52s user 0.01s system 99% cpu 0.533 total

0.5s. So rather fast. Enjoy! Don

On Sunday 15 July 2007 11:19:29 Hugh Perkins wrote:
Not only does mono come close to the Microsoft .Net time, both mono and Microsoft .Net are faster than g++ ;-) and whack Haskell.
This should tell you that your C++ is not very good. This is several times faster, for example:

#include <iostream>
#include <ctime>
#include <vector>

using namespace std;

int CalculateNumberOfPrimes(int maxprime)
{
    vector<bool> NotPrime(maxprime+1);
    int NumberOfPrimes = 0;
    for (int i = 2; i <= maxprime; i++)
        if (!NotPrime[i]) {
            ++NumberOfPrimes;
            for (int j = 2*i; j <= maxprime; j += i)
                if (!NotPrime[j])
                    NotPrime[j] = true;
        }
    return NumberOfPrimes;
}

For some reason you were using C-style allocation rather than the C++ STL to implement a bit vector. The STL implementation is optimized. -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. The OCaml Journal http://www.ffconsultancy.com/products/ocaml_journal/?e

Hello Jon, Sunday, July 15, 2007, 9:46:42 PM, you wrote:
This should tell you that your C++ is not very good. This is several times faster, for example:
For some reason you were using C-style allocation rather than the C++ STL to implement a bit vector. The STL implementation is optimized.
I bet that this version allocates exactly one bit per element, like the Haskell version. So *this* comparison is fair, while the old one was unfair -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On 15/07/07, Hugh Perkins
Sebastian,
Why would I write a slow, complicated algorithm in C#?
I'm not making these comparisons for some academic paper, I'm trying to get a feel for how the languages run in practice.
And really in practice, I'm never going to write a prime algorithm using merge and so on, I'd just use the original naive Haskell algorithm, that runs 500 times slower (at least) than my naive C# algo. I'm just allowing you guys to optimize to see how close you can get.
Note that the C# algo is not something created by C# experts, it's just something I hacked together in like 2 minutes.
I take it you really are using the Sieve as a final program then? And not just as a benchmark? Because if you *are* trying to compare the two languages fairly, then your reply just doesn't make sense. Just because you aren't using the laziness of your data structure in the Haskell version for this benchmark doesn't mean it's not there, and couldn't be exploited in a *real* program. Therefore if you want to compare it to C# you can't just ignore the main properties of the Haskell version in order to make it faster in C#! Heck, take that to its extreme (you don't seem to care much about using the same algorithm in both languages anyway): you could just hard-code the answer for all 32-bit numbers in a large table for the C# version and claim it's millions of times faster!
The C# algorithm is the same as the Haskell algorithm! You can't just pick and choose two different algorithms and say that one of the languages is better based on that! It's like saying C is faster than Erlang on your computer, even though Erlang scales to hundreds of threads ("Why would I write something using threads in C?"). Stick with one algorithm, and implement it in the same way in both languages, and then compare. If one of the languages manages to handle this more gracefully and elegantly, that's beside the point and doesn't give you the license to go off and do something completely different in the other language and still think you have anything useful on your hands.
The fact is that languages are different, with different strengths and weaknesses. Whatever comparison you come up with, chances are that one of the languages will deal with it more gracefully. If you implement something using lazy streams, Haskell will be more elegant; if you want to use low-level pointer arithmetic, then C will do better, etc. So either you have to implement a Haskell version of your C# code, or implement a C# version of your Haskell code (which I did). Just because the benchmark is simple doesn't mean you can tailor-fit the algorithm to the specific properties of the benchmark in one of the instances, while keeping it general and nice in the other. -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862

On 7/15/07, Sebastian Sylvan

On 15/07/07, Hugh Perkins
On 7/15/07, Sebastian Sylvan
wrote: [me thinks he doth protest too much] ;-) The rules of the competition are quite fair: both sides make an optimal algorithm using their preferred language. It's ok to hardcode the first 3 or 4 primes if you must, hardcoding the entire resultset is out ;-)
Why? How many primes are there among the first 2^32-1 numbers? Shouldn't be unreasonable to just stick it in a table/map if all you're interested in is getting the fastest result regardless of algorithm... Seems quite arbitrary that you allow various tricks to make C# look better (it might still be faster, but I haven't seen you do a fair comparison yet!), but won't take it to its logical conclusion! If you *are* interested in a fair comparison I recommend you don't cheat in either language, and try to write a more general algorithm for both, with the same properties (e.g. lazy streams of primes in both, or a fixed up-front allocated array of primes in both). -- Sebastian Sylvan +44(0)7857-300802 UIN: 44640862
participants (26): Aaron Denney, ajb@spamcop.net, Alex Queiroz, Andreas Marth, Andrew Coppin, brad clawsie, Brandon S. Allbery KF8NH, Bulat Ziganshin, Chaddaï Fouché, Chris Smith, Claus Reinke, Creighton Hogg, Derek Elkins, dons@cse.unsw.edu.au, Henk-Jan van Tuyl, Henning Thielemann, Hugh Perkins, Ian Lynagh, Jon Harrop, Martin Coxall, Neil Mitchell, ok, Sebastian Sylvan, Stefan O'Rear, Steve Schafer, Tony Morris