
Hi all, I am a Haskell newbie, but have been interested in Haskell (and generally speaking ML-derivates) for some time. I am currently evaluating different languages for implementing an application which will have to manipulate large graphs representing the structure of programs and their evolution. In this respect, I need to find a language which offers a proper paradigm for implementation of graph algorithms (possibly involing some AI techniques), while offering great speed. Speed is in fact a crucial criterium for the language choice. I once read that Haskell was rather slow, especially when compared to Standard ML, which seems to be confirmed by "The Great Computer Language Shootout" (http://www.bagley.org/~doug/shootout/craps.shtml). However, this performance comparison seems rather outdated, so I wanted to know if GHC had improved in performance in the meantime. For now, the (somewhat unpleasing) conclusion I came with is that a Java (app core), Python (proto, UI) plus some interfaced C++ libs would be OK... but I would largely prefer to use a fonctional language for my application core. OCaml would have been just great, if only it had a properly designed library and a decent syntax... like Haskell ! What do you Haskell-experts could say about using Haskell for an application requiring graph processing performance ? TIA, -- Sébastien

Hello S\'ebastien,
I am a Haskell newbie, but have been interested in Haskell (and generally speaking ML-derivates) for some time. I am currently evaluating different languages for implementing an application which will have to manipulate large graphs representing the structure of programs and their evolution.
Maybe you should consider using a combination of Haskell and C++. I bet there are good graph libraries in C++ available so that would save a lot of work. Using Haskell for the front end is a lot more fun than C++ or Java, I think. As you say you are a Haskell newbie I must warn you that linking to a C++ library is not trivial and that's not a Haskell problem but more a compiler-options-problem. What I suggest is exactly what I am doing at the moment; I am building a tool to edit Bayesian networks and to apply inference algorithms to them. The GUI is written in Haskell (wxHaskell) and the inference is a C++ library I am reusing. The choice for C++ was not speed but reuse. I have never come across a problem for which Haskell was too slow, but maybe that depends on my choice of problems. The tool I am writing now runs smoothly on an ancient (700Mhz) computer. Good luck, Arjan

Sébastien Pierre
I am currently evaluating different languages for implementing an application which will have to manipulate large graphs representing the structure of programs and their evolution.
Speed is in fact a crucial criterium for the language choice.
In my experience, Haskell is slow on a couple of accounts. First, I have problems getting IO to run at anything resembling C speeds. I've messed around with array IO and whatnot, but there still seems to be an order of magnitude or two. I don't know how to fix this, there have been a few remarks that indicate people can solve it, but there are still unanswered requests. If you figure it out, please let me know. Second, GHC tends to gobble a lot of memory. I think (somebody correct me if I'm wrong) a data structure like data Foo = Bar Bool Bool Bool will typically use four machine words (16 bytes) in addition to the data contents itself. You can improve things if you can make the data structure stricter, or if you can store things in UArrays instead of lists, and so on. I'd love to have a compiler that was, aggressively packing my data structures, but in the meantime, I'll just hide behind an abstract interface, and store things directly in Word datatypes. Related to this, some algorithms are slower when implemented in Haskell due to boxing and more indirection. For complex data structures (like your graph stuff probably needs), you will need much of this anyway, and implementing this in C is going to be error-prone and time consuming, and of course equally slow. Third (fourth?), Haskell is often too lazy by default, a bit of strictness will often help speed things up tremendously. Be prepared to spend a lot of time profiling time and memory to squeeze the last drop of performance from your program.
For now, the (somewhat unpleasing) conclusion I came with is that a Java (app core), Python (proto, UI) plus some interfaced C++ libs would be OK...
I think these are extraordinary choices if, as you say, speed is the most important factor. I would recommend Haskell for speed of development and correctness of implementation, but (probably) C for speed. You can of course combine the two with the FFI, but I don't know how trivial it is to pass Haskell graph structures to C and back.
What do you Haskell-experts
Oh, did I mention that I'm not an expert? I just play one on mailing lists :-) -kzm -- If I haven't seen further, it is by standing in the footprints of giants

I would recommend Haskell for speed of development and correctness of implementation, but (probably) C for speed. You can of course combine the two with the FFI, but I don't know how trivial it is to pass Haskell graph structures to C and back.
If you use a C library for speed, you want to design the junction so that you make relatively few calls across the border and pass only a small amount of data across the border. For example, if manipulating bitmaps, you would try to keep the bitmap object in the C world rather than copying it back and forth between C at every step. A nice property of this kind of interface is that you wouldn't have a Haskell graph structure to pass back and forth - the graph would live entirely in C. The only real complication you'll run into is that you'll probably need to use ForeignObj(ects) so that the graph objects can be released when the Haskell garbage collector finds it can't access them from the Haskell heap. Not really hard to do - just need to be careful. -- Alastair Reid

Hi all, Thanks for all your answers :) I am still unsure of whether Haskell would be a good competitor against other languages in my case, but it seems like if it does the best option would be to reuse C++ graph libraries and carefully write a wrapper around them to minimize passing values between C and Haskell worlds. In fact, I would like to know how Haskell compares in performance to other languages because if I refer to the page I mentioned (http://www.bagley.org/~doug/shootout/craps.shtml) it does not even compete with Python (which is rather... slow). This is kind of scary, and a quick search on the net turned out to confirm this : The test shows the performance ratio CC++ / Haskell (ghc-4.04) between 10 and 15 from http://www.dcs.gla.ac.uk/mail-www/haskell/msg02153.html. It would be nice if I could get some papers or benchmarks on how Haskell performs and scales. Cheers, -- Sébastien. PS: Regarding Java, it may be surprising, but when it comes to computation (non-GUI stuff) it behaves surprisingly fast, and when used properly can really compete with C++.

Hi Sébastien! On Thu, Mar 18, 2004 at 11:30:26AM +0100, Sébastien Pierre wrote:
In fact, I would like to know how Haskell compares in performance to other languages because if I refer to the page I mentioned (http://www.bagley.org/~doug/shootout/craps.shtml) it does not even compete with Python (which is rather... slow).
You should look at the individual examples and see how relevant their results are for you. And keep in mind that strings tend to be slow in Haskell (being lazy lists of characters), a fact that may influence some of the tests. Greetings, Carsten -- Carsten Schultz (2:38, 33:47), FB Mathematik, FU Berlin http://carsten.codimi.de/ PGP/GPG key on the pgp.net key servers, fingerprint on my home page.

On Thu, 18 Mar 2004, Carsten Schultz wrote:
Hi Sébastien!
On Thu, Mar 18, 2004 at 11:30:26AM +0100, Sébastien Pierre wrote:
In fact, I would like to know how Haskell compares in performance to other languages because if I refer to the page I mentioned (http://www.bagley.org/~doug/shootout/craps.shtml) it does not even compete with Python (which is rather... slow).
You should look at the individual examples and see how relevant their results are for you. And keep in mind that strings tend to be slow in Haskell (being lazy lists of characters), a fact that may influence some of the tests.
I agree completely. Here's a little more detailed analysis to make the comparison fair. The language shootout has 25 benchmarks to compare languages with. One thing which is not Haskell's in favour is the fact that Haskell isn't represented in all of these benchmarks. So to get a better comparison we should set the score to zero on those benchmarks where we don't have a Haskell program. I zeroed out the following benchmarks: List Processing, Method Calls, Object Instantiation, String Concatenation, Hashes Part II, and Regular Expression Matching. Suddenly Haskell climbs from 19'th place to 14'th place. Not dramatic but at least the comparison is more fair now. It is also interesting to see what happens if you set the lines of code multiplier to one instead of zero. Haskell now gets 7'th place. Just before Python as it happens.... Well, I think this shows that one should be very careful when reading these kinds of benchmarks. It is very easy to jump to conclusions. But I do believe there are some conclusions to be drawn if we look close enough. The main problem for Haskell on these benchmarks is I/O. In all benchmarks with I/O in them Haskell (or I should really say ghc) gets a really bad score. This is, however, a known problem. Simon PJ mentioned it in a previous mail. All I can say is that for my purposes it hasn't been a problem, so I really don't care about these benchmarks. Another thing with these benchmarks is that some of them are so tightly specified that the idiomatic Haskell solution didn't fit in. See for example the String Concatentation benchmark. The programs have to concatenate the string in such a way that the Haskell program gets a quadratic behaviour. I submitted a solution using concat which behaved OK but it was not allowed to compete. You can find it on the page of the benchmark. Well, enough about benchmarks. Just make sure you read them carefully. S'ebastien, I encourage you to choose the Haskell + C solution. It has worked well for me and many others. Paradox, a model finder for first order logic is written in Haskell plus some C for the performance sensitive parts. It won the SAT/Models class of last years CASC competition. There only speed counts (after correctness ofcourse). See: http://www.cs.chalmers.se/~koen/paradox/ All the best, /Josef

Josef Svenningsson
[Doug Bagley's Language Shootout]
You should look at the individual examples and see how relevant their results are for you.
Well, I think this shows that one should be very careful when reading these kinds of benchmarks.
And don't forget that the performance results are probably quite out-of-date now in any case. I took a quick look at just one benchmark from the Shootout: Nested Loops. The CPU times given are: gcc 0.10 ghc 0.34 Directly taking the given source code and compiling it afresh today, with -O turned on for both compilers, like in the original tests, I get results like these: gcc 0.90 ghc 1.70 Since I'm using a faster machine, I needed to increase the input size from N=16 to N=30 to get measurable timings, but the ratio between ghc and ghc today is now consistently about 2x for this benchmark, a big improvement over 3.4x. I guess the ghc implementers are to be congratulated. Regards, Malcolm

Hi again, Well, it seems like my little question raised an interesting thread, and brought me some valuable information. I am pleased to see that the Haskell community is particularily aware of the fact that being a fast language is far from being the most important criterium in most languages choice. Still, the application I am working on has major performance constraints, which if unmet may result in the project to be stopped. I also have to convince both clients and managers of the adequation of my choice regarding these constraints, and I must say that I would already have trouble to introduce something different than C or C++ (they are rather oldschool corporate people ;). For now, I assume that Haskell is very expressive, but has the speed of most interpreted language, which makes it an average choice for my application. However, I think that Haskell laziness could be of great value : I will have most of the application data stored in a database, so I could gain a lot of time in only loading what is useful for a processing. So I guess that everything lies in how I easily (and efficiently) I can mix C++ and Haskell. Thanks to all who answered :) -- Sébastien

G'day all.
Quoting Sébastien Pierre
I am still unsure of whether Haskell would be a good competitor against other languages in my case, but it seems like if it does the best option would be to reuse C++ graph libraries and carefully write a wrapper around them to minimize passing values between C and Haskell worlds.
As a matter of curiosity, have you considered prototyping your program in Haskell first, and then subsequently translating it into C++? (Boost.Graph is supposed to be quite good, incidentally.) If Haskell provides more rapid development (and I think it does), then you can spend time in Haskell making your algorithms work, finding the "hot spots" and generally optimising it, then translating parts of it into C++ as needed. Cheers, Andrew Bromage

Hi ! ajb@spamcop.net wrote:
I am still unsure of whether Haskell would be a good competitor against
other languages in my case, but it seems like if it does the best option would be to reuse C++ graph libraries and carefully write a wrapper around them to minimize passing values between C and Haskell worlds.
As a matter of curiosity, have you considered prototyping your program in Haskell first, and then subsequently translating it into C++? (Boost.Graph is supposed to be quite good, incidentally.)
I actually considered to prototype the program in Python and then use Java when things get settled... the fact is that I don't know Haskell much (I never wrote a program with it) and try to avoid C++ when I can (too error-prone in my opinion). Most of the program will be made of algorithms that manipulate RDF graphs using the Redland RDF library (C based), and this library happens to have both Python and Java bindings.
If Haskell provides more rapid development (and I think it does), then you can spend time in Haskell making your algorithms work, finding the "hot spots" and generally optimising it, then translating parts of it into C++ as needed.
Johan "Mahogny" made a point when relating his problems with implenting complex algorithms in C and Haskell. I personnaly had a rather frustrating experience of implementing a Ford-Fulkerson graph operation with Java... the imperative style plus the OO paradigm did not put my in the appropriate perspective to properly implement this. I truely think that functional languages are much more suitable for implementing complex graph algorithms than imperative languages, though I can't really explain why... (I may say that I consider functional/declarative programming good for expressing processing/behaviour and OO/imperative good for expressing data). Cheers, -- Sébastien
participants (8)
-
ajb@spamcop.net
-
Alastair Reid
-
Arjan van IJzendoorn
-
Carsten Schultz
-
Josef Svenningsson
-
Ketil Malde
-
Malcolm Wallace
-
Sébastien Pierre