
I'm not a Haskell expert; in fact, I'm a beginner, and that's why I'm here. But I read your blog post on the subject of this hash table program, and it seems to me from reading the comments in reply that you're just here trolling: you're using an algorithm that is fundamentally not purely functional, and of course that's going to be slower, just like asking Joel Zumaya to pitch left-handed. If you were being honest about your complaint, you'd make an apples-to-apples comparison, and a number of commenters on your blog have proposed implementations that perform much better than your example.

Sorry if I'm just being a jerk,

Nathan M. Holden, Haskell Beginner

On Saturday 21 November 2009 12:55:03 pm beginners-request@haskell.org wrote:
Date: Sat, 21 Nov 2009 18:02:33 +0000
From: Jon Harrop <jon@ffconsultancy.com>
Subject: Re: [Haskell-beginners] Re: Is Haskell for me?
To: Ben Lippmeier
Cc: beginners

On Saturday 21 November 2009 11:56:09 Ben Lippmeier wrote:
> Hmm. I'd be careful about conflating algorithmic complexity with memory management issues.
No need. Just look at how badly the performance scales for Haskell vs. other languages. For example, inserting 1-16 million floating point key/value pairs into a hash table:
          Haskell              OCaml                F#
 1M:    3.198s   1.0x      1.129s  1.0x      0.080s  1.0x
 2M:    8.498s   2.7x      2.313s  2.0x      0.138s  1.7x
 4M:   25.697s   8.0x      4.567s  4.0x      0.281s  3.5x
 8M:   97.994s  30.6x     10.450s  9.3x      0.637s  8.0x
16M:  388.080s 121.4x     23.261s 20.6x      1.338s 16.7x
Note that Haskell is 290x slower than F# on that last test.
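The benchmark in question is nothing exotic: just a loop inserting floating point key/value pairs into a mutable hash table. A minimal sketch along those lines, using the Data.HashTable module that shipped with GHC at the time (illustrative only; not necessarily the exact code that produced the timings above):

import Control.Monad (forM_)
import qualified Data.HashTable as H

main :: IO ()
main = do
  -- Mutable hash table keyed on Double; hash by truncating to Int.
  m <- H.new (==) (H.hashInt . floor)
  -- Insert one million floating point key/value pairs.
  forM_ [1 .. 1000000] $ \i ->
    let x = fromIntegral (i :: Int) :: Double
    in  H.insert m x (1.0 / x)
  -- Force a lookup so the work cannot be optimized away.
  v <- H.lookup m 100.0
  print (v :: Maybe Double)

Scale the loop bound to 2M, 4M, 8M and 16M to produce the corresponding rows of the table.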
In practice, you would turn to a purely functional dictionary in Haskell based upon balanced binary trees in order to work around this long-standing bug in the GC, but those trees incur O(log n) indirections and typically run orders of magnitude slower than a decent hash table.
Suffice to say, Haskell is nowhere near being in the ballpark of C++'s performance for basic functionality like dictionaries and sorting.
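The sorting claim can be checked the same way: time the standard purely functional Data.List.sort against, say, std::sort in C++ over the same values. A sketch (illustrative only; the lcg generator below is just a hypothetical way to produce unsorted input, not part of any published benchmark):

import Data.List (sort)

-- A simple linear congruential generator, used here only so that
-- the input list is not already ordered.
lcg :: Int -> Int
lcg x = (1664525 * x + 1013904223) `mod` 2147483648

main :: IO ()
main = do
  let xs = map fromIntegral (take 1000000 (iterate lcg 42)) :: [Double]
  -- 'last' forces the entire sorted list to be produced.
  print (last (sort xs))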
> By the above reasoning, if I were to run any program using arrays on a system with a two-space garbage collector (which copies all live objects during each GC), I could say the worst-case algorithmic complexity was O(n). That doesn't sound right.
Can you write a program that demonstrates this effect as I did?
> I could take this further and say that in a virtual memory system, there is a chance that the whole heap gets copied to the disk and back between each array update.
Can you write a program that demonstrates this effect as I did?

On Saturday 21 November 2009 18:57:21 you wrote:
> I'm not a Haskell expert; in fact, I'm a beginner, and that's why I'm here. But I read your blog post on the subject of this hash table program, and it seems to me from reading the comments in reply that you're just here trolling: you're using an algorithm that is fundamentally not purely functional, and of course that's going to be slower, just like asking Joel Zumaya to pitch left-handed.
> If you were being honest about your complaint, you'd make an apples-to-apples comparison, and a number of commenters on your blog have proposed implementations that perform much better than your example.
> Sorry if I'm just being a jerk,
Not at all, that is a perfectly reasonable concern, but I did already try to address it in my previous post:
> In practice, you would turn to a purely functional dictionary in Haskell based upon balanced binary trees in order to work around this long-standing bug in the GC, but those trees incur O(log n) indirections and typically run orders of magnitude slower than a decent hash table.
For example, the following Haskell program builds a purely functional Data.Map:

module Main where

import Prelude hiding (lookup)
import Data.Map (empty, insert, lookup, size)

n = 1000000

build m 0 = m
build m n = build (insert x (1.0 / x) m) (n-1)
  where x = fromIntegral n :: Double

main = do
  let m = build empty n
      (Just v) = lookup 100 m
  print $ size m
  print v

Running this program with different "n" gives:

      Data.Map
 1M:   2.797s  1.0x
 2M:   6.090s  2.2x
 4M:  14.226s  5.1x
 8M:  28.449s 10.2x
16M:  83.171s 29.7x

This is several times faster than Haskell's Data.Hashtable (because of the long-standing bug in their GC that I described) and is scaling better. However, if you compare with the timings I gave before:

      Data.Hashtable          OCaml                F#
 1M:    3.198s   1.0x      1.129s  1.0x      0.080s  1.0x
 2M:    8.498s   2.7x      2.313s  2.0x      0.138s  1.7x
 4M:   25.697s   8.0x      4.567s  4.0x      0.281s  3.5x
 8M:   97.994s  30.6x     10.450s  9.3x      0.637s  8.0x
16M:  388.080s 121.4x     23.261s 20.6x      1.338s 16.7x

you'll see that the absolute performance of this idiomatic Haskell solution is still absolutely awful: consistently about 50x slower than the F#. This is also true in the context of sorting: Haskell's standard library routines for sorting are orders of magnitude slower than those found in most other compiled languages.

Suffice to say, idiomatic Haskell is also nowhere near being in the same ballpark as C++ with respect to performance. Realistically, with enough expertise you should be able to optimize most of your Haskell programs to beat Python's performance, but there are some important cases where you will not even be able to do that.

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e