
> On Mon, 30 Jul 2001, Simon Marlow wrote:
>
>> I've looked at a few. The most common factor in the worst performers seems to be the performance of String-related operations. I tried converting a few to use PackedStrings but that didn't help much. I suspect our PackedString implementation could do with some tuning, and we could gain considerable benefit from having an hGetLinePS function (like hGetLine but gets a PackedString).
>
> Well, I think the common factor is I/O. The lazy I/O seems to be a real bottleneck here. Trying to improve that would gain even more, I think. I'm saying this because I did quite a bit of fiddling with some of the examples (including trying PackedStrings). The conclusion I drew from doing this was that the I/O performance was the problem.

Agreed - I/O is indeed a bottleneck, but I believe it's the strings-as-lists-of-characters aspect of I/O rather than the lazy aspect that's the killer.

Enclosed is a version of the spell-checker program that is about a factor of 3 faster than the one that Doug currently has; it uses PackedStrings and a home-grown hash table. The only down side is that it reads the entire input into a big PackedString before producing anything, and profiling suggests that it spends about half its time splitting the big string into lines. I think if I hack up an hGetLinePS I might be able to get another factor of 2 out of it.

Cheers,
Simon
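
The enclosed program is not reproduced here. As a rough illustration of the two strategies discussed above (read everything and then split into lines, versus fetch one packed line at a time), here is a minimal sketch. It is not the enclosed spell-checker: it assumes the modern bytestring package, whose Data.ByteString.Char8.hGetLine plays roughly the role proposed for hGetLinePS, and it swaps the home-grown hash table for Data.Set from containers. The names mkDict, misspelt, checkAll and checkHandle are made up for this example.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.Set as Set
import System.Environment (getArgs)
import System.IO (Handle, hIsEOF, hPutStrLn, stderr, stdin)

-- | Dictionary of known words, held as packed strings rather than [Char].
--   (Assumption: a Data.Set stands in for the home-grown hash table.)
type Dict = Set.Set B.ByteString

-- | Build the dictionary from a packed buffer holding one word per line.
mkDict :: B.ByteString -> Dict
mkDict = Set.fromList . B.lines

-- | Words of a single line that do not occur in the dictionary.
misspelt :: Dict -> B.ByteString -> [B.ByteString]
misspelt dict = filter (`Set.notMember` dict) . B.words

-- | Whole-input strategy: the entire text is already in memory as one
--   packed string and is split into lines before any checking happens.
checkAll :: Dict -> B.ByteString -> [B.ByteString]
checkAll dict = concatMap (misspelt dict) . B.lines

-- | Line-at-a-time strategy: fetch one packed line per iteration, so
--   results are printed as soon as each line has been read.
checkHandle :: Dict -> Handle -> IO ()
checkHandle dict h = loop
  where
    loop = do
      eof <- hIsEOF h
      if eof
        then return ()
        else do
          line <- B.hGetLine h
          mapM_ B.putStrLn (misspelt dict line)
          loop

-- | Usage: spell DICTIONARY < text
main :: IO ()
main = do
  args <- getArgs
  case args of
    [dictPath] -> do
      dict <- mkDict <$> B.readFile dictPath
      checkHandle dict stdin
    _ -> hPutStrLn stderr "usage: spell DICTIONARY < text"
```

Both strategies share misspelt, so the only difference between them is where the line splitting happens: checkAll pays for it up front on one big buffer, while checkHandle leaves it to the I/O layer. Whether that yields anything like the factor of 2 estimated above would have to be measured.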