
On Tue, Oct 08, 2002 at 07:52:52AM -0700, Hal Daume III wrote:
I already saw one reply mentioning using state threaded arrays. This is probably your best (in terms of performance) option.
Keep in mind that Haskell arrays are *not* your standard imperative arrays. Like imperative arrays, they give you O(1) lookup, but only O(n) insert. If you want to keep with a functional style, I'd suggest using a different data structure for the data. A FiniteMap should work fine and should give you pretty good (at least much better) performance. And you won't have to give up the FP style (for whatever that's worth).
Thanks everyone for the advice! I think I now understand pretty well why my old code didn't work. I ended up switching to a binary tree structure, since I didn't really need O(1) lookup anyways (since I had to do a binary search in the first place to decide which element if any to modify. This changes the algorithm as a whole (I believe) from being O(n^2) back to O(n log n) as it should be. I get a speedup of about a factor of 6 for the test files I was using (of course, this would depend on file size), and find that now only 2% of my time is spent in that function. I'm still something like 100 times slower than GNU diff, so I think (hope, certainly!) there's still room for more improvement (another day). I'm sure I'll have more questions in a few days, once I've tracked down what the new bottlenecks are. -- David Roundy http://civet.berkeley.edu/droundy/