Re: need help optimizing a function

9 Oct 2002


      On Tue, Oct 08, 2002 at 07:52:52AM -0700, Hal Daume III wrote:
...
I already saw one reply mentioning using state threaded arrays.  This is
probably your best (in terms of performance) option.
Keep in mind that Haskell arrays are *not* your standard imperative
arrays.  Like imperative arrays, they give you O(1) lookup, but only
O(n) insert.  If you want to keep with a functional style, I'd suggest
using a different data structure for the data.  A FiniteMap should work
fine and should give you pretty good (at least much
better) performance.  And you won't have to give up the FP style (for
whatever that's worth).
Thanks everyone for the advice! I think I now understand pretty well why my
old code didn't work.  I ended up switching to a binary tree structure,
since I didn't really need O(1) lookup anyways (since I had to do a binary
search in the first place to decide which element if any to modify.  This
changes the algorithm as a whole (I believe) from being O(n^2) back to O(n
log n) as it should be.

I get a speedup of about a factor of 6 for the test files I was using (of
course, this would depend on file size), and find that now only 2% of my
time is spent in that function.  I'm still something like 100 times slower
than GNU diff, so I think (hope, certainly!) there's still room for more
improvement (another day).  I'm sure I'll have more questions in a few
days, once I've tracked down what the new bottlenecks are.
-- 
David Roundy
http://civet.berkeley.edu/droundy/