
Hi Pete,
Recently I read an interesting article by Peter Norvig[1] on how to write a spelling corrector in 21-lines of Python. I wanted to try and implement it in Haskell. My implementation is terribly slow and I was hoping for some tips on how to improve it and make it idiomatic.
I had a quick look at this. One way to speed up your program is by replacing the sets of altered words by lists. Filtering out doubles is a noble cause, but in this case it takes more time than keeping them around and doing some extra lookups in your frequency map. (I tried this, it gave a speedup factor of ~3.)
I'd love to see other Haskell implementations as well if anyone has a few moments to spare. Admittedly, it took me several hours to get my version working, but I'm a Haskell newbie. Unfortunately, I think it runs as slow as it took me to write it! There is definitely something wrong with it, a memory leak, because I can't correct more than a few words without a great deal of memory being consumed.
Be careful when you apply the |train| function to more than one word; in this form it may compute the frequency map from start for each invocation. (It is better to let |train| take a frequency map, instead of a list of words.) Also be sure to compile your program with full optimisation ('-O2' for ghc).