
kamil:
Hi all,
I want to learn how to optimize Haskell code on the Norvig's spelling correction program [1]. Peter Norvig has written the program in Python, quite a literate translation to Haskell (thanks to Grzegorz Chrupala) [2] was also available.
Both versions are about 23 lines of code, but the Haskell version is four times slower: 2m25s versus 36s. All the numbers I give come from running the program on a list of 806 misspellings, with the Haskell version compiled with the -O2 and -fforce-recomp flags.
I started with a trial-and-error approach. Just preventing a repeated call to keysSet brought the running time down to 1m48s. I also restructured the edits1 function and dropped all uses of sets, which weren't pulling their weight. That brought the running time down to 55s. Not quite there yet, but close.
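The keysSet fix can be sketched roughly like this (a minimal illustration only; the function and type names here are assumptions, not the actual code from [2]). The point is that Map.keysSet walks the whole Map to build a fresh Set, so calling it once per candidate word is wasteful when Map.member answers the same question directly:

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- Hypothetical dictionary type; the real program maps words to counts.
type Dict = Map.Map String Int

-- Before: Map.keysSet rebuilds an O(n) Set on every call.
knownSlow :: Dict -> [String] -> [String]
knownSlow dict = filter (`Set.member` Map.keysSet dict)

-- After: membership tested directly on the Map, no intermediate Set.
knownFast :: Dict -> [String] -> [String]
knownFast dict = filter (`Map.member` dict)
```

Both versions compute the same result; only the repeated Set construction disappears.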
Reading the Profiling chapter of the RWH book enabled me to do some more informed optimizing. At this point GC time was about 2% of the overall running time; Grzegorz had done a good job of using strict versions of functions where necessary. The per-expression timing, however, showed something useful: 70% of the time was being spent determining membership in the known-words dictionary (Data.Map). There are about 30,000 items in the dictionary.
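For anyone following along, attributing time to a specific expression works via SCC cost-centre annotations, as the RWH profiling chapter describes. A minimal sketch (the names here are assumptions, not the actual code from [2]):

```haskell
import qualified Data.Map as Map

-- Hypothetical dictionary type; the real program maps words to counts.
type Dict = Map.Map String Int

-- The SCC pragma labels this expression as its own cost centre, so a
-- profiled run attributes the membership-test time to "dictMember".
known :: Dict -> [String] -> [String]
known dict = filter (\w -> {-# SCC "dictMember" #-} Map.member w dict)
```

Compile with -prof -auto-all (plus -O2) and run the program with +RTS -p to get the per-cost-centre breakdown in a .prof file.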
What are the keys in your Map? Strict ByteStrings? And you're using ghc 6.10.x? -- Don
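Don's question matters because with String keys each of the O(log n) comparisons per lookup walks a lazy linked list of Chars; strict ByteString keys compare packed bytes instead. A minimal sketch of the switch (mkDict and memberBS are hypothetical names for illustration, not code from [2]):

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.Map as Map

-- Dictionary keyed by strict ByteStrings instead of Strings.
type Dict = Map.Map B.ByteString Int

-- Build the dictionary, packing each word once at construction time.
mkDict :: [(String, Int)] -> Dict
mkDict = Map.fromList . map (\(w, c) -> (B.pack w, c))

-- Lookups pack the query word and compare packed bytes in the Map.
memberBS :: String -> Dict -> Bool
memberBS w = Map.member (B.pack w)
```

The same Data.Map API applies unchanged; only the key type and the packing at the boundaries differ.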