
On 4/22/07, Pete Kazmier
Worse - and this is true for ByteStrings, too - String comparisons are O(n), which means lookups in Sets and Maps are expensive. A trie (i.e, a search tree where each internal node corresponds to a word prefix, and has one branch per letter in the alphabet) will give you lookup that scales with word size (and possibly alphabet size).
Right. My first version was just a direct translation of Norvig's code with an emphasis on trying to keep the complexity and size of code to a minimum.
AFAIK, Python's dictionaries are implemented using hashtables, so lookups with them are much faster than with M.Map String Int. So should a direct translation use a Data.HashTable (despite its unpure interface)? But probably a trie should be better than both (just gambling ;-). Cheers, -- Felipe.