
Ketil Malde wrote:
Lyle Kopnicky
writes: If you have some other metric other than prefix in mind for partial matches, then things probably get a lot more complicated. You're probably looking at calculating minimum distances in some feature-space, which calls for pretty sophisticated algorithms if you need good performance.
Yes, that's the kind of thing I'm looking at doing. Looking at edit distance.
Do you really need that to search for movie titles? At any rate, an exact-match finite-map implementation is a good start - to get good performance, you probably will need to use some kind of index to reduce the amount of data to search exhaustively (all-against-all).
For text searching I think it is effective to use an index that maps from words (so that looking up a word gives you all the movies with that word in the title).
-k
If you really need to do edit distance, perhaps something like "Low Distortion Embeddings for Edit Distance" http://www.cs.ucla.edu/~rafail/PUBLIC/68.pdf would help? This is just one of the first result from a search, probably there are plenty of things out there. Make up from any constant factors from your language choice with advanced algorithms! (and any constant is getting pretty small these days) Brandon