On Nov 29, 2007 5:31 AM, Reinier Lamers <reinier.lamers@phil.uu.nl> wrote:
Especially in fuzzy cases like this one, NLP often turns to machine
learning models. One could try to train a hidden Markov model or a
support vector machine to label parts of the string as "name", "street",
"number", "city", etc. These techniques work very well for
part-of-speech tagging in natural language, and this seems similar. However, you
need a manually annotated set of examples to train the models. If you
really have a big load of data and it seems like a good solution, you
could use an off-the-shelf part-of-speech tagger like SVMTool
(http://www.lsi.upc.edu/~nlp/SVMTool/ ) to do it.
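
To make the hidden Markov model idea concrete, here is a minimal sketch in
Python. It is not SVMTool and not a production tagger. The training
addresses, the label set, the observe() helper and the add-one smoothing are
all assumptions made up for illustration; real data would need far more
annotated examples and better features.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-annotated examples: one list of (token, label) per address.
TRAIN = [
    [("John", "name"), ("Smith", "name"), ("Main", "street"),
     ("Street", "street"), ("12", "number"), ("Utrecht", "city")],
    [("Anna", "name"), ("Jansen", "name"), ("Church", "street"),
     ("Lane", "street"), ("7", "number"), ("Amsterdam", "city")],
    [("Piet", "name"), ("Vries", "name"), ("Station", "street"),
     ("Road", "street"), ("101", "number"), ("Leiden", "city")],
]

def observe(token):
    """Map a token to an observation symbol (all digit strings share one)."""
    return "<digits>" if token.isdigit() else token.lower()

def train(examples):
    """Count label-to-label transitions and label-to-observation emissions."""
    trans = defaultdict(Counter)  # trans[previous_label][label]
    emit = defaultdict(Counter)   # emit[label][observation]
    for example in examples:
        prev = "<s>"              # pseudo-label marking the start of a string
        for token, label in example:
            trans[prev][label] += 1
            emit[label][observe(token)] += 1
            prev = label
    labels = sorted(emit)
    vocab = {obs for counts in emit.values() for obs in counts}
    return trans, emit, labels, vocab

def log_prob(counts, key, n_outcomes):
    """Add-one smoothed log probability, so unseen events are not impossible."""
    return math.log((counts[key] + 1) / (sum(counts.values()) + n_outcomes))

def tag(tokens, model):
    """Return the most likely label sequence for tokens (Viterbi decoding)."""
    if not tokens:
        return []
    trans, emit, labels, vocab = model
    n_labels = len(labels)
    n_obs = len(vocab) + 1        # one extra slot for unseen observations
    obs = [observe(t) for t in tokens]
    # Each column maps label -> (best log score ending here, previous label).
    best = [{lab: (log_prob(trans["<s>"], lab, n_labels)
                   + log_prob(emit[lab], obs[0], n_obs), None)
             for lab in labels}]
    for o in obs[1:]:
        column = {}
        for lab in labels:
            score, prev = max(
                (best[-1][p][0] + log_prob(trans[p], lab, n_labels), p)
                for p in labels)
            column[lab] = (score + log_prob(emit[lab], o, n_obs), prev)
        best.append(column)
    # Follow the back-pointers from the best final label.
    label = max(labels, key=lambda lab: best[-1][lab][0])
    path = [label]
    for column in reversed(best[1:]):
        label = column[label][1]
        path.append(label)
    return list(reversed(path))

model = train(TRAIN)
print(tag("Anna Smith Church Road 9 Delft".split(), model))
```

The last token in that usage line never occurs in the training data, so the
model can only place it by what usually follows a "number". That is the kind
of context a hand-written rule set tends to miss, and also why the quality
of the annotated examples matters so much.
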
Reinier