
On Nov 29, 2007 5:31 AM, Reinier Lamers wrote:
Especially in fuzzy cases like this one, NLP often turns to machine learning models. One could try to train a hidden Markov model or a support vector machine to label parts of the string as "name", "street", "number", "city", etc. These techniques work very well for part-of-speech tagging in natural language, and this problem seems similar. However, you need a manually annotated set of examples to train the models. If you really have a large amount of data and this seems like a good solution, you could use an off-the-shelf part-of-speech tagger like SVMTool (http://www.lsi.upc.edu/~nlp/SVMTool/) to do it.
Reinier
Hi Reinier,
Thanks for the link to SVMTool. I don't have the background to follow most of the NLP articles I found, and I get stuck on the first bits of NLP jargon. For me, using an existing tool will be easier than building a new one. I'm currently reading the tool's documentation and it looks quite promising: it seems very generic and highly reusable.
Cheers,
Olivier
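
For readers who want to see what the sequence-labelling idea from Reinier's message looks like in practice, here is a minimal sketch in Python. It is not SVMTool and does not use its interface; it is a tiny hidden Markov model trained by counting over a hand-annotated set, with Viterbi decoding. The training examples, the label set, and the function names (`observe`, `train`, `tag`) are all made up for illustration, and a real data set would need far more annotated examples.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-annotated examples: every token is paired with its label.
TRAIN = [
    [("John", "name"), ("Smith", "name"), ("12", "number"),
     ("Main", "street"), ("Street", "street"), ("Paris", "city")],
    [("Anna", "name"), ("Lee", "name"), ("7", "number"),
     ("Oak", "street"), ("Avenue", "street"), ("London", "city")],
    [("Marie", "name"), ("Curie", "name"), ("221", "number"),
     ("Baker", "street"), ("Street", "street"), ("Berlin", "city")],
]

def observe(token):
    """Map a token to an observation symbol; all digit strings share one symbol."""
    return "<num>" if token.isdigit() else token.lower()

def train(examples):
    """Count start, transition and emission events in the annotated examples."""
    labels = sorted({lab for ex in examples for _, lab in ex})
    vocab = {observe(tok) for ex in examples for tok, _ in ex}
    start, trans, emit = Counter(), defaultdict(Counter), defaultdict(Counter)
    for ex in examples:
        prev = None
        for tok, lab in ex:
            emit[lab][observe(tok)] += 1
            if prev is None:
                start[lab] += 1
            else:
                trans[prev][lab] += 1
            prev = lab
    return labels, vocab, start, trans, emit

def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability, so unseen events never get probability zero."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))

def tag(tokens, model):
    """Return the most likely label sequence for the tokens (Viterbi decoding)."""
    labels, vocab, start, trans, emit = model
    n_obs = len(vocab) + 1  # one extra outcome reserved for unseen words
    obs = [observe(t) for t in tokens]
    if not obs:
        return []
    score = {s: log_prob(start, s, len(labels)) + log_prob(emit[s], obs[0], n_obs)
             for s in labels}
    back = []  # back[i][s] = best previous label when position i + 1 has label s
    for o in obs[1:]:
        new_score, pointers = {}, {}
        for s in labels:
            best = max(labels,
                       key=lambda p: score[p] + log_prob(trans[p], s, len(labels)))
            new_score[s] = (score[best] + log_prob(trans[best], s, len(labels))
                            + log_prob(emit[s], o, n_obs))
            pointers[s] = best
        score = new_score
        back.append(pointers)
    path = [max(labels, key=score.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

model = train(TRAIN)
print(tag("Anna Smith 42 Baker Avenue Paris".split(), model))
```

The point of the sketch is the one Reinier makes: the model itself is simple, and the real cost is the manually annotated training set. Words the model has never seen are labelled mostly from their position in the sequence, which is why a larger and more varied set of examples matters.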