Re: [Haskell-cafe] Parsing unstructured data

5 Dec 2007

On Nov 29, 2007 5:31 AM, Reinier Lamers wrote:
...
Especially in the fuzzy cases like this one, NLP often turns to machine
learning models. One could try to train a hidden Markov model or support
vector machines to label parts of the string as "name", "street",
"number", "city", etc. These techniques work very well for part-of-speech
tagging in natural language, and this seems similar. However, you
need a manually annotated set of examples to train the models. If you
really have a big load of data and it seems like a good solution, you
could use an off-the-shelf part-of-speech tagger like SVMTool
(http://www.lsi.upc.edu/~nlp/SVMTool/)
to do it.
Reinier
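
As a rough illustration of the hidden Markov model idea above, here is a
minimal sketch in Python. It is not how SVMTool works internally. Everything
in it is an assumption made for the example: the toy annotated addresses in
TRAIN, the four labels, the coarse shape() feature used in place of the raw
words, and the add-alpha smoothing constant. It estimates transition and
emission probabilities by counting the annotated examples, then labels a new
string with Viterbi decoding.

```python
import math
from collections import defaultdict

LABELS = ["name", "number", "street", "city"]
SHAPES = ["NUM", "Cap", "other"]
START, END = "<s>", "</s>"
ALPHA = 0.1  # add-alpha smoothing so unseen events keep a small probability

# Hypothetical hand-annotated examples: lists of (token, label) pairs.
TRAIN = [
    [("John", "name"), ("Smith", "name"), ("12", "number"),
     ("Main", "street"), ("Street", "street"), ("Boston", "city")],
    [("Anna", "name"), ("Jones", "name"), ("7", "number"),
     ("Oak", "street"), ("Avenue", "street"), ("Denver", "city")],
]

def shape(token):
    """Map a token to a coarse shape so unseen words still get an emission."""
    if token.isdigit():
        return "NUM"
    if token[:1].isupper():
        return "Cap"
    return "other"

def train(examples):
    """Estimate smoothed log-probabilities by counting labelled examples."""
    trans = defaultdict(lambda: defaultdict(float))
    emit = defaultdict(lambda: defaultdict(float))
    for example in examples:
        prev = START
        for token, label in example:
            trans[prev][label] += 1
            emit[label][shape(token)] += 1
            prev = label
        trans[prev][END] += 1

    def log_probs(counts, outcomes):
        total = sum(counts.values()) + ALPHA * len(outcomes)
        return {o: math.log((counts.get(o, 0.0) + ALPHA) / total)
                for o in outcomes}

    log_trans = {s: log_probs(trans[s], LABELS + [END])
                 for s in [START] + LABELS}
    log_emit = {s: log_probs(emit[s], SHAPES) for s in LABELS}
    return log_trans, log_emit

def viterbi(tokens, log_trans, log_emit):
    """Return the most probable label sequence for the tokens."""
    if not tokens:
        return []
    obs = [shape(t) for t in tokens]
    # score[s] = best log-probability of any path ending in state s
    score = {s: log_trans[START][s] + log_emit[s][obs[0]] for s in LABELS}
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for o in obs[1:]:
        pointers, new_score = {}, {}
        for s in LABELS:
            best = max(LABELS, key=lambda p: score[p] + log_trans[p][s])
            pointers[s] = best
            new_score[s] = score[best] + log_trans[best][s] + log_emit[s][o]
        back.append(pointers)
        score = new_score
    # Close the sequence with the transition into END, then walk back.
    last = max(LABELS, key=lambda s: score[s] + log_trans[s][END])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

model = train(TRAIN)
print(viterbi("Marie Curie 221 Baker Road Lyon".split(), *model))
```

With only two training examples this mostly learns the order of the fields.
Real data would need many more annotated strings and richer features than a
three-way token shape.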

Hi Reinier,

Thanks for the link to SVMTool. I don't have the background to understand
most of the NLP articles I found, and I get stuck on the first bits of NLP
jargon. For me, using an existing tool will be easier than building a new
one. I'm currently looking at the tool's documentation and it looks quite
promising. It seems to be very generic and highly reusable.

Cheers,

Olivier.

Olivier Boudry