Re: [Haskell-cafe] NLP libraries and tools?

On 7/6/11 8:46 PM, Richard O'Keefe wrote:
>> I've been working over the last year+ on an optimized HMM-based POS tagger/supertagger with online tagging and anytime n-best tagging. I'm planning to release it this summer (i.e., by the end of August), though there are a few things I'd like to polish up before doing so. In particular, I want to make the package less monolithic. When I release it I'll make announcements here and on the nlp@ list.
>
> One of the issues I've had with a POS tagger I've been using is that it makes some really stupid decisions which can be patched up with a few simple rules, but since it's distributed as a .jar file I cannot add those rules.

How horrid. I assume the problem is really that the trained model is in the jar and you can't do your own training? Or is this a Brill-like tagger where you really mean to add new rules?

If an HMM-based tagger is amenable, you could try switching to Daniël de Kok's Java port of TnT:

    https://github.com/danieldk/jitar

The tagger I'm working on does support being hooked up to a Java client (i.e., consumer of tagging info), but it's fairly ugly due to Java's refusal to believe in IPC.

-- 
Live well,
~wren
wren ng thornton