
> For starters, I'm considering writing a polyparse [1] instance for Text.
> However, even with the current ByteString instances for polyparse there
> seems to be an emphasis on character-based parsing.

Polyparse is not very character-oriented at all; I tend to write a separate lexer, and then write the parsers over an application-specific token stream. But ByteStrings are certainly character-oriented, and since many people like to mix lexing with parsing, I included an instance for ByteString. I imagine that if someone wants to lex directly from a Text rather than a String or a ByteString, then that process is likely to be very character-oriented as well.

Having said that, on those occasions when I do parse directly from a String-like input, almost all of the parsers use a "word" parser (i.e. multi-character, space-separated) as if it were a primitive. Such a word parser would almost certainly make heavy use of (break isSpace), unless there is a better alternative in Text?

> Is that the correct way of doing things? For example, what would be the
> best way to try to parse a text value when you don't care about case?

When case is irrelevant, I tend to (map toUpper) over both the input stream and any textual arguments to individual parsers.

Regards,
Malcolm
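
For illustration only, here is a minimal sketch of the two ideas discussed above: a "word" primitive built on (break isSpace), and a case-insensitive match via (map toUpper). It is not polyparse code. The `Parser` type and the names `word` and `keywordCI` are hypothetical, and it works on plain String using only base. It also upper-cases just the word being compared rather than the whole input stream, which is a small variation on the approach described. Data.Text exports `break` and `toUpper` functions that play the same roles for Text.

```haskell
import Data.Char (isSpace, toUpper)

-- Hypothetical minimal parser type (not polyparse's): consume a prefix of
-- the input and return a result together with the remaining input.
type Parser a = String -> Maybe (a, String)

-- A "word" primitive: skip leading whitespace, then take characters up to
-- the next whitespace with (break isSpace). Fails if no word is present.
word :: Parser String
word input =
  case break isSpace (dropWhile isSpace input) of
    ("", _)   -> Nothing
    (w, rest) -> Just (w, rest)

-- Case-insensitive keyword match: upper-case both the expected keyword and
-- the word read from the input before comparing them.
keywordCI :: String -> Parser ()
keywordCI expected input = do
  (w, rest) <- word input
  if map toUpper w == map toUpper expected
    then Just ((), rest)
    else Nothing
```

With these definitions, `word "  hello world"` evaluates to `Just ("hello", " world")`, and `keywordCI "select" "SeLeCt * from t"` succeeds, leaving `" * from t"` as the remaining input.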