
Hi folks,

I'm interested in writing a library to work with IMAP servers.

I'm interested in thoughts people have on parsing libraries and methods. I'm a huge fan of Parsec overall -- it lets me have a single-stage parser, for instance. But it isn't sufficiently lazy for this task, and I probably will need to deal with ByteStrings instead of Strings, since some IMAP messages may be 30MB or more.

So to give a very, very brief rundown of RFC3501, there are lots of ways that an IMAP server can encode things. For instance, we could see this:

    A283 SEARCH "TEXT" "string not in mailbox"

which is the same as:

    A283 SEARCH TEXT "string not in mailbox"

and the same as:

    A283 SEARCH {4} "string not in mailbox"
    TEXT

The braces mean that the given number of octets follows after the CRLF at the end of the given line. We could even see:

    A283 SEARCH {4} {21}
    TEXTstring not in mailbox

Note that when downloading messages, I would fully expect to see things like

    * FETCH {10485760}

representing a 10MB message. Also, quoted strings have escaping rules.

[ please note that the above is paraphrased and isn't really true RFC3501, for simplicity's sake ]

Now then... some goals.

1) Ideally I could parse stuff lazily. I have tried this with FTP and it is more complex than it seems at first, due to making sure you never, never, never consume too much data. But being able to parse lazily would make it so incredibly easy to issue a command saying "download all new mail", and things get written to disk as they come in, with no buffer at all.

2) Avoiding Strings wherever possible.

3) Avoiding complex buffering schemes where I have to manually buffer data packets.

Thoughts and ideas?

BTW, if any of you have heard of OfflineIMAP, yes I am considering rewriting OfflineIMAP in Haskell.

-- John
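
For illustration only, here is a minimal Haskell sketch of the brace-literal rule described above. It is not a real RFC3501 parser: it assumes the plain form in which {n} is immediately followed by CRLF and then exactly n octets, and it works on lazy ByteStrings from the bytestring package. The name parseLiteral is hypothetical.

```haskell
import qualified Data.ByteString.Lazy.Char8 as L
import Control.Monad (guard)

-- | Hypothetical helper: parse "{n}\r\n" followed by n octets.
-- Returns the payload and the unconsumed remainder, or Nothing
-- if the input does not start with a well-formed literal header.
--
-- Assumption: the payload length is NOT verified here, so the payload
-- stays lazy. A truncated stream would yield a short payload.
parseLiteral :: L.ByteString -> Maybe (L.ByteString, L.ByteString)
parseLiteral input = do
  -- The literal must open with a brace.
  ('{', afterBrace) <- L.uncons input
  -- Read the decimal octet count.
  (n, afterCount) <- L.readInt afterBrace
  guard (n >= 0)
  -- The count must be closed by "}" and followed directly by CRLF.
  let headerEnd = L.pack "}\r\n"
  guard (headerEnd `L.isPrefixOf` afterCount)
  let body = L.drop (L.length headerEnd) afterCount
  -- Take exactly n octets as the payload; splitAt on a lazy
  -- ByteString only walks the chunks it needs.
  pure (L.splitAt (fromIntegral n) body)
```

With this sketch, parseLiteral applied to "{4}\r\nTEXT ..." yields the four octets TEXT plus whatever follows. Because L.splitAt is lazy, the payload of something like a {10485760} literal could in principle be streamed to disk chunk by chunk, provided the caller does not hold on to the head of the payload while consuming the remainder.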