
For a toy project I want to parse the output of a program. The program runs on someone else's machine and mails me the results, so I only have access to the output it generates. Unfortunately, the output is intended to be human-readable, and this makes parsing it a bit of a pain. Here are some sample lines from its output:

    France: Army Marseilles SUPPORT Army Paris -> Burgundy.
    Russia: Fleet St Petersburg (south coast) -> Gulf of Bothnia.
    England: 4 Supply centers, 3 Units: Builds 1 unit.
    The next phase of 'dip' will be Movement for Fall of 1901.

I've been using Parsec and it has felt rather complicated. For example, a "location" is a series of words, possibly followed by something in parentheses, except that the word SUPPORT is never part of a location. And that "Supply centers" line ends up as code filled with stuff like "char ':'; skipMany space".

I actually have a separate parser written in JavaScript with a bunch of regular expressions, and it's far shorter than my Haskell one. That makes sense, as munging this sort of text feels to me more like a regexp job than a careful parsing job.

I'm considering writing a preprocessing stage in Ruby or Perl that munges those output lines into something a bit more "machine-readable", but before I do that I thought I'd ask here if anyone has any pointers, hints, or better ideas.
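To make the two pain points concrete, here is a minimal sketch of how the Parsec side might be tightened up. It is not the parser described above, only an illustration under a few assumptions: words consist of letters only (so a hyphenated place name would need a wider `word`), SUPPORT is the only reserved keyword, and only the order shapes visible in the sample lines are handled. The names `lexeme`, `symbol`, `word`, `coast`, `location`, `orderLine` and `supplyLine`, and the data types, are all made up for this example. The idea is that a `lexeme` wrapper eats trailing spaces once, so individual parsers no longer need "skipMany space", and that `word` rejects the keyword inside `try`, so a location simply stops in front of SUPPORT.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical types for this sketch; not taken from the original program.
data Unit = Army | Fleet deriving (Eq, Show)
data Location = Location String (Maybe String) deriving (Eq, Show)
data Order
  = Move Unit Location Location
  | Support Unit Location Order
  deriving (Eq, Show)

-- Run a parser, then skip trailing spaces, so callers never have to.
lexeme :: Parser a -> Parser a
lexeme p = p <* skipMany (char ' ')

-- A fixed piece of text; 'try' makes a partial match consume nothing.
symbol :: String -> Parser String
symbol = lexeme . try . string

-- Assumption: SUPPORT is the only keyword that can follow a location.
reserved :: [String]
reserved = ["SUPPORT"]

-- A run of letters that is not a reserved keyword.
word :: Parser String
word = lexeme . try $ do
  w <- many1 letter
  if w `elem` reserved then unexpected ("keyword " ++ w) else return w

-- Optional qualifier such as "(south coast)".
coast :: Parser String
coast = lexeme (between (char '(') (char ')') (many1 (noneOf ")")))

-- One or more words, optionally followed by a parenthesised qualifier.
location :: Parser Location
location = Location <$> (unwords <$> many1 word) <*> optionMaybe coast

unit :: Parser Unit
unit = (Army <$ symbol "Army") <|> (Fleet <$ symbol "Fleet")

-- Only the two shapes from the sample lines: a move, or support of an order.
order :: Parser Order
order = do
  u    <- unit
  from <- location
  (Move u from <$> (symbol "->" *> location))
    <|> (Support u from <$> (symbol "SUPPORT" *> order))

-- "<Power>: <order>."
orderLine :: Parser (String, Order)
orderLine = do
  power <- word
  _     <- symbol ":"
  o     <- order
  _     <- symbol "."
  eof
  return (power, o)

number :: Parser Int
number = lexeme (read <$> many1 digit)

-- "<Power>: N Supply centers, M Units: <rest>"; the rest is kept as raw text
-- because the sample shows only one wording for it.
supplyLine :: Parser (String, Int, Int, String)
supplyLine = do
  power   <- word
  _       <- symbol ":"
  centers <- number
  _       <- symbol "Supply" *> symbol "centers,"
  units   <- number
  _       <- symbol "Units:"
  rest    <- many anyChar
  return (power, centers, units, rest)
```

In GHCi, something like `parseTest orderLine "Russia: Fleet St Petersburg (south coast) -> Gulf of Bothnia."` should show how a given line is broken down. Whether this ends up shorter than a handful of regular expressions is a fair question; the sketch only aims to show that most of the whitespace noise can be pushed into one helper.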