Hello,

I have some input which is divided up into segments like this:


["foo", "hi", "there", "world"]

I want to use Parsec to parse the segments. I am looking for a way to use Char parsers on each segment, but also to parse the list of segments as a whole.


I have an implementation which works, but I am not that satisfied with it. An example parser looks like this:

testp :: GenParser String () (Char, String, String)
testp =  
  do segment "foo"
     st <- p2u (char 'h' >> char 'o')
     sg <- anySegment
     sg' <- anySegment
     return (st,sg, sg')

If you use it to parse ["foo", "hi", "there", "world"] you get:

*Main> test
["foo","hi","there","world"] (segment 2 character 2): 
unexpected "i"
expecting "o"

This is good -- the error tells you the segment and the offset. The combinators segment and anySegment match on String tokens. The p2u function converts a Char parser into a String parser.
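
For readers without the attachment, here is a minimal sketch of how token-level combinators like these could be written with tokenPrim. It is a guess at the general shape, not the attached code: the helper name nextSegPos and the choice to reuse the line number as the segment index are assumptions made for illustration.

```haskell
import Text.ParserCombinators.Parsec

-- Position update after one String token: the line number is reused as the
-- segment index (an assumption of this sketch) and the column is reset.
nextSegPos :: SourcePos -> String -> [String] -> SourcePos
nextSegPos pos _ _ = setSourceColumn (incSourceLine pos 1) 1

-- Accept any single segment and return it.
anySegment :: GenParser String st String
anySegment = tokenPrim show nextSegPos Just

-- Accept only a segment equal to the expected string.
segment :: String -> GenParser String st String
segment expected = tokenPrim show nextSegPos match <?> show expected
  where
    match tok = if tok == expected then Just tok else Nothing
```

Loaded into GHCi, runParser (segment "foo" >> anySegment) () "" ["foo", "hi", "there"] should evaluate to Right "hi".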

The primary issue is that I cannot figure out how to implement p2u so that it works under both Parsec 2 and Parsec 3 without changes.

The secondary issue is that I feel like the DSL is not that great. I would rather write the above parser like this:

testp :: GenParser String () (Char, String, String)
testp =
  do string "foo"
     nextSegment
     st <- char 'h' >> char 'o'
     nextSegment
     sg <- many anyChar
     nextSegment
     sg' <- many anyChar
     return (st, sg, sg')

Here, combinators like 'many' would only see up to the end of the current segment. nextSegment would bring the next segment into context (or possibly fail if the current segment was not completely consumed). I could also, perhaps, reimplement anySegment as:

anySegment =
   do s <- many anyChar
      nextSegment
      return s

and write:

testp :: GenParser String () (Char, String, String)
testp =
  do string "foo"
     nextSegment
     st <- char 'h' >> char 'o'
     nextSegment
     sg <- anySegment
     sg' <- anySegment
     return (st, sg, sg')

I don't see a way to implement this on top of Parsec, though.
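
To make the intended semantics concrete, here is a rough sketch of one direction that might get close. It assumes Parsec 3's Stream class, so it does nothing for Parsec 2 compatibility. The names Segments and SegParser are hypothetical, and reusing the line number as the segment index is an assumption of the sketch.

```haskell
{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}
import Text.Parsec

-- Hypothetical wrapper: the remaining segments, current segment first.
newtype Segments = Segments [String]

-- Char parsers only ever see the current segment: an exhausted segment
-- looks like end of input until nextSegment moves on.
instance Monad m => Stream Segments m Char where
  uncons (Segments ((c:cs) : rest)) = return (Just (c, Segments (cs : rest)))
  uncons _                          = return Nothing

type SegParser = Parsec Segments ()

-- Bring the next segment into context. Fails if the current segment has
-- unconsumed characters or if there are no segments left.
nextSegment :: SegParser ()
nextSegment = do
  Segments segs <- getInput
  case segs of
    [] : rest -> do
      setInput (Segments rest)
      pos <- getPosition
      -- line = segment index, column = character offset (sketch assumption)
      setPosition (setSourceColumn (incSourceLine pos 1) 1)
    []        -> unexpected "end of segment list"
    _         -> unexpected "leftover input in current segment"

testp :: SegParser (Char, String, String)
testp = do
  _   <- string "foo"
  nextSegment
  st  <- char 'h' >> char 'o'
  nextSegment
  sg  <- many anyChar
  nextSegment
  sg' <- many anyChar
  return (st, sg, sg')
```

With this sketch, runParser testp () "" (Segments ["foo", "hi", "there", "world"]) should fail with an error positioned at line 2, column 2, which corresponds to segment 2, character 2.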

Am I missing some clever ideas here?

I have attached my code that implements the first method. Any ideas how to make it Parsec 2/3 agnostic? The primary issue is that if I use runParser on the inner Char parser, I need some way to stick a ParseError back into the parent context (without just converting it to a String and calling fail). I can see how to get/set the inputState, etc., in a portable way, but I don't see how to set the ParseError in a portable way. The attached code requires 3.1.0 because it uses mkPT / runParsecT :(
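
For comparison, the lossy fallback mentioned above (converting the inner ParseError to a String and calling fail) could look roughly like the sketch below. It is not the attached code. The name p2uLossy and the "segment" source name are made up for illustration, and whether the inner parser must consume the whole segment is left open.

```haskell
import Text.ParserCombinators.Parsec

-- Run a Char parser on the next segment. On failure the inner ParseError is
-- flattened to a String, so the outer parser only sees an opaque message and
-- loses the structured position and expectation information.
p2uLossy :: GenParser Char () a -> GenParser String st a
p2uLossy p = do
  -- Take the next String token; the line number counts segments (assumption).
  seg <- tokenPrim show (\pos _ _ -> incSourceLine pos 1) Just
  case runParser p () "segment" seg of
    Left err -> fail (show err)
    Right x  -> return x
```

This only uses runParser, tokenPrim and fail, but the inner error position survives only as text inside the failure message, which is exactly the limitation described above.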

thanks!
- jeremy