
Hello,

I have some input which is divided up into segments like this:

    ["foo", "hi", "there", "world"]

and I want to use parsec to parse the segments. I am looking for a way to use Char parsers on each segment, but also parse the list of segments as a whole.

I have an implementation which works, but I am not that satisfied with it. An example parser looks like this:

    testp :: GenParser String () (Char, String, String)
    testp =
        do segment "foo"
           st  <- p2u (char 'h' >> char 'o')
           sg  <- anySegment
           sg' <- anySegment
           return (st, sg, sg')

If you use it to parse ["foo", "hi", "there", "world"] you get:

    *Main> test ["foo","hi","there","world"]
    (segment 2 character 2):
    unexpected "i"
    expecting "o"

This is good -- the error tells you the segment and the offset.

The combinators segment and anySegment match on String tokens. The p2u function converts a Char parser into a String parser.

The primary issue is that I cannot figure out how to implement p2u so that it works under Parsec 2 and 3 without changes.

The secondary issue is that I feel like the DSL is not that great. I would rather write the above parser like this:

    testp :: GenParser String () (Char, String, String)
    testp =
        do string "foo"
           nextSegment
           st <- char 'h' >> char 'o'
           nextSegment
           sg <- many anyChar
           nextSegment
           sg' <- many anyChar
           return (st, sg, sg')

so that combinators like 'many' would only see to the end of the current segment. nextSegment would bring the next segment into context (or possibly fail if the current segment was not completely consumed).

I could also, perhaps, reimplement anySegment as:

    anySegment =
        do s <- many anyChar
           nextSegment
           return s

and write:

    testp :: GenParser String () (Char, String, String)
    testp =
        do string "foo"
           nextSegment
           st <- char 'h' >> char 'o'
           nextSegment
           sg  <- anySegment
           sg' <- anySegment
           return (st, sg, sg')

I don't see a way to implement this on top of parsec though. Am I missing some clever ideas here?

I have attached my code that implements the first method.
Any ideas how to make it Parsec 2/3 agnostic? The primary issue is that if I use runParser on the inner Char parser, I need some way to stick a ParseError back into the parent context (without just converting it to a String and calling fail). I can see how to get/set the inputState, etc., in a portable way, but I don't see how to set the ParseError in a portable way.

The attached code requires 3.1.0 because it uses mkPT / runParsecT :(

thanks!
- jeremy
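One possible way around the ParseError problem, offered only as a sketch: run the inner Char parser with runParser and, on failure, rebuild the error in the outer parser from its individual messages using unexpected, <?> and fail, after moving the outer position to the inner error position. The attached code is not shown in this thread, so the definitions of segment, anySegment and p2u below are guesses at its behaviour, and nextSegPos and reraise are hypothetical helper names. It sticks to the Text.ParserCombinators.Parsec compatibility modules; I have only reasoned about it against Parsec 3, and the assumption that the same names behave the same way under Parsec 2 is untested.

```haskell
import Text.ParserCombinators.Parsec
import Text.ParserCombinators.Parsec.Error (Message (..), errorMessages)

-- Assumed convention for this sketch: the "line" counts segments and the
-- "column" counts characters, restarting at 1 for each new segment.
nextSegPos :: SourcePos -> String -> [String] -> SourcePos
nextSegPos pos _ _ = setSourceColumn (incSourceLine pos 1) 1

-- Match any whole segment.
anySegment :: GenParser String st String
anySegment = tokenPrim show nextSegPos Just

-- Match one specific segment.
segment :: String -> GenParser String st String
segment s = tokenPrim show nextSegPos (\t -> if t == s then Just t else Nothing)

-- Turn one message of an inner ParseError back into a failing outer parser.
-- None of these consume input, so (<|>) merges their errors.
reraise :: Message -> GenParser tok st a
reraise (SysUnExpect "") = unexpected "end of segment"
reraise (SysUnExpect s)  = unexpected s
reraise (UnExpect s)     = unexpected s
reraise (Expect s)       = pzero <?> s
reraise (Message s)      = fail s

-- Run a Char parser on the next segment. Whatever the inner parser leaves
-- unread in that segment is silently dropped in this sketch.
p2u :: GenParser Char () a -> GenParser String st a
p2u p = do
  pos  <- getPosition
  segs <- getInput
  case segs of
    [] -> unexpected "end of segments"
    (seg : _) ->
      -- Start the inner parser at the outer position so that its column
      -- counts characters within the current segment.
      case runParser (setPosition pos >> p) () (sourceName pos) seg of
        Right a  -> anySegment >> return a
        Left err -> do
          setPosition (errorPos err)
          foldr (<|>) pzero (map reraise (errorMessages err))
```

With these definitions, running segment "foo" followed by p2u (char 'h' >> char 'o') on ["foo", "hi", "there", "world"] should fail with an error positioned at line 2, column 2, carrying the inner "unexpected" and "expecting" messages. Printing that as "segment 2 character 2" would still need a custom renderer for the position.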

Hi Jeremy

Have you considered rolling your own parser? If you don't need, say, the expression parser or the language defs, quite a bit of Parsec's machinery is now standard-ish.

For instance, most of the combinators in Text.ParserCombinators.Parsec.Combinator are general control operators and can be built on top of an applicative functor with Alternative. I posted a set of them to haskell-cafe a few months ago, though later I found a bug in one of them. Ross Paterson has put versions of the permutation combinators that need only Applicative / Alternative on Hackage:

http://hackage.haskell.org/package/action-permutations

If you don't want to go so far, you might still want to make the nextSegment parser higher order. At the moment it looks like you are calling nextSegment to pull a lexeme into some context and using the sequencing of the do-notation to run the next parser on that context. It might be preferable if nextSegment operated more like this:

    h_then_o <- nextSegment (char 'h' >> char 'o')

I'm guessing an implementation something like:

    nextSegment :: GenParser Char st a -> GenParser String st a
    nextSegment p = do
        xs <- anyToken                    -- take the next segment
        let ans = runParserSomehow p xs   -- placeholder: run p on that segment
        return ans

Best wishes

Stephen
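To make the "roll your own" suggestion concrete, here is a minimal sketch of a hand-written segment parser that needs only Applicative and Alternative from base. It is not the code posted to haskell-cafe and not Parsec's implementation; every name in it (Input, SegParser, parseSegments, describe and so on) is made up for illustration, errors are plain strings, and (<|>) backtracks fully.

```haskell
import Control.Applicative (Alternative (..))

-- Parser input: the unread rest of the current segment, the segments after
-- it, and a 1-based position used only for error messages.
data Input = Input
  { current :: String
  , later   :: [String]
  , segNo   :: Int
  , charNo  :: Int
  }

newtype SegParser a = SegParser { runSegParser :: Input -> Either String (a, Input) }

instance Functor SegParser where
  fmap f (SegParser p) = SegParser $ \i -> fmap (\(a, i') -> (f a, i')) (p i)

instance Applicative SegParser where
  pure a = SegParser $ \i -> Right (a, i)
  SegParser pf <*> SegParser pa = SegParser $ \i -> do
    (f, i')  <- pf i
    (a, i'') <- pa i'
    return (f a, i'')

instance Alternative SegParser where
  empty = SegParser $ \i -> Left (describe i ++ ": no parse")
  -- Full backtracking: if the left parser fails, retry from the same input.
  SegParser p <|> SegParser q = SegParser $ \i -> either (const (q i)) Right (p i)

describe :: Input -> String
describe i = "(segment " ++ show (segNo i) ++ " character " ++ show (charNo i) ++ ")"

-- Character parsers never look past the end of the current segment.
satisfy :: (Char -> Bool) -> SegParser Char
satisfy ok = SegParser $ \i -> case current i of
  (c : cs) | ok c -> Right (c, i { current = cs, charNo = charNo i + 1 })
  (c : _)         -> Left (describe i ++ ": unexpected " ++ show c)
  []              -> Left (describe i ++ ": unexpected end of segment")

char :: Char -> SegParser Char
char c = satisfy (== c)

anyChar :: SegParser Char
anyChar = satisfy (const True)

string :: String -> SegParser String
string = traverse char

-- Bring the next segment into context; fails if the current segment has
-- unread characters or if there are no segments left.
nextSegment :: SegParser ()
nextSegment = SegParser $ \i -> case (current i, later i) of
  ("", s : ss) -> Right ((), Input s ss (segNo i + 1) 1)
  ("", [])     -> Left (describe i ++ ": no more segments")
  _            -> Left (describe i ++ ": segment not fully consumed")

parseSegments :: SegParser a -> [String] -> Either String a
parseSegments _ []       = Left "no segments"
parseSegments p (s : ss) = fmap fst (runSegParser p (Input s ss 1 1))

-- The parser from the original question, written in applicative style.
testp :: SegParser (Char, String, String)
testp = (,,) <$> (string "foo" *> nextSegment *> (char 'h' *> char 'o'))
             <*> (nextSegment *> many anyChar)
             <*> (nextSegment *> many anyChar)
```

Because many comes from Alternative and anyChar fails at the end of the current segment, many anyChar stops at the segment boundary without any extra machinery. A Monad instance could be added in the same style if do-notation is wanted.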
participants (2)
- Jeremy Shaw
- Stephen Tetley