
Hi,

I am using Parsec to parse a small programming language. The language is typed, and I need to do some type checking, too. I have decided to do the parsing and type checking simultaneously in my Parsec parser. This approach avoids keeping source code positions in the data type in order to produce suitable error messages during type checking. However, because type errors are usually detected after parsing some code, I need to produce error messages with an earlier source position. Unfortunately, there is no function that produces an error taking a position as a parameter.

I tried the following:

    myFail :: SourcePos -> String -> GenParser tok st a
    myFail pos msg = setPosition pos >> fail msg

This is already a workaround because I am modifying the position in the parser just to produce an error message. But even worse, it does not work. If I use this function as in:

    test :: Either ParseError ()
    test = runParser (char '(' >> myFail (newPos "" 100 100) "Test") () "" "("

the position of the error is still the original position (line 1, column 2). As far as I can tell, setPosition does not take effect until another symbol is read. This could be achieved by simply using anyToken, but only if we are not at the end of the input (as we are in the example above). I came up with the following:

    myFail :: SourcePos -> String -> GenParser Char st a
    myFail pos msg = do
        State toks _ st <- getParserState
        setParserState $ State ('d':toks) pos st
        anyToken
        fail msg

This code works, but it is not nice at all. The function myFail is no longer polymorphic in the type of tokens, since we need an element of this data type in order to add it temporarily to the input ('d' above).

My guess is that one has to enforce strictness at some point in order to make the first approach work, but I was not successful. Any ideas?

Thanks,
Michael

mwinter@brocku.ca wrote:
Hi,
I am using Parsec to parse a small programming language. The language is typed, and I need to do some type checking, too. I have decided to do the parsing and type checking simultaneously in my Parsec parser. This approach avoids keeping source code positions in the data type in order to produce suitable error messages during type checking. However, because type errors are usually detected after parsing some code, I need to produce error messages with an earlier source position. Unfortunately, there is no function that produces an error taking a position as a parameter.
If you already know what position you want to report the error at, then why bother calling setPosition to let Parsec know? Just do:

    fail (show pos ++ ": " ++ msg)

Parsec will then result in a ParseError with its own ideas of location, but you can ignore that.

HTH,
Martijn.

Thanks, but I want a nice solution, not another, even more complicated workaround.

On 5 May 2009 at 17:10, Martijn van Steenbergen wrote:
If you already know what position you want to report the error at, then why bother calling setPosition to let parsec know? Just do:
fail (show pos ++ ": " ++ msg)
Parsec will then result in a ParseError with its own ideas of location, but you can ignore that.
HTH,
Martijn.

On Tuesday, 05 May 2009 17:38:35, mwinter@brocku.ca wrote:
Thanks, but I want a nice solution not another, even more complicated, workaround.
I'm afraid you're out of luck there. Parsec carries a ParseError around even for successful parses (where it's a SourcePos and an empty list of messages). When binding two parsers, if the second doesn't consume any input, for the overall result it calls mergeErrorReply, which calls mergeError from Text.ParserCombinators.Parsec.Error:

    mergeError :: ParseError -> ParseError -> ParseError
    mergeError (ParseError pos msgs1) (ParseError _ msgs2)
        = ParseError pos (msgs1 ++ msgs2)

So that doesn't look at the position of the second error :(

You could change the sources of Parsec; the least intrusive change would probably be to modify mergeError:

    mergeError :: ParseError -> ParseError -> ParseError
    mergeError (ParseError _ []) pe2 = pe2
    mergeError (ParseError pos msgs1) (ParseError _ msgs2)
        = ParseError pos (msgs1 ++ msgs2)

Or you could employ an ugly workaround like

    setPosAndFail :: tok -> SourcePos -> String -> GenParser tok st a
    setPosAndFail dummy pos msg = do
        setPosition pos
        inp <- getInput
        setInput (dummy:inp)
        tokenPrim (const "") (\p _ _ -> p) Just
        fail msg

    myFail :: SourcePos -> String -> GenParser Char st a
    myFail = setPosAndFail 'a'

to pretend you actually consumed some input. It works:

    *TestWorkAround> test
    Left (line 100, column 100):
    Test

but isn't pretty.
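To make the difference between the two mergeError variants concrete, here is a minimal sketch that uses simplified stand-in types rather than Parsec's real ones. The names Pos, mergeOld, mergeNew, afterParen, and typeError are made up for illustration; only the equations mirror the two definitions quoted above.

```haskell
-- Simplified stand-ins for Parsec's types (NOT the library's definitions).
type Pos = (Int, Int)                     -- (line, column)

data ParseError = ParseError Pos [String]
  deriving (Eq, Show)

-- The merge as quoted above: the second error's position is discarded.
mergeOld :: ParseError -> ParseError -> ParseError
mergeOld (ParseError pos msgs1) (ParseError _ msgs2) =
  ParseError pos (msgs1 ++ msgs2)

-- The suggested change: if the first error carries no messages,
-- take the second error as a whole, position included.
mergeNew :: ParseError -> ParseError -> ParseError
mergeNew (ParseError _ []) pe2 = pe2
mergeNew pe1 pe2               = mergeOld pe1 pe2

-- The "empty" error left behind by a successful char '(' ...
afterParen :: ParseError
afterParen = ParseError (1, 2) []

-- ... and the failure raised afterwards at an explicitly set position.
typeError :: ParseError
typeError = ParseError (100, 100) ["Test"]
```

In this model, mergeOld afterParen typeError keeps position (1, 2), which matches the behaviour Michael observed, while mergeNew afterParen typeError keeps (100, 100).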

Hi,

When we needed to do something similar with Parsec, we chose to pack the relevant source position into the error string (you can just use Show/Read, plus a special sequence of characters to indicate where the position ends and the real error starts). We then unpack it outside runParser before issuing the error to the user. If there's no packed position found, we just use the Parsec position.

Thanks,
Neil.

mwinter@brocku.ca wrote:
Hi,
I am using Parsec to parse a small programming language. The language is typed, and I need to do some type checking, too. I have decided to do the parsing and type checking simultaneously in my Parsec parser. This approach avoids keeping source code positions in the data type in order to produce suitable error messages during type checking. However, because type errors are usually detected after parsing some code, I need to produce error messages with an earlier source position. Unfortunately, there is no function that produces an error taking a position as a parameter.
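The pack-and-unpack approach Neil describes could be sketched roughly as follows. This is an illustration under assumptions, not the code Neil's project used: the marker string and the names failAt, unpackError, and demo are hypothetical, and the sketch is written against the Parsec 3 module names (Text.Parsec). It serialises the position as a (name, line, column) tuple, because a plain tuple can be read back with reads, and it uses a leading marker instead of a trailing delimiter, since reads already returns the rest of the string.

```haskell
import Data.List (stripPrefix)
import Text.Parsec
import Text.Parsec.Error (errorMessages, messageString)
import Text.Parsec.Pos (newPos)

-- Hypothetical marker that tags a message as carrying a packed position.
marker :: String
marker = "@@POS@@"

-- Fail with the position packed into the message text.
failAt :: SourcePos -> String -> Parsec String u a
failAt pos msg =
  fail (marker ++ show (sourceName pos, sourceLine pos, sourceColumn pos) ++ msg)

-- Outside the parser: recover the packed position and the real message.
-- Nothing means no packed position was found, so the caller should fall
-- back to errorPos and the ordinary Parsec error.
unpackError :: ParseError -> Maybe (SourcePos, String)
unpackError err =
  case [ (newPos name line col, rest)
       | m <- errorMessages err
       , Just packed <- [stripPrefix marker (messageString m)]
       , ((name, line, col), rest) <- reads packed
       ] of
    (hit : _) -> Just hit
    []        -> Nothing

-- Same shape as the test in the original question.
demo :: Either ParseError ()
demo = parse (char '(' >> failAt (newPos "src" 100 100) "Test") "" "("
```

A caller would then pattern match on Left err, try unpackError err first, and otherwise report errorPos err together with the usual message. The position Parsec itself attaches to the error is simply ignored whenever a packed one is present.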
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
participants (4)
- Daniel Fischer
- Martijn van Steenbergen
- mwinter@brocku.ca
- Neil Brown