Parsing indentation-based languages with Parsec

Hi, first time list poster :)

I've searched around a bit but haven't been able to find any examples of this. I want to be able to parse a language (such as Haskell or Python) which has only EOL as the 'statement' separator and uses indentation levels to indicate block structure. Whilst doing this I want to use Parsec's nice library.

The first thing I noticed was that Parsec's whiteSpace parser will ignore EOL as just whitespace, so I need to redefine that. Is this the correct way to do it? I've only been using Haskell for a week or so, so I'm not too sure about record structures and updating them...

    lexer :: P.TokenParser ()
    lexer = ( P.makeTokenParser emptyDef
                { commentLine     = "#",
                  nestedComments  = True,
                  identStart      = letter,
                  identLetter     = letter,
                  opStart         = oneOf "+*/-=",
                  opLetter        = oneOf "+*/-=",
                  reservedNames   = [],
                  reservedOpNames = [],
                  caseSensitive   = False
                } )
            { -- update lexer fields
              P.whiteSpace = do -- just gobble spaces
                many (char ' ')
                return ()
            }

(I got the basic code from the tutorial contained within the Parsec docs.)

For handling the indented blocks I thought I would use something to hold the current indentation state, as Parsec has support for threading state through all the parsers. Is this the right way to go about this? Has anyone done the 'groundwork' for parsing such languages so I don't need to reinvent it?

Thanks in advance,
- porges.
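
To make the state-threading idea above concrete, here is a minimal sketch, not a worked-out solution. It assumes Parsec 3 (the Text.Parsec module) and a made-up toy grammar: one name per line, where a trailing ':' opens a block that must be indented further than its parent. The names Stmt, Parser, stmt, program and parseProgram are hypothetical, chosen only for illustration. The user state is a single Int holding the indentation of the block currently being parsed. Tabs, blank lines and comments are deliberately not handled.

```haskell
import Text.Parsec
import Control.Monad (void, when)

-- Hypothetical toy AST: a plain name, or a name followed by ':' and a block.
data Stmt = Simple String | Block String [Stmt]
  deriving (Show, Eq)

-- The user state is the indentation (in spaces) of the current block.
type Parser = Parsec String Int

-- Count the leading spaces of a line. Newlines are never skipped here.
indentation :: Parser Int
indentation = length <$> many (char ' ')

-- One statement, which must start at exactly the current indentation.
stmt :: Parser Stmt
stmt = do
  cur <- getState
  -- 'try' lets a dedented (or over-indented) line fail without consuming
  -- input, so the enclosing 'many' can stop cleanly at the end of a block.
  try (count cur (char ' ') *> notFollowedBy (char ' '))
  name <- many1 letter
  isBlock <- option False (True <$ char ':')
  skipMany (char ' ')          -- trailing spaces only, EOL stays significant
  void newline <|> eof
  if isBlock
    then do
      n <- lookAhead indentation
      when (n <= cur) (parserFail "expected an indented block")
      putState n               -- enter the nested block
      body <- many1 stmt
      putState cur             -- restore the parent's indentation
      return (Block name body)
    else return (Simple name)

program :: Parser [Stmt]
program = many stmt <* eof

-- Run the parser with an initial indentation of 0.
parseProgram :: String -> Either ParseError [Stmt]
parseProgram = runParser program 0 "<input>"
```

For example, parseProgram "a\nb:\n  c\n  d\ne\n" should yield three top-level statements, with c and d nested under b, while "a:\nb\n" is rejected because the block after a is not indented. A fuller version would probably keep a stack of indentation levels rather than a single Int, and would need a policy for tabs and blank lines.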
participants (1)
-
George Pollard