Hi all,

I have a few questions about the GHC lexer. I'm currently using the Glasgow Haskell Compiler, version 6.13.20100320, for Haskell 98, stage 2 booted by GHC version 6.12.1.

When invoking the lexer I get a few unexpected results that I was hoping someone could clarify.

 

The first has to do with braces. Opening braces are lexed fine, but closing ones produce an error. I assume the lexer is keeping some kind of state and trying to do brace matching? (But isn't that supposed to be done by the parser?)

 

?parseLine("{")->tag
cStatelessParseResultSOk
?parseLine("}")->tag
cStatelessParseResultSFailed

and

?parseLine("{-")->tag
cStatelessParseResultSOk
?parseLine("-}")->tag
cStatelessParseResultSFailed
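To make my guess about brace matching concrete, here is a toy sketch of the behaviour I imagine. It is not GHC's lexer and uses none of the GHC API: the names Tok and toyLex are made up for illustration, and it only models plain braces, not the "{-" / "-}" comment case.

```haskell
-- Toy model only: this is NOT GHC's lexer, just an illustration of the guess above.
module Main where

-- | Hypothetical token type for the sketch.
data Tok = OpenBrace | CloseBrace | Other Char
  deriving (Eq, Show)

-- | Lex a string while tracking how many explicit braces are currently open.
--   A '}' with nothing open is reported as a failure, the way a lexer
--   that keeps a context stack might behave.
toyLex :: String -> Either String [Tok]
toyLex = go 0
  where
    go :: Int -> String -> Either String [Tok]
    go _ []         = Right []
    go n ('{' : cs) = (OpenBrace :) <$> go (n + 1) cs
    go n ('}' : cs)
      | n > 0       = (CloseBrace :) <$> go (n - 1) cs
      | otherwise   = Left "unmatched closing brace"
    go n (c : cs)   = (Other c :) <$> go n cs

main :: IO ()
main = do
  print (toyLex "{")   -- an unclosed '{' is accepted at the lexing stage
  print (toyLex "}")   -- a lone '}' fails because no brace is open
  print (toyLex "{}")  -- a matched pair is accepted
```

If the real lexer tracks open contexts in a similar way, that would explain why "{" alone succeeds while "}" alone fails, but I have not confirmed that this is what actually happens.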

 

Second, in a statement such as "let foo = ..", "foo" isn't lexed as a single identifier: "f" is lexed as an ITvocurly and "oo" as an identifier. I find this rather odd. Maybe I'm doing something wrong, but I doubt it.

 

The original Haskell code I call through the FFI is:

 

lexSourceString :: String -> IO (StatelessParseResult [Located Token])
lexSourceString source =
 do
   buffer <- stringToStringBuffer source
   let srcLoc  = mkSrcLoc (mkFastString "internal:string") 1 1
   let dynFlag = defaultDynFlags
   let result  = lexTokenStream buffer srcLoc dynFlag
   return $ convert result

 

-- | Convert the built-in ParseResult to our custom StatelessParseResult,
--   mainly because the state is difficult to marshal and we don't really seem to need it.
convert :: ParseResult a -> StatelessParseResult a
convert (POk _ ty)        = SOk ty
convert (PFailed src msg) = SFailed src (showSDocDump msg)