
I'm writing a small tool that analyses Haddock comments in Haskell source files, so that I can flag potential documentation breakage in existing code. Currently I do the parsing with GHC's ‘parser’ function with Opt_Haddock set, and I filter out everything I don't need.

The problem with this approach is that GHC has to know which extensions and options a file uses before it can parse it correctly, and a large number of source files outright fail to parse without this information.

Fortunately, haskell-src-exts exists and can deal with all (or most) of this for me. Unfortunately, there doesn't seem to be any way to get it to recognise Haddock comments: the only choices are all comments or no comments at all. It's not easy to stitch the comments and the syntax tree back together without analysing the whole parse result.

Currently I'm thinking of using haskell-src-exts to extract the extensions and pragmas in use and then feeding those to GHC, which effectively means parsing each file a second time. Is there a way to avoid this? There's ‘lexTokenStream’, but I believe it has the same problem as ‘parser’: it needs to know the extensions beforehand.

-- Mateusz K.
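
To make the "collect the extensions first, then hand them to GHC" idea concrete, here is a minimal sketch of the first pass only. It is a naive textual scan written against base alone, not the haskell-src-exts or GHC API, and the names (languageExtensions, extensionsOnLine and the helpers) are made up for illustration. It assumes every LANGUAGE pragma fits on a single line, and it does not look at OPTIONS_GHC pragmas or at extensions enabled in a .cabal file.

```haskell
import Data.Char (isSpace, toUpper)
import Data.List (dropWhileEnd, isPrefixOf, stripPrefix)

-- | Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- | Split a string on commas.
splitOnComma :: String -> [String]
splitOnComma s = case break (== ',') s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOnComma rest

-- | Keep only the text before the closing "#-}" (or everything if absent).
beforeClose :: String -> String
beforeClose [] = []
beforeClose s@(c : cs)
  | "#-}" `isPrefixOf` s = []
  | otherwise            = c : beforeClose cs

-- | Extension names on one line, if that line is a one-line LANGUAGE pragma.
-- Assumption: multi-line pragmas are not handled.
extensionsOnLine :: String -> [String]
extensionsOnLine line =
  case stripPrefix "{-#" (trim line) of
    Just rest
      | (keyword, body) <- break isSpace (dropWhile isSpace rest)
      , map toUpper keyword == "LANGUAGE"
      -> filter (not . null) (map trim (splitOnComma (beforeClose body)))
    _ -> []

-- | All extension names mentioned in one-line LANGUAGE pragmas of a source
-- text. Every line is scanned, not just the module header.
languageExtensions :: String -> [String]
languageExtensions = concatMap extensionsOnLine . lines
```

Loaded in GHCi, applying languageExtensions to the result of readFile on a source file gives the list of extension names as strings. Turning those names into the flags GHC expects before running the parser with Opt_Haddock is the second half of the plan and is not shown here.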