
On 31/07/13 06:37, Roman Cheplyaka wrote:
> Hi Mateusz,
>
> This looks great! I'm especially excited about "List entries no longer have to be separated by empty lines"!

Glad to hear that.

> However, the decision to use Attoparsec (instead of Parsec, say) strikes me as a bit odd,

Parsec has a dependency on Data.Text that you can't easily get rid of. With Attoparsec, I was able to simply get rid of the modules I was not interested in (anything with Text) and only keep the ByteString part.

> as it wasn't intended for parsing source code.

We're not parsing source code. As I mention, we get comment content out from GHC and parse the markup there.
> In particular, I'm concerned with error messages this parser would produce.

Currently there exist only two error messages: one for when module header parsing fails and another for when parsing of anything else fails. The parsing functions have the type ‘DynFlags -> String -> Maybe (Doc RdrName)’, and if we get Nothing back then you get a generic error message and no guidance. This is also the current behaviour.
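To make the point about the Maybe-shaped type concrete, here is a minimal sketch of why such a result can only ever lead to one generic message. Everything in it is a simplified stand-in invented for illustration: `Doc` is not Haddock's real type, `parseComment` and `reportParse` are hypothetical names, the `DynFlags` argument is dropped, and the failure condition is made up just so that `Nothing` can occur.

```haskell
-- Simplified stand-in for Haddock's Doc type (assumption, not the real one).
newtype Doc = DocString String
  deriving (Eq, Show)

-- Hypothetical parser with the Maybe-shaped result discussed above.
-- The failure condition is invented purely so that Nothing can occur.
parseComment :: String -> Maybe Doc
parseComment s
  | null s    = Nothing
  | otherwise = Just (DocString s)

-- With Maybe, the caller learns only that parsing failed, not why,
-- so a single fixed message is all it can report.
reportParse :: String -> Either String Doc
reportParse s = maybe (Left "doc comment parse error") Right (parseComment s)
```

Switching the result to something like `Either String` would be the obvious way to carry a reason upwards, at the cost of changing every caller.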
Now, I agree that this sounds horrible BUT in actuality, there's not much information we could ever give. This isn't a case of being unable to do so: I could simply add a (<|> fail "error message") to the relevant parts and it would get propagated up. The reason I said that this isn't much of a problem is that there are very few cases where parsing can actually fail: in most cases, if your markup isn't valid semantically, it's probably still valid syntactically as a regular string. I mention in my post that we will now accept a slightly wider range of syntax. In the past, this:

    some text
    exampleExpression result

would fail and you would get the unhelpful error messages. With the new parser this will simply be accepted as a regular string. In fact, I can't think of a comment that would result in a parse error with the new parser. Just to check, I ran 500 randomly generated strings using QuickCheck through each of the two parsing functions exposed to the rest of the program, and none of them caused a parse error. It's up to the developer to visually inspect that their markup produced what they wanted – we can't read minds (and frankly, the rules are fairly simple).
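The "falls back to a regular string" behaviour can be sketched in a few lines. This is not the Haddock parser: it uses `<|>` on plain `Maybe` from base rather than Attoparsec's `Parser`, the `Doc` type is a cut-down stand-in, and `emphasis` and `parseMarkup` are hypothetical names. It only shows why a parser whose last alternative accepts any string cannot fail.

```haskell
import Control.Applicative ((<|>))

-- Cut-down stand-in for a documentation AST (assumption for illustration).
data Doc
  = DocEmphasis String
  | DocString String
  deriving (Eq, Show)

-- Recognise exactly one /emphasised/ span covering the whole input.
-- Anything else is "not emphasis" rather than an error.
emphasis :: String -> Maybe Doc
emphasis ('/' : rest)
  | (body, "/") <- break (== '/') rest
  , not (null body) = Just (DocEmphasis body)
emphasis _ = Nothing

-- Try the markup rule first; if it does not apply, keep the text as a
-- plain string. The final alternative always succeeds, so the result
-- is never Nothing.
parseMarkup :: String -> Maybe Doc
parseMarkup s = emphasis s <|> Just (DocString s)
```

For example, `parseMarkup "/unclosed"` yields `Just (DocString "/unclosed")`: the markup is not what the author meant, but it is still a perfectly good string.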
> Roman
>
> * Mateusz Kowalczyk [2013-07-30 23:35:45+0100]
>> Greetings cafe,
>>
>> As some of you might know, I'm hacking on Haddock as part of Google Summer of Code. I was recently advised to create a blog and document some of what I have been doing recently. You can find the blog at [1] if you're interested. The first post goes over the work from the last month or so. Future posts should be shorter and on more specific topics. There's an overview of what has happened/changed/will change at the bottom of the post if you're short on time.
Thanks.
I would also like to take this opportunity to say that there is one more change that I forgot to mention. Obviously invalid strings between double quotes will no longer be treated as module names and blindly linked. The check is purely syntactic, so hyperlinks will still be created for syntactically valid module names that might not actually exist.

-- 
Mateusz K.
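As a rough illustration of what a syntax-only check on a quoted string could look like, here is a small sketch. It is an approximation of Haskell's module-name syntax (dot-separated components, each starting with an uppercase letter), not the check Haddock actually performs, and `isModuleNameSyntax` and `splitOnDots` are hypothetical names.

```haskell
import Data.Char (isAlphaNum, isUpper)

-- Split a string on '.' characters, keeping empty components so that
-- inputs such as "Data." or "Data..Map" can be rejected.
splitOnDots :: String -> [String]
splitOnDots str = case break (== '.') str of
  (part, [])       -> [part]
  (part, _ : rest) -> part : splitOnDots rest

-- True when every dot-separated component starts with an uppercase
-- letter and continues with letters, digits, underscores or primes.
-- This looks only at syntax; it says nothing about whether the module exists.
isModuleNameSyntax :: String -> Bool
isModuleNameSyntax s = all validPart (splitOnDots s)
  where
    validPart (c : cs) = isUpper c && all identChar cs
    validPart []       = False
    identChar c = isAlphaNum c || c == '_' || c == '\''
```

Under this sketch, "Data.Map" and "No.Such.Module" both pass, while "hello world" does not, which matches the behaviour described above: syntactically valid names get linked whether or not the module exists.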