
2010/10/26 Claus Reinke
Some questions about Haddock usage:
1. The Haddock executable and library are a single Hackage package, but GHC seems to include only the former (haddock does not even appear as a hidden package anymore). Is that intended?
Yes, I think that's so that GHC maintainers don't need to worry about API changes in Haddock when making new releases. The Haddock API is not very stable.
2. Naively, I'd expect Haddock processing to involve three stages:
   1. extract information for each file/package
   2. mix and match information batches for crosslinking
   3. generate output for each file/package
I would then expect .haddock interface files to represent the complete per-package information extracted in step 1, so that packages with source can be used interchangeably with packages with .haddock files.
However, I can't seem to use 'haddock --hoogle', say, with only .haddock interface files as input ("No input file(s).").
Haddock currently mostly works on GHC's front-end AST, called HsSyn, which is not stored in the .haddock files, so that's why you need sources. I say mostly, because the one-year-old feature that we call cross-package documentation (allowing the user to re-export documentation from other packages) is implemented by taking information from GHC's .hi files and converting it to HsSyn. The syntax used in the .hi files is slightly less detailed than HsSyn, so we lose some information about the exact declaration syntax used by the programmer (brackets in type expressions, infix/prefix declaration styles, etc.; nothing that is semantically relevant). In theory we could continue along that road and let you build output from a combination of .haddock and .hi files. Or we could do as you say and just put everything in the .haddock files (in which case we could use the HsSyn type).
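To make the second option concrete, here is a minimal sketch of the three-stage split from the question, in which the stage 1 output is complete enough that a backend never needs the sources. All types and function names below are hypothetical and are not part of the Haddock API, and `show`/`reads` merely stand in for the real binary .haddock format.

```haskell
import Data.Char (isSpace)

-- Hypothetical stage 1 output: everything a backend needs about one module.
-- The real .haddock format stores different (and binary) data.
data Interface = Interface
  { ifaceModule :: String
  , ifaceDecls  :: [(String, String)]  -- (exported name, rendered signature)
  } deriving (Show, Read, Eq)

-- Stage 1: build an interface from information extracted from sources.
extract :: String -> [(String, String)] -> Interface
extract = Interface

-- Stand-ins for writing and reading an interface file.
serialise :: Interface -> String
serialise = show

deserialise :: String -> Maybe Interface
deserialise s = case reads s of
  [(iface, rest)] | all isSpace rest -> Just iface
  _                                  -> Nothing

-- Stage 2: a link environment mapping each exported name to its module.
type LinkEnv = [(String, String)]

linkEnv :: [Interface] -> LinkEnv
linkEnv ifaces =
  [ (name, ifaceModule i) | i <- ifaces, (name, _) <- ifaceDecls i ]

-- Stage 3: a Hoogle-like text backend that only looks at the interface.
renderText :: Interface -> [String]
renderText i =
  ("module " ++ ifaceModule i)
    : [ name ++ " :: " ++ sig | (name, sig) <- ifaceDecls i ]
```

Because `renderText` takes an `Interface` and nothing else, it behaves the same whether that value came straight from `extract` or from `deserialise` applied to a stored file, which is the interchangeability the question asks for.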
3. It would be nice if the Haddock executable was just a thin wrapper over the Haddock API, if only to test that the API exposes sufficient functionality for implementing everything Haddock can do.
Yes, good idea. We haven't done that yet since the API started out as something quite experimental, and it's still in that stage although it has gained a lot more functionality recently.
Instead, there is an awful lot of useful code in Haddock's Main.hs, which is not available via the API. So when coding against the API, for instance, to extract information from .haddock files, one has to copy much of that code.
Also, some important functionality isn't exported (e.g., the standard form of constructing URLs), so it has to be copied and kept in sync with the in-Haddock version of the code.
Right. We should export that.
It might also be useful to think about the representation of the output of stage 2 above: currently, Haddock directly generates indices in XHtml form, even though much of the index computation should be shareable across backends. That is, current "backends" seem to do both stage 2 and stage 3, with little reuse of code for stage 2.
True. The index could be factored out of the Xhtml backend and added to the output of stage 2.
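As a rough sketch of what that factoring could look like: the index is computed once as plain data, and each backend only renders it. The types and function names are hypothetical, not taken from Haddock's code, and HTML escaping is left out for brevity.

```haskell
import Data.Function (on)
import Data.List (groupBy, intercalate, sort)

-- Hypothetical input: the names exported by each module.
type Exports = [(String, [String])]   -- (module, exported names)

-- Backend-independent index: each name with the modules that export it.
type Index = [(String, [String])]     -- (name, modules)

-- Stage 2: compute the index once, sorted by name and then by module.
buildIndex :: Exports -> Index
buildIndex exports =
  [ (name, map snd grp)
  | grp@((name, _) : _) <- groupBy ((==) `on` fst) (sort pairs)
  ]
  where
    pairs = [ (name, modName) | (modName, names) <- exports, name <- names ]

-- Stage 3, plain-text backend: one line per indexed name.
renderIndexText :: Index -> String
renderIndexText idx =
  unlines [ name ++ ": " ++ intercalate ", " mods | (name, mods) <- idx ]

-- Stage 3, HTML backend: the same index rendered as a list.
-- Escaping of special characters is deliberately omitted in this sketch.
renderIndexHtml :: Index -> String
renderIndexHtml idx =
  "<ul>"
    ++ concat [ "<li>" ++ name ++ " (" ++ intercalate ", " mods ++ ")</li>"
              | (name, mods) <- idx ]
    ++ "</ul>"
```

Both renderers consume the same `Index` value, so a new backend would only need its own rendering function and could reuse `buildIndex` unchanged.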
It seems that by exposing sufficient information in the API, and allowing .haddock interface files as first-class inputs, there should be less need for hardcoding external tools into Haddock (such as --hoogle, or haddock-leksah). Instead, clients should be able to code alternative backends separately, using Haddock to extract information from sources into .haddock files, and the API for processing those .haddock files. Are these expectations reasonable, or am I misreading the intent behind the API and .haddock files? Is there any documentation about the role and usage of these two Haddock features, as well as the plans for their development?
No documentation yet, but yes, the long-term plan is to be able to split Haddock into parts: one program that creates data from sources, probably resulting in a .haddock file or maybe something text-based, and backends that use those files. The API should provide a convenient way to read the files. It's not been fleshed out in detail yet, and the API is quite ad hoc at the moment, so we need to think more about this and write documentation on the Haddock Trac.
Thanks for the input!
David