
Hi all,

It seems as if the discussion has got going and that there are plenty of ideas around. Good!

I'll get back with something more substantial in a bit. Trying to structure what's been discussed so far, so as to get a clearer picture of the options, is probably a good thing, and I also think we should focus a bit more on the goal and design principles before getting too involved in the details of different formats and such.

Anyway, just a remark on what Simon Marlow wrote:

> If I understand correctly, I think you were proposing a two stage
> process to get the documentation (similar to the Eiffel approach?):
>
>   Haskell source --> interface ---> on-line documentation
>                                `--> printed documentation
>                                .....
>
> Why not do it in one?
>
>   Haskell source ---> on-line documentation
>                  `--> printed documentation

I don't know if "you" above referred to me, but that was indeed part of what I suggested at the HI workshop.

I do not think that the staged approach is absolutely essential: if we could just come up with a good standard for how to write embedded documentation in Haskell, that would be extremely valuable in its own right. But I think there are a number of really compelling reasons for taking the staged approach and also standardizing on the "raw", machine-readable documentation format which would be the output from the first stage.

One reason is that there are potentially a large number of different formats in which one might want to generate documentation. Some might be of general interest, some might have more limited scope. One can even imagine completely different applications in which it is important to have access to "documentation level" information about source code, for example various search tools. Given a carefully designed "raw" format as a starting point, it is fairly easy to write such tools.

There is evidence of this from the past: both the Fudgets documentation tool and the HaskellDoc tool used HBC-generated interface files as their "raw" format to good effect. But this also meant that these tools became system specific, and they were also limited by what HBC (OK, Lennart) happened to record in the interface files. E.g. for HaskellDoc, this meant that there are only hyperlinks to entire source code files, not to individual functions. (OK, that might be a limitation of HTML as well.) A more recent example, of course, is Jan's source code browser.

Thus I view the "raw" format as a way of getting the benefits from creative use of interface files, while avoiding getting tied to and restricted by some particular Haskell compiler.

One could argue that given a freely available, easy-to-use, documentation-extracting Haskell parser, the above is a non-issue. Just incorporate that code into your own.
But I can see that approach leading to a number of maintenance problems which a well-specified and extensible file format would insulate against. Also, some people might prefer to write their documentation generator in something other than Haskell (assuming that the above-mentioned parser is a piece of Haskell code).

Another reason why I like the staged approach is that a Haskell compiler conceivably could perform the first stage. Now, I know that this was not to everyone's liking. And I'm absolutely not saying that we should in any way require a Haskell compiler to do this work, or that we should rule out stand-alone tools. But I think there are quite a few good reasons why one might want to do it that way, and thus I believe it is good if the documentation standard is such that it is possible to do so.

Thus I'm saying that I think there should be two related parts of the standard: one for how to write documentation embedded in Haskell source, and one for representing collected documentation in a stand-alone, machine-friendly format. I'm not saying that we should require tools to work in a staged manner. E.g. I can easily see an augmented HDoc emitting the raw format for the benefit of other tools, as well as keeping its current HTML-emitting capabilities.

[Incidentally, note that documentation-extracting compilers are not a new thing. For example, I know of two different C/C++ compilers which supported source code browsing by extracting information from source code and storing it in special files.]

Finally, in defining the "raw" format, we will have to decide on exactly what Haskell documentation essentially is. I think that will be very helpful during the standardization process. I also think this will be very helpful later for people writing document formatting tools. Of course, there are many ways to achieve this. But specifying the context-free syntax of "raw" documentation seems to me to be a good way.
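
To make the staged idea a little more concrete, here is a minimal sketch of what I have in mind. Everything in it is an assumption made purely for illustration: the types `ModuleDoc` and `DocEntry`, the functions `writeRaw`, `readRaw`, `renderHtml` and `search`, and in particular the use of derived `Show`/`Read` as a stand-in for a real, properly specified raw format. The point is only that once the first stage has produced the raw data, quite different tools (an HTML generator, a search tool) can consume it without ever parsing Haskell source themselves.

```haskell
import Data.List (isInfixOf)
import Text.Read (readMaybe)

-- One documented top-level entity, as a first-stage tool might record it.
-- Hypothetical: not part of any agreed standard.
data DocEntry = DocEntry
  { entryName :: String  -- identifier, e.g. "double"
  , entryType :: String  -- type signature rendered as text
  , entryDoc  :: String  -- the embedded documentation text
  } deriving (Show, Read, Eq)

-- The collected "raw" documentation for one module (also hypothetical).
data ModuleDoc = ModuleDoc
  { moduleName :: String
  , entries    :: [DocEntry]
  } deriving (Show, Read, Eq)

-- Stage 1 output: serialise the raw documentation.
-- Derived Show is only a placeholder for a real exchange format.
writeRaw :: ModuleDoc -> String
writeRaw = show

-- Any second-stage tool starts by reading the raw format back in.
readRaw :: String -> Maybe ModuleDoc
readRaw = readMaybe

-- Stage 2, backend A: a very small HTML generator.
renderHtml :: ModuleDoc -> String
renderHtml doc =
  "<h1>" ++ escape (moduleName doc) ++ "</h1>\n<dl>\n"
    ++ concatMap entry (entries doc)
    ++ "</dl>\n"
  where
    entry e =
      "<dt><code>" ++ escape (entryName e) ++ " :: "
        ++ escape (entryType e) ++ "</code></dt>\n"
        ++ "<dd>" ++ escape (entryDoc e) ++ "</dd>\n"

-- Escape the characters that are special in HTML text.
escape :: String -> String
escape = concatMap esc
  where
    esc '<' = "&lt;"
    esc '>' = "&gt;"
    esc '&' = "&amp;"
    esc c   = [c]

-- Stage 2, backend B: a trivial search tool that returns the names of
-- all entries whose documentation mentions the keyword (case-sensitive).
search :: String -> ModuleDoc -> [String]
search keyword doc =
  [ entryName e | e <- entries doc, keyword `isInfixOf` entryDoc e ]

-- A made-up module used only to exercise the functions above.
example :: ModuleDoc
example = ModuleDoc "Data.Example"
  [ DocEntry "double" "Int -> Int" "Multiply the argument by two."
  , DocEntry "greet" "String -> String" "Prefix a name with a greeting."
  ]
```

Loaded into an interpreter, `readRaw (writeRaw example)` should give back `Just example`, and both `renderHtml` and `search` work from that value alone. A tool written in some other language would need only a reader for the raw format, which is exactly why the format itself ought to be specified carefully.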

Assuming that there will be such a thing as the raw format, the question is then what that format should look like. Simon suggested that it should look like Haskell + pragma-style comments. This is certainly an interesting idea with a number of merits. On the other hand, XML is gaining widespread acceptance as a standard on which various exchange formats are based. This means that there are quite a few tools out there that might be put to good use, including at least one Haskell library, as Armin mentioned. Also, the fact that XML currently IS used for things similar to what we'd like to do might mean that we can avoid a number of pitfalls by sticking to the standard.

But as I said earlier, detailed format discussions can probably wait a little.

Best regards,

/Henrik

--
Henrik Nilsson
Yale University
Department of Computer Science
nilsson@cs.yale.edu