restarting the discussion

malcolm-hs＠cs.york.ac.uk

8 Feb 2001 8 Feb '01

3:54 p.m.

Hello to everyone who has joined the HaskellDoc mailing list. We had a little bit of discussion before announcing the list more widely, but everything now seems to have stopped dead. So it's time to get thoughts rolling again. Do check the list archive on haskell.org to see what has already happened. I'll start by declaring my interest in automatic documentation. I have a proposal (not yet published, sorry, but coming soon) to help organise the Haskell libraries, standard and non-standard. But libraries are no good if you don't know what they contain. Not everyone is either willing or capable to write high-quality documentation for their libraries. Hence, a tool which automatically derives some kind of documentation directly from the source code would be extremely useful as a starting point for exploring a new library. When I first saw Armin Groesslinger's HDoc, I got very excited because it generates very useful summary information about a library, without the programmer having to add a single annotation whatsoever. Or at least that's how it appears at first glance. My immediate aim is to have a tool that produces browsable documentation to be read by the user of a module, not necessarily an author or developer (i.e. external docs, not internal docs). Ideally, I want to see * what useful datatypes are exported; * the type signature of every exported function; * some textual description of each datatype; * some textual description of each exported function; Now obviously if (as all good programmers do :-) there are type signatures in the source code, it is relatively easy to extract them with a tool. Textual descriptions can also be extracted from the programmer's comments. But the programmer will write many kinds of comment - not all are intended for users - many describe the implementation, which is of no interest to the mere user. So we need to distinguish different styles of comments. This can be done lexically or positionally, and we need to agree a standard. I have another big requirement. The source code must remain readable as source code. I absolutely loathe so-called "literate programming" style, because it breaks this rule horribly. From my point of view, any new documentation standard must be as non-intrusive as possible. This almost immediately rules out XML-style tags I'm afraid. Other (less heavy) lexical conventions might be OK though, provided there are only a small number of them to be learnt. I am intrigued by the use of positional cues (e.g. a comment just before or just after a type signature) as a clever way of associating documentation with code, whilst avoiding extra syntax. For those who have not yet looked at Armin's HDoc, can I encourage you to do so, as a concrete example of how some of these ideas have been put into practice. He develops a "special" kind of comment, introduced by {--- rather than {-, and has some small lexical conventions that help to generate nice hyperlinked HTML. Also, look at Jan Skibinski's tool which captures comments by their position with respect to a signature. Both are linked from the haskell.org page on libraries and tools. Regards, Malcolm

Show replies by date

Jan Skibinski

8 Feb 8 Feb

4:50 p.m.

On Thu, 8 Feb 2001 malcolm-hs@cs.york.ac.uk wrote: ....

...

My immediate aim is to have a tool that produces browsable documentation to be read by the user of a module, not necessarily an author or developer (i.e. external docs, not internal docs).

A good tool can produce either thing equally well. See below. ....

...

But the programmer will write many kinds of comment - not all are intended for users - many describe the implementation, which is of no interest to the mere user. So we need to distinguish different styles of comments. This can be done lexically or positionally, and we need to agree a standard.

My opinion is that there should be only one style of comment, the good and important comments. They can, of course belong to different entities, such as functions, classes, methods, datatypes, but all of them should be Good. If an entity is implementation specific then do not export it. But if you wish to make some special implementation notes in the exported entity then make them somewhere in the body of your code, not around the entity header, so the comment extractor will never touch it. But if you do write the comment around the header then be prepared to see it in documentation. The pressure is on you to do it right. You should be able to guarantee readability of the resulting interface.

...

I have another big requirement. The source code must remain readable as source code. I absolutely loathe so-called "literate programming" style, because it breaks this rule horribly. From my point of view, any new documentation standard must be as non-intrusive as possible. This almost immediately rules out XML-style tags I'm afraid. Other (less heavy) lexical conventions might be OK though, provided there are only a small number of them to be learnt.

So you, Henrik and I agree on this point. I have been advocating it quite strongly right here. (BTW, I do write a lot of HTML-based "literate" tutorials but they are not meant as libraries. Libraries should be readable and easy to maintain).

...

I am intrigued by the use of positional cues (e.g. a comment just before or just after a type signature) as a clever way of associating documentation with code, whilst avoiding extra syntax.

For those who have not yet looked at Armin's HDoc, can I encourage you to do so, as a concrete example of how some of these ideas have been put into practice. He develops a "special" kind of comment, introduced by {--- rather than {-, and has some small lexical conventions that help to generate nice hyperlinked HTML.

Also, look at Jan Skibinski's tool which captures comments by their position with respect to a signature. Both are linked from the haskell.org page on libraries and tools.

I have been working very hard to provide some working examples of interfaces that have all of the above features outlined by you. I was hoping that they will help with this discussion. However, few cared to check them out, which disappoints me greatly. In addition, those who have tried must have misunderstood my links since the have not found any of the samples - which is shown in the logs of my website. So here is the careful explanation of those links, or rather the links to new and improved interface samples. ------------------------------------------------------ They are in a temporary location and they may be removed one day. I do not wish them to be indexed from this email message, therefore I provide the information in two stages: 1. The directory is http://www.numeric-quest.com/haskell/ This directory is already indexed, which should be. 2. Append to the above one of the following filenames: Extractor.short.html Extractor.long.html InterfacePrinter.short.html InterfacePrinter.long.html That's all. ----------------------------------------------------- Now, here is some explanation. They are the actual interfaces to a tool I am working on. The tool reads Haskell source code, extracts structure of a module (this is done by module Extractor) and then prints the interface (done by InterfacePrinter) in one of two formats (so far): Ascii or HTML. All the above samples are HTML-"preformatted". I have defined three types of interfaces so far: Short - listing exported features only ("external" iface) Long - listing all the features ("internal" iface) Coded - Long, but with source code included (prettyfied source code) The modules are still under development, and they do not do yet everything I want them to do. There are some unfinished portions of the code; for example, the printer does not handle the classes properly yet, there are some formatting issues, export/import should be improved, etc. You should easily notice those shortcomings, because the samples I provide are not edited - this is what is _really_ produced by the tool. The good news is that all of those have been produced without XML or HTML formatting information - straight from the readable source code. But I use two lightweight helpers: single quoted words within comments become italic and the special banner --: separates groups of functions. I will stop explaining right now. You should be able to understand the tool solely from the above interfaces. If not, then it would mean that I have been wrong all along and that it is time for me to shut up. Jan

Henrik Nilsson

9 Feb 9 Feb

1:13 a.m.

Hi Jan, and Hi HaskellDocers, Jan Skibinski wrote:

...

On Thu, 8 Feb 2001 malcolm-hs@cs.york.ac.uk wrote: ....

...
My immediate aim is to have a tool that produces browsable documentation to be read by the user of a module, not necessarily an author or developer (i.e. external docs, not internal docs).

A good tool can produce either thing equally well. See below. ....

...
But the programmer will write many kinds of comment - not all are intended for users - many describe the implementation, which is of no interest to the mere user. So we need to distinguish different styles of comments. This can be done lexically or positionally, and we need to agree a standard.

My opinion is that there should be only one style of comment, the good and important comments. They can, of course belong to different entities, such as functions, classes, methods, datatypes, but all of them should be Good. If an entity is implementation specific then do not export it. But if you wish to make some special implementation notes in the exported entity then make them somewhere in the body of your code, not around the entity header, so the comment extractor will never touch it. But if you do write the comment around the header then be prepared to see it in documentation. The pressure is on you to do it right. You should be able to guarantee readability of the resulting interface.

As outlined in my previous mail, I think this might be a little too limiting, so my current prefernce is for some kind of comment tagging. But I might be swayed, and I agree that the conventions Jan outline are pleasantly lightweight.

...

...
I have another big requirement. The source code must remain readable as source code. I absolutely loathe so-called "literate programming" style, because it breaks this rule horribly. From my point of view, any new documentation standard must be as non-intrusive as possible. This almost immediately rules out XML-style tags I'm afraid. Other (less heavy) lexical conventions might be OK though, provided there are only a small number of them to be learnt.

So you, Henrik and I agree on this point. I have been advocating it quite strongly right here. (BTW, I do write a lot of HTML-based "literate" tutorials but they are not meant as libraries. Libraries should be readable and easy to maintain).

Yes, at least more or less! ;-)

...

...
I am intrigued by the use of positional cues (e.g. a comment just before or just after a type signature) as a clever way of associating documentation with code, whilst avoiding extra syntax.

For those who have not yet looked at Armin's HDoc, can I encourage you to do so, as a concrete example of how some of these ideas have been put into practice. He develops a "special" kind of comment, introduced by {--- rather than {-, and has some small lexical conventions that help to generate nice hyperlinked HTML.

Also, look at Jan Skibinski's tool which captures comments by their position with respect to a signature. Both are linked from the haskell.org page on libraries and tools.

I have been working very hard to provide some working examples of interfaces that have all of the above features outlined by you. I was hoping that they will help with this discussion. However, few cared to check them out, which disappoints me greatly.

Sorry. I just had other things to attend to too.

...

In addition, those who have tried must have misunderstood my links since the have not found any of the samples - which is shown in the logs of my website. So here is the careful explanation of those links, or rather the links to new and improved interface samples.

------------------------------------------------------ They are in a temporary location and they may be removed one day. I do not wish them to be indexed from this email message, therefore I provide the information in two stages: 1. The directory is http://www.numeric-quest.com/haskell/ This directory is already indexed, which should be. 2. Append to the above one of the following filenames: Extractor.short.html Extractor.long.html InterfacePrinter.short.html InterfacePrinter.long.html That's all.

Thanks, I've now found them.

...

All the above samples are HTML-"preformatted".

I have defined three types of interfaces so far: Short - listing exported features only ("external" iface) Long - listing all the features ("internal" iface) Coded - Long, but with source code included (prettyfied source code)

Jan, one question: what do you do/intend to do with *re*exported entities? As explained in my previous mail, I think external documentation should include everything a module exports, including the related documentation, regardless of where the various entities were originally defined.

...

The modules are still under development, and they do not do yet everything I want them to do. There are some unfinished portions of the code; for example, the printer does not handle the classes properly yet, there are some formatting issues, export/import should be improved, etc. You should easily notice those shortcomings, because the samples I provide are not edited - this is what is _really_ produced by the tool.

The good news is that all of those have been produced without XML or HTML formatting information - straight from the readable source code. But I use two lightweight helpers: single quoted words within comments become italic and the special banner --: separates groups of functions.

I will stop explaining right now. You should be able to understand the tool solely from the above interfaces. If not, then it would mean that I have been wrong all along and that it is time for me to shut up.

I have to say that this already looks like a quite useful tool, and it does show that one can go a long way with very lightweight conventions. A few points, tough. I'd again like to take the opportunity to push for the idea of also defining and intermediate format. The documentation samples Jan have presented are all very good examples of the kind of documentation renderings one might want. But on the other hand, I don't think they cover the entire spectrum of reasonable presentations. (I believe I have some support from at least Jan on this one.) This brings me my main concern: is the *style* of conventions Jan proposes (not necessarily exactly those conventions he currently use) flexible enough to: 1. support other reasonable renderings, 2. avoid creating a feeling among users that the conventions are "dictatorial" ;-) Regarding 2, I'm not entirely serious: by necessity, the conventions must be fairly strict. But after browsing through the Eiffel documentation Jan sent a pointer too, I personally felt that the required conventions were a bit too much. Now, I certainly do not wish to start a discussion about the Eiffel conventions. I'm just mildly concerned that too strict requirements might get in the way of getting wide acceptance for a standard. But maybe this is a non-issue. Regarding 1, below are a few examples of things that I think are reasonable. Please keep in mind that I too favour a fairly lightweight convention, but I am willing to allow a little extra noise for the sake of flexibility. To be concrete, I like the convention 'xxx' to mark a piece of code (such as a variable name) in running text which Jan proposes, I think one or two more such conventions would be useful (e.g. _xxx_ for emphasis). On the other hand, I don't necessarily see anything wrong with HDoc/JavaDoc-style tags for marking large things: the extra piece of information provided could be quite valuable. ----------------------------------------------------------------------- Example 1. I sometimes write functions which take tuples among their arguments and/or return tuples, and I sometimes find it useful to document each field separately. Returning a tuple immediately implies that it might be difficult to start each documentation block with a sentence that describes what the function returns (which seems to be your convention)? Here's an example of the convention I've used for such cases up until now (from the module LambdaLift in my compiler): -- Lambda lifting. -- m .......... The name of the module. -- locally bound functions to a triple of the expression -- to be substituted for the function names, a list of the -- top-level identifiers in the expression (just one), and a -- list of the free variables in the expression in question. -- lvdos ...... List of new (i.e. lifted) top-level definitions with -- associated occurrences of relevant top-level identifiers -- from previous expressions. -- e .......... The expression on which to perform lambda lifting. -- -- Returns a five tuple: -- #1 ......... List of new top-level definitions. -- #2 ......... List of top-level identifiers which occurs in the -- residual expression and in lrvs or which has been -- introduced. -- #3 ......... List of the free variables (not syntactically bound to a -- lambda abstraction) in the expression prior to inner -- lifting. -- #4 ......... List of the free functions (local variables bound to -- lambda abstractions) in the expression prior to inner -- lifting. -- #5 ......... Residual expression. ll :: PackedString -> [Id] -> Env Id -> Env (Exp a, [Id], [Id]) -> [(ValDef a, [Id])] -> Exp a -> NS ([(ValDef a, [Id])], [Id], [Id], [Id], Exp a) I guess I could be accused of writing too complicated functions (in particular if one is to believe what the Eiffel book said :-), but that's beside the point. This is real code, and I think it is reasonable to require that a documentation format should be flexible enough to document it and render the documentation in some nice way. Here's another example, showing a tuple argument: -- Pattern matching compiler. -- p .......... Compilation parameters. -- us ......... Variable vector ("SFIR" variables). -- qs ......... Pattern/RHS matrix. Four tuple: -- #1 .... Clause number. -- #2 .... Substitution for RHS of this clause. -- Extended as pattern matching compilation -- proceeds. -- #3 .... Pattern vector for this clause. -- #4 .... RHS of clause. Three tuple. A boolean indicates -- whether the RHS can fail. -- de ......... Default expression. Result in case of pattern matching -- failure. -- -- Returns a 3-tuple: -- #1 ......... A list of substitutions, one for each RHS. -- Note: This is FIR to SFIR substitutions, so this implies -- a circular definition. -- #2 ......... A fail flag that indicates whether the pattern matching -- may fail so that the value of the default expression is -- used. -- #3 ......... The resulting expression. -- -- Note: Newtypes are handled as any other algebraic type. This results -- in case constructs of the form (case <exp1> of $id <v> -> <exp2>) -- (no default clause). After type checking, these can be replaced by -- (let <v> = <exp1> in <exp2>). ------------------------------------------------------------------------ Example 2: The lambda lifter in the previous example illustrates another problem: how to deal with moands? NS happens to be a monad, so it is not really correct to claim that the function "returns a five tuple". In Armin's HDoc, there is a separte tag for marking monadic return values. (But I think they still renders as "Returns ..."?) In your case you seem to adopt the convention that functions with return type (IO a) should be described in imperative terms. This makes sense for IO and many other monads, but not necessarily for all monads. Do we need more structure here, or are monads just such a general concept that there is not much one can do about it? ------------------------------------------------------------------------ Example 3: This is really a variation on 2. Consider combinator libraries. Sometimes it makes sense to adopt special commenting conventions. Again, the Fudget GUI combinator library can serve as a useful example. The central abstraction in the Fudget GUI is the Fudget, which can be understood as a component with one user-visible input and one user-visible output. Such a component may or may not have a graphical rendering in the form of some GUI widget. Each fudget also has an invisible, low-level input/output pair which connects it to the outside world. An example could be a text input field. On the input it would accept strings which are shown as default values to the user. On the output, a user-entered string would appear as soon as the user presses return. The documentation for a function like foo :: a -> b -> F c d where (F c d) is a fudget with input of type d(!) and output of type c, would be something along the following lines: DESCRIPTION A foo fudget is a button that ... INPUT: True or False to switch the button on or off under program control. OUTPUT: True when the button is on, False when the button is off. ARGUMENTS: xxx :: a, ... yyy :: b, ... The point here is that describing foo as a function returning a Fudget isn't very helpful. Instead special conventions approporiate for the domainin question were adopted. I'm currently involved in the development of a framework for domain-specific languages called FRP (Functional Reactive Programming). One of its incarnations is as a combinator library. A central abstraction is (Behavior a b) which represents a "transformer" of signals of type a to signals of type b. I.e. there are some similarities to a fudget (F b a). Again, when we document a behaviour, it would be nice tobe able to talk about its inputs and outputs (which furthermore often happen to carry elements of tuple types). So, can we find conventions which are flexible enough to support documentation of this type of code? BTW, the Fudget library manual is available on line. It might be worth checking it out. For example: http://www.cs.chalmers.se/Cs/Research/Functional/Fudgets/Manual/current/smal... A concrete example illustrating the points above is: http://www.cs.chalmers.se/Cs/Research/Functional/Fudgets/Manual/current/togg... ----------------------------------------------------------------------- Example 4 Well, not an example actually, more of a "laundry list". I touched on most of thses in my previous mail: * A possibility to include pictures does not seem entirely unreasonable. * A convenient way of including a piece of code, e.g. a useage example. <code> </code> tags? * Explicit cross references Cf. the man-page "see also" style. Quite useful to be able to refer the reader to closely related functions in a large library, for instance. * Hints for generating indices at various level of detail (beginner's index, programmer's index, ...) ----------------------------------------------------------------------- Maybe there are other Haskell-isms we need to take into account. Compared to most imperative langauges, langauges like Haskell are extremely flexible, and pepole tend to use this flexibility in all kinds of innovative ways. Thus, documentation conventions that work very for an imperative language, might be too rigid for our porposes. I think Armin already has done a quite good job of identifying some of the issues in his HDoc tool through the selection of markup tags. Best regards, /Henrik -- Henrik Nilsson Yale University Department of Computer Science nilsson@cs.yale.edu

Manuel M. T. Chakravarty

12:21 a.m.

malcolm-hs@cs.york.ac.uk wrote,

...

I have another big requirement. The source code must remain readable as source code. I absolutely loathe so-called "literate programming" style, because it breaks this rule horribly. From my point of view, any new documentation standard must be as non-intrusive as possible. This almost immediately rules out XML-style tags I'm afraid. Other (less heavy) lexical conventions might be OK though, provided there are only a small number of them to be learnt.

I am 100% with you here. If a documentation standard makes the code in my editor only the slightest bit less readable, I won't use it. This, btw, also means that the tool has to understand -- bla bla bla -- blub blub blub kind of comments and not only something like {- bla bla bla blub blub blub -} Nested comments don't go well with syntax highlighting.

...

I am intrigued by the use of positional cues (e.g. a comment just before or just after a type signature) as a clever way of associating documentation with code, whilst avoiding extra syntax.

I like that, too.

...

For those who have not yet looked at Armin's HDoc, can I encourage you to do so, as a concrete example of how some of these ideas have been put into practice. He develops a "special" kind of comment, introduced by {--- rather than {-, and has some small lexical conventions that help to generate nice hyperlinked HTML.

Does HDoc also grok ---- for a comment block rather than --? Maybe it must actually be something like -- -- to make sure ---- is not read as one token by a Haskell compiler (by longest match rule). Cheers, Manuel

Henrik Nilsson

1:12 a.m.

Hi all, Malcolm Wallace wrote:

...

Hello to everyone who has joined the HaskellDoc mailing list. We had a little bit of discussion before announcing the list more widely, but everything now seems to have stopped dead. So it's time to get thoughts rolling again. Do check the list archive on haskell.org to see what has already happened.

I'll start by declaring my interest in automatic documentation.

Position statements seem to be order of the day (or week, maybe). So here are some points on what I believe a good standard should look like. 1. I agree with Malcolm and Jan Skibinski that the documentation conventions need to be lightweight. (I too dislike literate programming, except possibly when the aim is to write a paper or a book.) 2. I think the documentation standards should be able to support both internal and external documentation. 3. I believe a standardized, intermediate, "raw" documentation format would be useful. 4. I think the intermedite format should be based on XML. I will now discuss each point in turn. ----------------------------------------------------------------------- 1. I agree with Malcolm and Jan that the documentation conventions need to be lightweight. (I too dislike literate programming, except possibly when the aim is to write a paper or a book.) However, I think that relying solely on positional cues might be too constraining and (in te long run) inflexible. So personally, I think HDOc/JavaDoc-like tags is a good compromise. To that, I also see a need to add some lightweight conventions for markup of explanatory text. E.g. I'd like to be able to mark variable names, emphasize a piece of text, and maybe include small code fragments. Jan propses to use conventions like 'xxx' for variable names and "zzz" for emphasis (I think). That's probably reasonable, and I indeed use the 'xxx' convention in my own comments sometimes. But one should be aware that this useage can conflict with the normal meaning of the quote characters. In particular, other lightweight emphasis conventions like _yyy_ or *zzz* spring to mind. I would even find it acceptable with some more heavyweight conventions for marking entire paragraps, such as <code> and </code>. This would be very useful for including useage examples in external documentation, for instance. I would like the possibiility to include pictures (as opposed to having to rely on ASCII graphics). Take a look at the Fudgets documentation for examples showing how useful this can be. Finally, I do agree with Malcolm that XML is far to heavy to be used at this level. ----------------------------------------------------------------------- 2. I think the documentation standards should be able to support both internal and external documentation. By internal documentation, I mean documentation of the source code as such, intended for people who needs to read and understand source code such as developers and maintainers. By external documentation I mean documentation of interfaces, intended for people who needs to use a piece of software but who do not need to know about the internal details. I guess this mainly applies to library interfaces, but one could also consider manpage-style application documentaion (cf. POD from the Perl world). Since the markup needs for internal and external documentation are pretty similar, I don't think it will be very difficult to develop a standard supporting both. The main thing which has to be added is a way of declaring if a piece of documentation is for internal or external (or maybe both) use. Having two different commenting conventions (e.g. "{--" and "{---") would be a possibility. Another possibility, probably more flexible, is to have some initial tag. For external documentation, it may also be useful to have a possibility to generate documentation at different levels of detail. For instance, for a very large library, it might be uesful to have both brief beginner documentation, more extensive programmer documentation, and full documentation (e.g. including obsolete, deprecated features). Again, the Fudget documentation is a good example (and where I picked up the idea). It would seem as if a comment classification scheme based on initial tags easily could be adapted for this kind of use as well. Once the documentation comments have been classified, generating internal or external documentation is rally a tool issue. For internal documentation, a tool would basically just have to extract type signatures (or infer them), type definitions, class definitions, etc. along with all internal documentation comments. For proper external documentation, a good tool also has to take import and export into account. For instance, a library could be made up of a number of modules which are collected and re-exported by one single "top-level" module. The users are not supposed to have to know about the internal library structure, but only sees the one module. Thus, when generating documentation for this module, the tool would have to collect documentation for the re-exported entities from _other_ modules. ------------------------------------------------------------------------ 3. I believe a standardized, intermediate, "raw" documentation format would be useful. I've argued above that it would be desirable to support at least two different types of documentation. Furthermore, documentation could conceivably be rendered in a plethora of different formats: HTML, PDF, postscript, info, LaTeX, DocBook, etc. Different people and organizations may even have specialized formatting needs. For instance, assuming e.g. a HDoc/JavaDoc-like convention where the very first sentence of a doocumentation comment gives a synopsis, someone maintaining a collection of libraries (e.g. on haskell.org) might like a tool that extracts only this information for each library, so that someone browsing through the collection of libraries quickly can determine whether a particular library fits the bill or not. Or imagine an organization where all documentation has to conform to some strictly defined, internal standard. The possibilities are, if not endless, at least extensive. Add to this other applications such as searching through a library (or collection of libraries) based of type information (an old idea which often is quite useful, but sadly neglected in today's functional programming environments). All tools carrying out tasks like those suggested above share a common need: a (preferably easy) way to extract "meta" information from source code. For exaple: * Names of all exported entities (i.e. "canonical", fully expanded export list). * Origin info for exported entities not defined locally. * Names of all top-level entities defined in a module. * For types and classes, their definitions. * Type signatures for functions and method instances. * Author-supplied documentation associated with the various top-level entities. * Maybe source code positions, or at least the name of the file in which something is defined. * Fixity declarations. * Perhaps even strictness signatures. There are different ways to get such information. In some cases, simple matching based on regular expressions might be enough. Unfortunately, such solutions tend to be fragile, in particular for a language whith the lexical and syntactical conventions of Haskell (take nested comments, for one example). It is also unclear to what extent such solutions could be shared between different tools. Another approach would be to provide a (simplified, specialized) Haskell parser with a clearly defined interface making information like what was described above available. This would no doubt prove to be very popular for people wanting to develop various documentation tools. But if this interface was to be standardized, e.g. in the form of an algebraic data type in Haskell, then this would not be directly useful for people wishing to develop using some other language. Also, Haskell types are not very extensible, which would create all sorts of compatibility problems if the standard was to evolve. A third approach would be to define a standard, intermediary documentation format which is easy to generate (once one have the necessary information) and parse. Then, as long as at least one tool generating this format exists, it would be be relatively straightforward to develop all sorts of formatters and other creative applications around this. (Looking back at the history of Haskell documentation tools, this has actually happened at least three times: "FudgetsDoc" and HaskellDoc both used HBC's interface files to get type information, and more recently Jan Skibinski's source code browser which uses GHC's interface files in a similar way. But of course, in all cases, these tools became tied to one (or two) particular compiler(s), they became likely to break if the format of the interface files changed, and they were limited by the information that happened to be available. Hence the need for a standard.) Personally, I think a compiler would be in a good position to generate intermediary documentation files since it has access to all (or at least most) informatin that is needed. (This is also not without precedent: Sun's Workshop C compiler can emit information for a browsing tool, the CenterLinc C compiler used to do something similar, and asking compilers for module dependence information is a basically a simple instance of the same idea.) On the other hand, there are some problems such as the need to respect user-supplied type signatures (as opposed to always using the inferred ones), and the fact that the types of non-exported entities might be thrown away at some inconveniently early point. So not everyone likes this. However, how intermediary documentation is generated is a secondary issue. Having a well-specified format means that anyone who would like to write a tool supplying such information has something to aim at, and that anyone who is manly interested in doing something with such information has a goodplace to start from. Finally, I believe that developing the source-level documentation conventions and an intermediary documentation format in parallel will be mutually beneficial. Defining the intermediary format will force us to think about what documentation *is* (without the need to consider specific renderings) and thus what information that needs to be provided by the commenting conventions. Converesly, practical requirements such as the source code remaining legible with prevent the intermediary format from becoming too unwieldy. ----------------------------------------------------------------------- 4. I think the intermedite format should be based on XML. I think this simply because XML is a rapidly emerging standard which was developed with precicely this kind of appliction (sematic markup) in mind. A large number of tools related to XML is already available, including some Haskell ones. Best regards, /Henrik -- Henrik Nilsson Yale University Department of Computer Science nilsson@cs.yale.edu

8926

Age (days ago)

8927

Last active (days ago)

List overview

Download

4 comments

4 participants

participants (4)

Henrik Nilsson
Jan Skibinski
malcolm-hs＠cs.york.ac.uk
Manuel M. T. Chakravarty