A restricted subset of CPP included in a revision of Haskell 98

Hi,

I find it strange that right now almost every Haskell program directly or indirectly (through FPTOOLS) depends on CPP, yet there is no effort to replace CPP with something better or standardize its usage in Haskell. According to the following document, and my own limited experience in reading Haskell code, CPP is the most frequently used extension:

http://hackage.haskell.org/trac/haskell-prime/wiki/HaskellExtensions

I think that if we accepted that CPP was part of the language, we could then place some restrictions on its use to facilitate easier parsing. Here are some suggestions, off the top of my head:

* #define can only be used for parameterless definitions
* #define'd symbols are only visible to the preprocessor
* #define can only give a symbol a value that is a valid preprocessor expression
* #define can only appear above the module declaration
* a preprocessor symbol, once defined, cannot be undefined or redefined
* #include and #undef are prohibited
* The preprocessor can only be used at the top level. In particular, a preprocessor conditional, #error, #warn, or #line would not be allowed within the export list or within a top-level binding.
* A Haskell program must assume that any top-level symbol definitions are constant over the entire program. For example, a program must not depend on having one module compiled with one set of command-line preprocessor symbol bindings and another module compiled with a different set of bindings.
* preprocessor directives must obey Haskell's layout rules. For example, an #if cannot be indented more than the bindings it contains.

The result would be:

* Syntax can be fully checked without knowing the values of any preprocessor symbols.
* Preprocessor syntax can be added easily to a Haskell parser's BNF description of Haskell.
* No tool will need to support per-file/module preprocessor symbol bindings.

Again, all this is just off the top of my head. I am curious about what problems these restrictions might cause, especially for existing programs. I know that GHC itself uses some features that would be prohibited here. But GHC is really difficult for tools to handle even with these restrictions on its source code. For now, I am more interested in the libraries in FPTOOLS and users' programs. What libraries/programs cannot easily be reorganized to meet these restrictions? I suspect "#define'd symbols are only visible to the preprocessor" would be the most troublesome one.

Thanks,
Brian
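As a rough illustration of the suggestions above (it is not taken from the thread), a module that follows them might look like the sketch below. The symbol USE_FOLDR and the function total are invented for the example, and the module assumes GHC's CPP extension.

```haskell
{-# LANGUAGE CPP #-}
-- Every #define sits above the module declaration, takes no parameters,
-- and has a value that is a valid preprocessor expression.
-- USE_FOLDR is a hypothetical symbol invented for this sketch.
#define USE_FOLDR 1

module Main where

-- The conditional appears only at the top level, around complete
-- bindings, and is not indented more than the bindings it contains.
-- The symbol is used only by the preprocessor, never in Haskell code.
#if USE_FOLDR
total :: [Int] -> Int
total = foldr (+) 0
#else
total :: [Int] -> Int
total = sum
#endif
```

Loaded into GHCi, `total [1, 2, 3]` evaluates to 6 whichever branch is selected, which is the point: the two alternatives are whole bindings with the same interface.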

On Thu, Aug 17, 2006 at 11:44:17AM -0500, Brian Smith wrote:
> Hi,
> I find it strange that right now almost every Haskell program directly or indirectly (through FPTOOLS) depends on CPP, yet there is no effort to replace CPP with something better or standardize its usage in Haskell. According to the following document, and my own limited experience in reading Haskell code, CPP is the most frequently used extension: http://hackage.haskell.org/trac/haskell-prime/wiki/HaskellExtensions I think that if we accepted that CPP was part of the language, we could then place some restrictions on its use to facilitate easier parsing. Here are some suggestions, off the top of my head:

See this paper for some interesting work on the subject:
http://citeseer.ist.psu.edu/wansbrough99macros.html

There would be no need to integrate it with compilers; it could be a stand-alone tool, like hsc2hs.

John
--
John Meacham - ⑆repetae.net⑆john⑈

On 8/17/06, John Meacham wrote:
> On Thu, Aug 17, 2006 at 11:44:17AM -0500, Brian Smith wrote:
> > Hi,
> > I find it strange that right now almost every Haskell program directly or indirectly (through FPTOOLS) depends on CPP, yet there is no effort to replace CPP with something better or standardize its usage in Haskell.
> See this paper for some interesting work on the subject. http://citeseer.ist.psu.edu/wansbrough99macros.html

Thanks for that. I should not have said "there is no effort to replace CPP" before. I hope I did not offend anybody who has worked on this problem previously.

I was also mistaken in saying that syntax could be fully checked without knowing any preprocessor symbol bindings. This is only true if one gets rid of the ability to choose between two syntaxes via the preprocessor. But if we allow syntax that we can't parse (but presumably another implementation can), then the preprocessor must remain a true preprocessor. Then there isn't much reason to place so many restrictions on where the various preprocessor directives may appear.

I proposed to limit where #define could appear mostly for aesthetic reasons. If #define, #error, and #warn only appear at the beginning of a file, then the rest of the file would only contain Haskell syntax in between #if...#else...#endif. Also, a refactoring tool would not have these directives get in its way.

I want to have conditionals limited in their placement to make things easier for refactoring tools. But I don't have any ideas about how to deal with conditional exports without allowing preprocessor conditionals in the export list.

* #define can only be used for parameterless definitions
* #define'd symbols are only visible to the preprocessor
* #define can only give a symbol a value that is a valid preprocessor expression
* #define, #error, and #warn can only appear above the module declaration
* a preprocessor symbol, once defined, cannot be undefined or redefined with a different value
* #include and #undef are prohibited
* The preprocessor can only be used at the top level. In particular, a preprocessor conditional or #line would not be allowed within the export list or within a top-level binding.
* A Haskell program must assume that any top-level symbol definitions are constant over the entire program. For example, a program must not depend on having one module compiled with one set of command-line preprocessor symbol bindings and another module compiled with a different set of bindings.
* preprocessor directives must loosely obey a very simple layout rule: an #if, #else, or #endif cannot be indented more than the bindings it "contains."
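To make the point about tools concrete, here is a minimal sketch of how a checker for a few of the placement rules might look. It is only an illustration under stated assumptions: the names Violation, directive and checkSource are invented here, the test for the module declaration is deliberately naive, and only three of the rules are covered (prohibited directives, parameterless #define, and directives that must stay above the module declaration).

```haskell
import Data.Char (isAlphaNum, isSpace)
import Data.List (isPrefixOf)

-- | A rule violation: 1-based line number and a description.
-- (Hypothetical type, invented for this sketch.)
data Violation = Violation Int String
  deriving (Eq, Show)

-- | Split a directive line such as "#  define FOO 1" into its name
-- ("define") and the remaining text; Nothing for ordinary lines.
directive :: String -> Maybe (String, String)
directive line =
  case dropWhile isSpace line of
    '#' : rest ->
      let (name, args) = span isAlphaNum (dropWhile isSpace rest)
      in if null name then Nothing else Just (name, dropWhile isSpace args)
    _ -> Nothing

-- | Check a source text against three of the suggested restrictions.
checkSource :: String -> [Violation]
checkSource src = concatMap check numbered
  where
    numbered :: [(Int, String)]
    numbered = zip [1 ..] (lines src)

    -- Line number of the first module declaration, if any (naive test).
    moduleLine = case [n | (n, l) <- numbered, "module " `isPrefixOf` l] of
      (n : _) -> Just n
      []      -> Nothing
    afterModule n = maybe False (n >) moduleLine

    check (n, l) = case directive l of
      Nothing -> []
      Just (name, args) ->
        [ Violation n ("#" ++ name ++ " is prohibited")
        | name `elem` ["include", "undef"] ]
        ++ [ Violation n "#define takes parameters"
           | name == "define", hasParams args ]
        -- The thread writes #warn; GNU cpp spells it #warning.
        ++ [ Violation n ("#" ++ name ++ " appears below the module declaration")
           | name `elem` ["define", "error", "warn", "warning"], afterModule n ]

    -- A macro has parameters when '(' immediately follows its name.
    hasParams args = case dropWhile isIdentChar args of
      '(' : _ -> True
      _       -> False
    isIdentChar c = isAlphaNum c || c == '_'
```

Because every check looks at one line plus the position of the module declaration, none of them needs to know the value of any preprocessor symbol.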

On Thursday, August 17, 2006 7:54 PM, Brian Smith wrote:
> I want to have conditionals limited in their placement to make things easier for refactoring tools. But, I don't have any ideas about how to deal with conditional exports without allowing preprocessor conditionals in the export list.

It seems to me that all uses of the preprocessor could be avoided except for cases like:

    #ifdef _SPARC
    -- sparc code
    #else
    #ifdef _INTEL86
    -- i86 code
    #else
    -- byte code
    #endif
    #endif

and the above could, as far as I can see, be dealt with by having a conditional import directive, e.g.:

    module Platforms (Platform(..)) where
    data Platform = Sparc | Intel | ByteCode

    module Client where
    import Platforms
    import qualified (
        case #Platform of
            Sparc -> Compiler.Sparc.CodeGen
            Intel -> Compiler.Intel.CodeGen
            _     -> Compiler.ByteCode.CodeGen
        ) as CodeGen

where a leading '#' denotes a preprocessor symbol (corresponding to the type of the same name) which can only be set outside the program, i.e. on the command line, thus ensuring that the same module can't have multiple interpretations in the same program. Conditions could be formed using case, if, and expressions which can be evaluated at compile time.

Of course this would require some effort to modify existing code, but it would have the great advantage that the conditional compilation would be well typed and be part of the normal grammar, thus making it easier to write refactoring tools.

Regards,
(another) Brian.
--
Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us.
http://www.metamilk.com
_____________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

On Aug 17, 2006, at 17:11, Brian Hulley wrote:
> On Thursday, August 17, 2006 7:54 PM, Brian Smith wrote:
> > I want to have conditionals limited in their placement to make things easier for refactoring tools. But, I don't have any ideas about how to deal with conditional exports without allowing preprocessor conditionals in the export list.
> It seems to me that all uses of the preprocessor could be avoided except for cases like:
>
>     #ifdef _SPARC
>     -- sparc code
>     #else
>     #ifdef _INTEL86
>     -- i86 code
>     #else
>     -- byte code
>     #endif
>     #endif

That's one of the worst ways to use CPP. The code generator should have a parameter that determines what to generate code for. That's much nicer in many ways.

-- Lennart
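A minimal sketch of what a parameterised code generator could look like is given below. It is an illustration only: the Platform type follows the one in the quoted proposal, while Instr, emit, codeGen and the output strings are placeholders invented here and do not correspond to any real instruction set or to GHC's code generator.

```haskell
-- | Target platforms, as in the Platform type of the quoted proposal.
data Platform = Sparc | Intel | ByteCode
  deriving (Eq, Show)

-- | A toy instruction set (hypothetical, for illustration only).
data Instr = Push Int | Add
  deriving (Eq, Show)

-- | Emit placeholder text for one instruction. The target is an ordinary
-- argument, so every branch is compiled and type-checked on every host.
emit :: Platform -> Instr -> String
emit Sparc    (Push n) = "sparc: push " ++ show n
emit Sparc    Add      = "sparc: add"
emit Intel    (Push n) = "intel: push " ++ show n
emit Intel    Add      = "intel: add"
emit ByteCode (Push n) = "bc: PUSH " ++ show n
emit ByteCode Add      = "bc: ADD"

-- | Generate code for a whole program for the chosen target.
codeGen :: Platform -> [Instr] -> [String]
codeGen platform = map (emit platform)
```

With this shape the target can be chosen at run time, for example from a command-line flag, and a single build can generate code for any of the platforms.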

brianlsmith:
> Hi, I find it strange that right now almost every Haskell program directly or indirectly (through FPTOOLS) depends on CPP, yet there is no effort to replace CPP with something better or standardize its usage in Haskell. According to the

Note also cpphs, http://www.cs.york.ac.uk/fp/cpphs/

-- Don

Even though I'm largely responsible for making CPP available in a Haskell compiler, I think it's an abomination. It should be avoided. If we standardize it, people will use it even more. I think we should discourage it instead, then look at exactly what it's used for and supply sane versions of that.

-- Lennart

On Aug 17, 2006, at 12:44, Brian Smith wrote:
> Hi,
> I find it strange that right now almost every Haskell program directly or indirectly (through FPTOOLS) depends on CPP, yet there is no effort to replace CPP with something better or standardize its usage in Haskell. [...]

On Thu, 17 Aug 2006, Brian Smith wrote:
> I find it strange that right now almost every Haskell program directly or indirectly (through FPTOOLS) depends on CPP, yet there is no effort to replace CPP with something better or standardize its usage in Haskell.

I think there should be more effort to avoid CPP completely. My experience with Modula-3 is that you can nicely separate special-purpose stuff into modules which are included depending on some conditions. Say you want the same module both for Windows and Unix: you provide directories WIN32 and POSIX containing implementations with the same interface, and then the make system can choose the appropriate directory. See for example the handling of line-end coding in Windows and Unix:
http://www.elego-software-solutions.com/cgi-bin/cvsweb.cgi/cm3/m3-libs/libm3...

Henning Thielemann wrote:
> On Thu, 17 Aug 2006, Brian Smith wrote:
> > I find it strange that right now almost every Haskell program directly or indirectly (through FPTOOLS) depends on CPP, yet there is no effort to replace CPP with something better or standardize its usage in Haskell.
> I think there should be more effort to avoid CPP completely.

I agree, especially as I'm trying to write an editor for Haskell which will certainly not cope with CPP at all! ;-)

The reason it would not cope is that CPP turns what would otherwise be one program/module/library into several programs/modules/libraries which simultaneously co-exist in the same text in a rather uneasy and vague relationship. What's even worse, the same module can have multiple meanings in the *same* program depending on the use of #ifdef, #undef, etc., thus making code navigation quite impossible: the meaning of each module now depends on how you got there and might even be different the second time round...

It is also notoriously difficult for people to understand code full of #ifdef's.

> My experiences with Modula-3 are, that you can nicely separate special-purpose stuff into modules which are included depending on some conditions. Say you want the same module both for Windows and Unix, you provide directories WIN32 and POSIX containing implementations with the same interface and then the make system can choose the appropriate directory.

It would also be nice to have a very simple build system instead of requiring makefiles, i.e. so that any Haskell program, even ones involving other tools such as attribute grammar desugaring, could be built using ghc --make (or some other tool which took only the source plus a description of the available tools as input).

I think the acid test would be to reach a point where anyone can download the source for some large program such as GHC, just type ghc --make Main, and expect the program to be built in one pass with no problems.

Regards,
Brian.
--
Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us.
http://www.metamilk.com

[ I'm just working through a large backlog of mails, so the original message is a bit old... :-) ]

On Sunday, 20 August 2006 22:37, Henning Thielemann wrote:
> On Thu, 17 Aug 2006, Brian Smith wrote:
> [...]
> I think there should be more effort to avoid CPP completely. My experiences with Modula-3 are, that you can nicely separate special-purpose stuff into modules which are included depending on some conditions. Say you want the same module both for Windows and Unix, you provide directories WIN32 and POSIX containing implementations with the same interface and then the make system can choose the appropriate directory.
> [...]

That's a nice theory, but this doesn't work in practice, at least not for me. The problem in my OpenGL/GLUT/... bindings is that the calling convention to the native libraries is different on Windows, and there is no "Haskell way" to parametrize this. Therefore using a preprocessor is the only sane way I see here. Having to duplicate e.g. 567 "foreign imports" just to avoid CPP in the OpenGL package is a rather bad tradeoff IMHO. Almost everything is better than redundancy, even CPP...

Another use of CPP in the OpenGL package is to access OpenGL extension entry points. Here CPP is used to generate a 'foreign import "dynamic"' and two Haskell functions per extension entry. Perhaps this could be done via TH, but this would limit the portability, again a bad tradeoff.

I would be glad if there were other ways to achieve these things, but I fail to see them.

Cheers,
S.
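The calling-convention case can be sketched as follows. This is not the actual source of the OpenGL package, only a minimal illustration of the general idiom under stated assumptions: USE_STDCALL is a hypothetical symbol invented here, and the C functions sin and cos stand in for native library entry points so that the example links without any extra library.

```haskell
{-# LANGUAGE CPP #-}
{-# LANGUAGE ForeignFunctionInterface #-}

-- USE_STDCALL is a hypothetical symbol for this sketch; a build would pass
-- -DUSE_STDCALL only for a native library that exports stdcall functions.
#if defined(USE_STDCALL)
#define CALLCONV stdcall
#else
#define CALLCONV ccall
#endif

import Foreign.C.Types (CDouble)

-- Each foreign import is written once; only the macro differs between
-- builds. sin and cos are stand-ins and really use the C calling
-- convention, so USE_STDCALL should stay undefined when trying this out.
foreign import CALLCONV unsafe "sin" c_sin :: CDouble -> CDouble
foreign import CALLCONV unsafe "cos" c_cos :: CDouble -> CDouble
```

Note that the macro is expanded inside Haskell source text, so this idiom is exactly what the suggested rule "#define'd symbols are only visible to the preprocessor" would forbid.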
participants (7)
- Brian Hulley
- Brian Smith
- dons@cse.unsw.edu.au
- Henning Thielemann
- John Meacham
- Lennart Augustsson
- Sven Panne