
On 19 October 2004 09:58, Malcolm Wallace wrote:
Isaac Jones writes:
The issue with cpp is that we can't go by extensions as we do with the rest of the preprocessors... There is a function in HMake which tests to see if a file needs to be cpp'd, so we can employ that. I think we'll probably have to just treat cpp a little differently from the others, unfortunately, and I haven't gotten around to it.
On the other hand, you could reject the convention that cpp sources are not distinguished by file extension - e.g. create a new convention that, say, a .cpphs extension is preprocessed by cpp(hs) to get a plain .hs file.
Since Cabal is pretty new, this won't break any existing Cabal packages, and when converting non-Cabal packages to Cabal, there is some work to do anyway, so why not just adopt this as one extra rule to follow?
This is just a suggestion - I'm in two minds whether it is a good idea myself, but it is at least worth considering the possibility.
And I suppose the literate version would be .lcpphs? (unlit first, then cpp, then Haskell). It would be more consistent and arguably correct, but I'm not sure that we should do it.
Another solution is to adopt a new extension for plain Haskell, say .phs. The conversion from .hs to .phs is either via CPP or just 'cat', depending on some setting somewhere.
Also, I recommend that we use the compiler itself for preprocessing:
ghc -E foo.hs -o foo.phs
because only the compiler knows what the values for the preprocessor symbols __HASKELL__, __GLASGOW_HASKELL__, i386_TARGET_ARCH etc. should be. Otherwise we'll have to run the compiler during ./setup configure to find out the values of these symbols (isn't that what hmake does? What about when a new compiler comes along?).
Cheers, Simon
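Editorial sketch: the check Isaac mentions at the top of this message (testing whether a file needs cpp at all) could look roughly like the following. This is not HMake's actual function. The names `needsCpp`, `isCppLine` and `cppDirectives` are assumptions made for illustration, and the test is only a line-based heuristic.

```haskell
import Data.Char (isSpace)

-- Directive names treated as evidence that a file wants cpp.
-- The list is an assumption for this sketch, not taken from HMake.
cppDirectives :: [String]
cppDirectives =
  ["if", "ifdef", "ifndef", "elif", "else", "endif", "define", "undef", "include"]

-- A line counts as a cpp line when it has '#' in the first column,
-- followed (after optional spaces) by a known directive name.
isCppLine :: String -> Bool
isCppLine ('#' : rest) = directive `elem` cppDirectives
  where
    directive = takeWhile (not . isSpace) (dropWhile isSpace rest)
isCppLine _ = False

-- A source text needs cpp when any of its lines is a cpp line.
needsCpp :: String -> Bool
needsCpp = any isCppLine . lines
```

Loaded into an interpreter, `needsCpp` can be applied to the contents of a file read with `readFile`. A heuristic like this can misfire, for example on a Haskell operator that happens to start a line with '#', which is part of why the thread goes on to discuss explicit file extensions instead.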

"Simon Marlow"
And I suppose the literate version would be .lcpphs? (unlit first, then cpp, then Haskell).
Maybe. Generally speaking, literate comments and cpp markings are independent, so it doesn't really matter what order you process them in. You could define the convention as either .lcpphs or .cpplhs.
It would be more consistent and arguably correct, but I'm not sure that we should do it.
I agree that the issue is not clear cut.
Another solution is to adopt a new extension for plain Haskell, say .phs. The conversion from .hs to .phs is either via CPP or just 'cat', depending on some setting somewhere.
Not very pretty.
Also, I recommend that we use the compiler itself for preprocessing:
ghc -E foo.hs -o foo.phs
You are right that the compiler is best placed to define pp symbols, so this is all very well, but neither nhc98 nor Hugs currently have the -E option to stop immediately after pp. And come to think of it, the only real reason to have cpp done separately at all is because Hugs does not have a preprocessor call builtin, like ghc and nhc98 do. So maybe the best solution is to ship Hugs with -F"cpphs.hugs" enabled by default? Then no separate extension would be required, and Cabal could just defer all cpp-ing to the compiler.
Another thought occurs to me. Does anyone use cpp markings in conjunction with any other preprocessors? For instance, cpp + Happy, cpp + DRiFT? What ordering applies there? I'm inclined to think that it would nearly always be cpp first, other preprocessors second, but perhaps not? After all, the cpp markings would probably still be conditioned on the end compiler, not on the intermediate pp?
Regards, Malcolm
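Editorial sketch: to make the point about compiler-defined symbols concrete, here is a small module that branches on two of the symbols named in this thread. It assumes a current GHC, where cpp is switched on with the `{-# LANGUAGE CPP #-}` pragma (the thread itself predates that pragma and uses the -cpp flag). The names `compilerName` and `ghcVersion` are hypothetical.

```haskell
{-# LANGUAGE CPP #-}

-- Which implementation preprocessed this file, judged only by the
-- symbols the implementation itself defines.
compilerName :: String
#if defined(__GLASGOW_HASKELL__)
compilerName = "GHC"
#elif defined(__HUGS__)
compilerName = "Hugs"
#else
compilerName = "unknown"
#endif

-- Under GHC the symbol expands to an integer version number,
-- so it can be used directly as a literal.
ghcVersion :: Maybe Int
#if defined(__GLASGOW_HASKELL__)
ghcVersion = Just __GLASGOW_HASKELL__
#else
ghcVersion = Nothing
#endif
```

The same source run through a standalone cpp with no -D flags would take the "unknown" branch, which is the situation Simon's suggestion of letting the compiler do the preprocessing avoids.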

On Tue, Oct 19, 2004 at 01:41:53PM +0100, Malcolm Wallace wrote:
Simon Marlow writes:
Another solution is to adopt a new extension for plain Haskell, say .phs. The conversion from .hs to .phs is either via CPP or just 'cat', depending on some setting somewhere.
Not very pretty.
Not very pretty at all.
Also, I recommend that we use the compiler itself for preprocessing:
ghc -E foo.hs -o foo.phs
You are right that the compiler is best placed to define pp symbols, so this is all very well, but neither nhc98 nor Hugs currently have the -E option to stop immediately after pp. And come to think of it, the only real reason to have cpp done separately at all is because Hugs does not have a preprocessor call builtin, like ghc and nhc98 do. So maybe the best solution is to ship Hugs with -F"cpphs.hugs" enabled by default? Then no separate extension would be required, and Cabal could just defer all cpp-ing to the compiler.
Hugs's implementation of -F is a bit clunky:
- it slows everything down (Hugs examines most modules twice: first to get the imports and later to actually read the whole thing, so a preprocessor gets run twice)
- error handling is terrible
I agree with Henrik about doing the preprocessing for Hugs at installation or packaging time, so that users don't need the full environment. That's what currently happens with the fptools libraries.

Ross Paterson
Hugs's implementation of -F is a bit clunky:
- it slows everything down (Hugs examines most modules twice: first to get the imports and later to actually read the whole thing, so a preprocessor gets run twice)
- error handling is terrible
Yeah, probably a bad idea to force cpp on all users, whether they want it or not.
I agree with Henrik about doing the preprocessing for Hugs at installation or packaging time, so that users don't need the full environment. That's what currently happens with the fptools libraries.
OK, so I think we are probably agreed on this then:
* When Cabal is installing for Hugs, it does 'cpp -D__HUGS__' (or equivalent) on all Haskell source files, as it copies them into the installation location.
If any more complicated situations arise between the ordering of cpp and other preprocessors, use a chain of file extensions to disambiguate? e.g.
.ly.cpp = cpp first -> .ly, then literate happy to get .hs file.
.cpp.ly = literate happy -> .cpp.hs, then cpp to get plain .hs file.
Regards, Malcolm
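Editorial sketch: the extension-chain idea above can be read as "apply the preprocessor named by the rightmost extension first". A minimal illustration follows, assuming only the two extensions used in Malcolm's examples. The type `Step` and the functions `stepFor`, `splitOnDot` and `pipeline` are hypothetical names, not part of Cabal.

```haskell
import Data.Maybe (mapMaybe)

-- The preprocessing steps that appear in the examples above.
data Step = Cpp | LiterateHappy
  deriving (Eq, Show)

-- Map a single extension (without the dot) to a step, if it names one.
stepFor :: String -> Maybe Step
stepFor "cpp" = Just Cpp
stepFor "ly"  = Just LiterateHappy
stepFor _     = Nothing

-- Split a file name at every '.' character.
splitOnDot :: String -> [String]
splitOnDot s = case break (== '.') s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOnDot rest

-- Steps in the order they would run: rightmost extension first.
-- Extensions that name no step (such as "hs") are ignored.
pipeline :: FilePath -> [Step]
pipeline path = mapMaybe stepFor (reverse (drop 1 (splitOnDot path)))
```

With these definitions, `pipeline "Parser.ly.cpp"` gives `[Cpp,LiterateHappy]` and `pipeline "Parser.cpp.ly"` gives `[LiterateHappy,Cpp]`, matching the two orderings spelled out in the message.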

I've recently run into a problem using the preprocessor (ghc -cpp). It seems it barfs on 's (apostrophe). Annoying, since naming variables something-prime is a fairly common idiom. Is this something that has a workaround, or could be fixed?
(The specific problem is redefining 'head' in a macro and using it like:
let w' = head ... -- breaks
One workaround is to write it as:
let w' = head ... -- works
)
-kzm
-- If I haven't seen further, it is by standing in the footprints of giants

Ketil Malde
I've recently run into a problem using the preprocessor (ghc -cpp). It seems it barfs on 's (apostrophe). Annoying, since naming variables something-prime is a fairly common idiom.
Is this something that has a workaround, or could be fixed?
The main workaround if using traditional cpp is to avoid apostrophes. :-( On the other hand, cpphs has a more liberal lexical policy, and permits both apostrophes and backticks within a token. So, your example
let w' = head ... -- breaks
would work fine in cpphs, and other tricks impossible with the original cpp are also possible, like this:
#define `mplus` +++
The mismatch between C's and Haskell's lexical syntax was one important motivation for developing cpphs.
Regards, Malcolm

Hi all, Malcolm Wallace wrote:
Ketil Malde writes:
I've recently run into a problem using the preprocessor (ghc -cpp). It seems it barfs on 's (apostrophe). Annoying, since naming variables something-prime is a fairly common idiom.
Is this something that has a workaround, or could be fixed?
The main workaround if using traditional cpp is to avoid apostrophes. :-(
And this, of course, is one reason why it is good if it is easy to run CPP only on those files where it is necessary. (Unless everyone uses "cpphs", then, which ultimately would seem like a good idea.) Best, /Henrik -- Henrik Nilsson School of Computer Science and Information Technology The University of Nottingham nhn@cs.nott.ac.uk This message has been scanned but we cannot guarantee that it and any attachments are free from viruses or other damaging content: you are advised to perform your own checks. Email communications with the University of Nottingham may be monitored as permitted by UK legislation.

Henrik Nilsson wrote:
[...] (Unless everyone uses "cpphs", then, which ultimately would seem like a good idea.)
If you mean "everyone happy with a LGPL", then I would agree. But GHC and Hugs use a BSD-style license, so cpphs is not an option for them. After some googling and testing I found MCPP (http://directory.fsf.org/MCPP.html), which is a highly configurable preprocessor with a small footprint and a BSD license. Shipping this as an internal tool with GHC 6.4 and the next Hugs release (especially for use with the hugs-package tool) should be possible, I see what I can do... Cheers, S. P.S.: Ross, any schedule for a Hugs release yet?

Hi, Sven Panne wrote:
Henrik Nilsson wrote:
[...] (Unless everyone uses "cpphs", then, which ultimately would seem like a good idea.)
If you mean "everyone happy with a LGPL", then I would agree. But GHC and Hugs use a BSD-style license, so cpphs is not an option for them. After some googling and testing I found MCPP (http://directory.fsf.org/MCPP.html), which is a highly configurable preprocessor with a small footprint and a BSD license. Shipping this as an internal tool with GHC 6.4 and the next Hugs release (especially for use with the hugs-package tool) should be possible, I see what I can do...
My point was more a plea for a CPP-like preprocessor that understands Haskell's lexical syntax, which, as far as I know, is what "cpphs" is about, than an argument for any particular preprocessor. I had a look at the MCPP documentation, and while configurable, it still seems totally C-centric. So I'd guess you are talking about creating a modified version of MCPP tuned to Haskell? Could an alternative possibly be to resolve the licence issue instead? /Henrik -- Henrik Nilsson School of Computer Science and Information Technology The University of Nottingham nhn@cs.nott.ac.uk

At 17:26 20/10/04 +0200, Sven Panne wrote:
Henrik Nilsson wrote:
[...] (Unless everyone uses "cpphs", then, which ultimately would seem like a good idea.)
If you mean "everyone happy with a LGPL", then I would agree. But GHC and Hugs use a BSD-style license, so cpphs is not an option for them.
I'm not a lawyer, but I don't agree with this assessment. #g ------------ Graham Klyne For email: http://www.ninebynine.org/#Contact

Graham Klyne
At 17:26 20/10/04 +0200, Sven Panne wrote:
Henrik Nilsson wrote:
[...] (Unless everyone uses "cpphs", then, which ultimately would seem like a good idea.)
If you mean "everyone happy with a LGPL", then I would agree. But GHC and Hugs use a BSD-style license, so cpphs is not an option for them.
I'm not a lawyer, but I don't agree with this assessment.
I'm not a lawyer either, but I don't think there's any problem distributing an LGPL-licensed program with a BSD-Style-licenced program. For one thing, the LGPL states:
In addition, mere aggregation of another work not based on the Library with the Library (or with a work based on the Library) on a volume of a storage or distribution medium does not bring the other work under the scope of this License.
further:
5. A program that contains no derivative of any portion of the Library, but is designed to work with the Library by being compiled or linked with it, is called a "work that uses the Library". Such a work, in isolation, is not a derivative work of the Library, and therefore falls outside the scope of this License.
But who knows... cpphs isn't a library and a lot of the terms of the license are stated explicitly in terms of libraries. I always think of the LGPL that, "You can USE it with whatever kind of license you want, but if you alter it and redistribute it, you must do so under the terms of the LGPL or the GPL."
peace, isaac

Isaac Jones writes:
I don't think there's any problem distributing an LGPL-licensed program with a BSD-Style-licenced program.
http://www.gnu.org/licenses/gpl-faq.html#WhatIsCompatible Hope this helps. :-) Peter

Isaac Jones
But who knows... cpphs isn't a library and a lot of the terms of the license are stated explicitly in terms of libraries.
In fact, cpphs is explicitly designed as *both* a library and a tool. For instance, hmake now uses the cpphs library interface internally to check for conditional imports. But I agree that most people will probably just use it as a stand-alone tool rather than a library. The 'main' module is GPL, but everything else is LGPL, to give a little more flexibility. Regards, Malcolm

On Wed, Oct 20, 2004 at 05:26:23PM +0200, Sven Panne wrote:
If you mean "everyone happy with a LGPL", then I would agree. But GHC and Hugs use a BSD-style license, so cpphs is not an option for them. After some googling and testing I found MCPP (http://directory.fsf.org/MCPP.html), which is a highly configurable preprocessor with a small footprint and a BSD license. Shipping this as an internal tool with GHC 6.4 and the next Hugs release (especially for use with the hugs-package tool) should be possible, I see what I can do...
Do we really want to get into the packaging business? I'm not sure about Ketil's setup, but I don't think that gcc -traditional has broken Haskell yet (except for \ at end of line). They may do it tomorrow, of course, but then some of us already have mcpp, and others can get it.
P.S.: Ross, any schedule for a Hugs release yet?
Ah. As usual, we could do a Unix release tomorrow, but Windows needs a bit of work, and who's going to do that? I also think the next Hugs release should support HGL/SOE on Windows (currently X11 only), but there's something wrong with the graphics part of the new Win32 package and I'm not in a position to fix it. (Fixing that would make Win32 and HGL available to GHC too, and the last of hslibs deprecable.) After that, we'll have to start asking Sigbjorn nicely.

On 20.10 17:26, Sven Panne wrote:
If you mean "everyone happy with a LGPL", then I would agree. But GHC and Hugs use a BSD-style license, so cpphs is not an option for them. After some googling and testing I found MCPP (http://directory.fsf.org/MCPP.html), which is a highly configurable preprocessor with a small footprint and a BSD license. Shipping this as an internal tool with GHC 6.4 and the next Hugs release (especially for use with the hugs-package tool) should be possible, I see what I can do...
Does that implement traditional in addition to ansi semantics? The choice whether haskell sources needing cpp should expect traditional (as current) or an ansi preprocessor is sometimes important as many things are done differently in them. For example token pasting and stringification happen in different incompatible ways... - Einar Karttunen

On Wed, Oct 20, 2004 at 10:52:43AM +0100, Henrik Nilsson wrote:
Hi all,
Malcolm Wallace wrote:
Ketil Malde writes:
I've recently run into a problem using the preprocessor (ghc -cpp). It seems it barfs on 's (apostrophe). Annoying, since naming variables something-prime is a fairly common idiom.
Is this something that has a workaround, or could be fixed?
The main workaround if using traditional cpp is to avoid apostrophes. :-(
And this, of course, is one reason why it is good if it is easy to run CPP only on those files where it is necessary. (Unless everyone uses "cpphs", then, which ultimately would seem like a good idea.)
If we are going to create a Haskell-specific preprocessor, then we can do a lot better than this.
===
Macros and Preprocessing in Haskell. Keith Wansbrough (1999). Unpublished.
Abstract: Existing large Haskell systems make extensive use of the C preprocessor, CPP. Such use is problematic, as CPP is not designed for use with Haskell code. Certain features of the Haskell report also indicate a possible deficiency in the Haskell language. This paper investigates these deficiencies, and proposes two extensions to Haskell: the inclusion of distfix operators, and the incorporation of a Haskell preprocessor, HSPP, into the Haskell standard. Related issues are discussed, including the provision of a general macro facility for Haskell.
http://www.cl.cam.ac.uk/~kw217/research/misc/hspp-hw99.ps.gz
===
The main advantage of cpp was that it was already available on most systems and people knew of it. If we are going to create our own preprocessor, then there is no need to follow cpp's syntax. The main use of a preprocessor is for cross-architecture compatibility; if we don't standardize on one syntax, then that goal cannot be met.
I would LOVE it if there were a 'standardized' preprocessor for Haskell that worked across language implementations. It would make life so much easier.
A Haskell-specific feature I thought would be really cool would be the ability to say available(Control.Monad.State), which would evaluate to true if the Control.Monad.State library were available.
John
-- John Meacham - ⑆repetae.net⑆john⑈

Hi there, Just a few thoughts on preprocessing issues. First I should say that I have not really kept up with the Cabal development, so apologies if I say something totally obvious or something that just wouldn't work with Cabal. Simon Marlow wrote:
Malcolm Wallace wrote:
Isaac Jones wrote:
Since Cabal is pretty new, this won't break any existing Cabal packages, and when converting non-Cabal packages to Cabal, there is some work to do anyway, so why not just adopt this as one extra rule to follow? This is just a suggestion - I'm in two minds whether it is a good idea myself, but it is at least worth considering the possibility.
And I suppose the literate version would be .lcpphs? (unlit first, then cpp, then Haskell). It would be more consistent and arguably correct, but I'm not sure that we should do it.
While arguably correct, such suffixes look pretty awkward to me. In the Yampa build system we simply adopted the suffix ".cpp" to indicate that C preprocessing was necessary. That would give ".hs.cpp" and ".lhs.cpp" respectively. One could argue about the correctness of that, but it is at least simple, compositional, and plays well with other suffixes. The literate convention is also specified in the Haskell 98 specification, whereas CPP and other preprocessing is not, as far as I can remember. From that perspective, the ".cpp" convention is not totally unreasonable either.
Either way, personally I mostly see advantages of adopting suffixes to indicate the need for preprocessing. For example:
* It is clear to anyone who's looking at the sources which files need to be preprocessed. This is particularly important for CPP processing since CPP does not really understand Haskell, and there thus are traps for the unwary.
* Suffixes make it very easy to preprocess only selected files, which again is a particularly good idea when CPP is involved.
Of course, there are other ways of doing preprocessing selectively. Maybe Cabal has such mechanisms, making this (mostly) a non-issue? (Indeed, the Yampa build system did provide an alternative way as well). For example, it is sometimes necessary to pass specific flags to the compiler for specific source files only, and if Cabal already supports that, then I guess passing "-E" selectively would just be a special case.
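Editorial sketch: the ".cpp" suffix convention described above amounts to a simple rule, namely that a file whose name ends in ".cpp" is run through cpp to produce the same name without that suffix, and every other file is used as it is. The code below illustrates that rule only. It is not taken from the Yampa build system, and the names `Plan`, `RunCpp`, `UseAsIs`, `stripSuffix` and `planFor` are hypothetical.

```haskell
import Data.List (stripPrefix)

-- What to do with one source file under the ".cpp" suffix convention.
data Plan
  = RunCpp FilePath FilePath  -- input file, output file
  | UseAsIs FilePath          -- no preprocessing needed
  deriving (Eq, Show)

-- Remove a suffix from a string, if the string ends with it.
stripSuffix :: String -> String -> Maybe String
stripSuffix suffix s = fmap reverse (stripPrefix (reverse suffix) (reverse s))

-- Decide by file name alone whether cpp should be run.
planFor :: FilePath -> Plan
planFor path = case stripSuffix ".cpp" path of
  Just target | not (null target) -> RunCpp path target
  _                               -> UseAsIs path
```

For example, `planFor "Foo.hs.cpp"` yields `RunCpp "Foo.hs.cpp" "Foo.hs"`, while `planFor "Bar.lhs"` yields `UseAsIs "Bar.lhs"`, so only files that opt in by name are ever passed to cpp.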
Another solution is to adopt a new extension for plain Haskell, say .phs. The conversion from .hs to .phs is either via CPP or just 'cat', depending on some setting somewhere. Also, I recommend that we use the compiler itself for preprocessing:
ghc -E foo.hs -o foo.phs because only the compiler knows what the values for the preprocessor symbols __HASKELL__, __GLASGOW_HASKELL__, i386_TARGET_ARCH etc. should be. Otherwise we'll have to run the compiler during ./setup configure to find out the values of these symbols (isn't that what hmake does? What about when a new compiler comes along?).
Yes, that's probably true. Malcolm Wallace wrote:
You are right that the compiler is best placed to define pp symbols, so this is all very well, but neither nhc98 nor Hugs currently have the -E option to stop immediately after pp. And come to think of it, the only real reason to have cpp done separately at all is because Hugs does not have a preprocessor call builtin, like ghc and nhc98 do. So maybe the best solution is to ship Hugs with -F"cpphs.hugs" enabled by default? Then no separate extension would be required, and Cabal could just defer all cpp-ing to the compiler.
In the Yampa build system we took the approach that installing a library for use by Hugs meant running all the preprocessing at installation time and thus installing preprocessed sources. I think that was the right approach. It simplifies things for the end user, in particular when several preprocessors are involved. E.g. they don't need to pass the right flags to Hugs and they don't need to worry about having the preprocessors in their paths etc. (The installation of a library could be system-wide, e.g. the person doing the installation might not be the same as the one actually using it later.) Additionally, there is a performance benefit, which potentially could be significant depending on which preprocessors are involved. Similar arguments would apply if one for some reason wanted to install libraries for GHCi in source form.
Another thought occurs to me. Does anyone use cpp markings in conjunction with any other preprocessors? For instance, cpp + Happy, cpp + DRiFT? What ordering applies there? I'm inclined to think that it would nearly always be cpp first, other preprocessors second, but perhaps not? After all, the cpp markings would probably still be conditioned on the end compiler, not on the intermediate pp?
If one adopts a convention that indicates the preprocessing to be done by a simple suffix, then I think that would allow the programmer to control the ordering if necessary, avoiding building in speculative assumptions in Cabal?
Speaking of suffixes and preprocessing, I've encountered another problem in the context of Yampa that might be worth raising. Originally (well, still, actually), we used Ross Paterson's arrow pre-processor for the arrow syntactic sugar. We then adopted the convention that the suffix ".as" was for "arrowized Haskell source", and ".las" for "literate arrowized Haskell source". I don't think this choice of suffixes was particularly brilliant, but this does not really matter.
However, we now have the situation that GHC supports the arrow syntax directly. This raises the question of how to arrange things if one wants to distribute arrowized code that also should work for other compilers/interpreters, since preprocessing still would be necessary for those other systems. In particular, which suffix should one use for the arrowized files in question?
While I guess one could stick to ".hs" and then resort to various build-system trickery to get the preprocessing done when necessary, it seems to me that a more straightforward solution might be to agree on a suffix that indicates that the Arrow syntax is used (say ".arr"). Systems that do support the arrow syntax could then accept e.g. ".hs.arr" as a synonym for ".hs", or, if necessary, could look at the extension for enabling the syntactic extension.
This solution is not without its problems, though, and I'm not sure what the best approach would be. But the issue is similar to some systems having built-in CPP support and others not, and it might make sense to adopt a similar solution. Of course, if arrow support is in the works for the other compilers, this last problem might not be so much of an issue.
Best regards, /Henrik -- Henrik Nilsson School of Computer Science and Information Technology The University of Nottingham nhn@cs.nott.ac.uk
participants (11): Einar Karttunen, Graham Klyne, Henrik Nilsson, Isaac Jones, John Meacham, Ketil Malde, Malcolm Wallace, Peter Simons, Ross Paterson, Simon Marlow, Sven Panne