package description fields

Proposals:

1) New field extra-tmp-files, a list of extra files to be removed by setup clean, beyond those that can be deduced.

2) Rename other-files as extra-source-files for consistency and clarity.

3) New field data-files, a list of files to be copied to a place where an executable can find them (e.g. template-hsc.h for hsc2hs):
   Hugs: the directory containing the Main module
   GHC/Windows: the directory containing the executable
   GHC/Unix: /usr/local/share/<exename>
plus a new function in System.Directory to return the name of this directory. That would address Dimitry's requirements in http://www.haskell.org/pipermail/haskell-cafe/2005-July/010886.html

Dimitry has a more elaborate proposal: http://www.haskell.org/pipermail/libraries/2005-July/004146.html

On 8/3/05, Ross Paterson wrote:
Proposals:
1) New field extra-tmp-files, a list of extra files to be removed by setup clean, beyond those that can be deduced.
What is the use case for this? Temporary files are either created by tools Cabal already "knows" about (GHC, hsc2hs, etc.), or they are created by hooks. It seems like Cabal should know enough to clean temporary files created by the tools it invokes. If the user writes a hook that generates temporary files, then they should also write a pre/post-clean hook to delete them.

Alternative proposal: Modify Cabal so that all temporary files are generated in the scratch directory, and "setup clean" merely deletes the scratch directory. Hooks that generate their temporary files into the scratch directory will have them deleted automatically by "setup clean," and hooks that generate temporary files anywhere else (and in the source tree in particular) are discouraged and not directly supported (though the package author could still use a pre/post-clean hook to delete them manually).
3) New field data-files, a list of files to be copied to a place where an executable can find them (e.g. template-hsc.h for hsc2hs):
   Hugs: the directory containing the Main module
   GHC/Windows: the directory containing the executable
   GHC/Unix: /usr/local/share/<exename>
plus a new function in System.Directory to return the name of this directory. That would address Dimitry's requirements in
How about allowing directories too, which would be copied recursively? What would the paths in data-files be relative to? How would one reference a file that is generated by a hook into the scratch directory? For example, imagine that wxHaskell was Cabalized, and that its source distribution was to include the .hs files generated from the wxdirect tool.

BTW, what exactly is a "source distribution?" I think a very reasonable definition is "A source distribution can be built with nothing more than a Haskell compiler/interpreter that supports the required language features and that has the correct packages and libraries installed." In particular, a source distribution would never require any preprocessors.

Regards,
Brian

On Wed, Aug 03, 2005 at 07:27:01PM -0500, Brian Smith wrote:
On 8/3/05, Ross Paterson wrote:
1) New field extra-tmp-files, a list of extra files to be removed by setup clean, beyond those that can be deduced.
What is the use case for this? Temporary files are either created by tools Cabal already "knows" about (GHC, hsc2hs, etc.), or they are created by hooks. It seems like Cabal should know enough to clean temporary files created by the tools it invokes. If the user writes a hook that generates temporary files, then they should also write a pre/post-clean hook to delete them.
Indeed, these would be files created by hooks, e.g. the postConf hook in defaultUserHooks. The purpose of the field would be to avoid the need for another hook. For that particular hook (e.g. using an autoconf-based configure), it's not feasible to put the temporary files inside dist.
3) New field data-files, a list of files to be copied to a place where an executable can find them (e.g. template-hsc.h for hsc2hs):
   Hugs: the directory containing the Main module
   GHC/Windows: the directory containing the executable
   GHC/Unix: /usr/local/share/<exename>
plus a new function in System.Directory to return the name of this directory. That would address Dimitry's requirements in
How about allowing directories too, which would be copied recursively?
No objection to that.
What would the paths in data-files be relative to?
All file names mentioned in a .cabal file are relative to the root of the package source tree, i.e. the directory containing the .cabal file. But I had in mind that if you had

    data-files: include/template_hsc.h

that template_hsc.h would be copied to a file of that name relative to the package data directory (not under include).
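A rough sketch of that mapping, assuming the entry's directory part is simply dropped on installation. The helper name installedDataPath and the data directory argument are hypothetical, and System.FilePath.Posix is used only so that the separators are predictable:

```haskell
import System.FilePath.Posix (takeFileName, (</>))

-- | Hypothetical helper: where a data-files entry would end up.
-- 'dataDir' stands for the package data directory, whose actual
-- location the proposal leaves to the implementation and platform.
-- 'entry' is a path relative to the package root, as in the .cabal file.
installedDataPath :: FilePath -> FilePath -> FilePath
installedDataPath dataDir entry = dataDir </> takeFileName entry
```

Under this reading, the entry include/template_hsc.h lands directly in the data directory rather than in an include subdirectory of it.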
BTW, what exactly is a "source distribution?" I think a very reasonable definition is "A source distribution can be built with nothing more than a Haskell compiler/interpreter that supports the required language features and that has the correct packages and libraries installed" In particular, a source distribution would never require any preprocessors.
That wouldn't do -- a source distribution must be system- and implementation-independent. That means you can't include the output of hsc2hs or cpphs, for example. Preprocessors like happy and alex are a different case: they can produce implementation-independent output, which would be useful on systems without these programs, but they can also produce versions that take advantage of GHC features. In such cases, if a package contained the implementation-independent output as well as the preprocessor input, it might be useful for Cabal to use the preprocessor if available and the packaged output if it wasn't.

Ross Paterson
On Wed, Aug 03, 2005 at 07:27:01PM -0500, Brian Smith wrote:
On 8/3/05, Ross Paterson wrote:
1) New field extra-tmp-files, a list of extra files to be removed by setup clean, beyond those that can be deduced.
What is the use case for this? Temporary files are either created by tools Cabal already "knows" about (GHC, hsc2hs, etc.), or they are created by hooks. It seems like Cabal should know enough to clean temporary files created by the tools it invokes. If the user writes a hook that generates temporary files, then they should also write a pre/post-clean hook to delete them.
Indeed, these would be files created by hooks, e.g. the postConf hook in defaultUserHooks. The purpose of the field would be to avoid the need for another hook. For that particular hook (e.g. using an autoconf-based configure), it's not feasible to put the temporary files inside dist.
This field sounds fine to me. I agree with Brian that it may be preferable to make all the tools generate files into dist/tmp (in fact, Cabal itself should probably do that more often), but I know it's not feasible to expect existing tools to change.
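To make the two approaches concrete, here is a minimal sketch of a clean step that does both: it deletes the scratch directory wholesale and then removes whatever is listed in the proposed extra-tmp-files field. The function cleanPackage is hypothetical and is not Cabal's actual clean code. It assumes the current directory is the package root.

```haskell
import Control.Monad (forM_, when)
import System.Directory
  ( doesDirectoryExist, doesFileExist, removeDirectoryRecursive, removeFile )

-- | Hypothetical clean step (not Cabal's implementation).
-- First removes the scratch directory, if it exists, then removes each
-- file named in the proposed extra-tmp-files field. Missing files are
-- skipped silently, so running clean twice is harmless.
cleanPackage :: FilePath -> [FilePath] -> IO ()
cleanPackage scratchDir extraTmpFiles = do
  hasScratch <- doesDirectoryExist scratchDir
  when hasScratch (removeDirectoryRecursive scratchDir)
  forM_ extraTmpFiles $ \f -> do
    present <- doesFileExist f
    when present (removeFile f)
```

For an autoconf-based configure hook, the list might name files such as config.log and config.status, which that tool writes next to the configure script rather than under dist.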
3) New field data-files, a list of files to be copied to a place where an executable can find them (e.g. template-hsc.h for hsc2hs):
   Hugs: the directory containing the Main module
   GHC/Windows: the directory containing the executable
   GHC/Unix: /usr/local/share/<exename>
plus a new function in System.Directory to return the name of this directory. That would address Dimitry's requirements in
How about allowing directories too, which would be copied recursively?
No objection to that.
This sounds good, except that I'm not sure that System.Directory is the right place for this. It could also go in Distribution somewhere, but either way may be fine.

Also, if we do this, we should probably specify the manner in which such a directory should be laid out so that: 1) it doesn't get too cluttered, 2) different packages don't stomp on each other's files, and 3) different versions of different packages can use the same filenames.

One stab at it (where dataFileDir :: FilePath) would be that your data files should be in:

    System.Directory.dataFileDir `joinFilePath` (packageName ++ "-" ++ packageVersion)

and this is where Cabal will put it. Alternately, of course, you'd have:

    dataFileDir :: Distribution.Package.PackageIdentifier -> FilePath
    dataFileDir ident = dataFileDirRoot `joinFilePath` (showPackageId ident)

which I like better, but that strongly couples "dataFileDir" to the Cabal package in that you need to have a PackageIdentifier. How do you get that PackageIdentifier? Well, your program will have to parse your .cabal file. No problem! (if you have one)

(snip)
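A self-contained sketch of the second variant follows. The PackageIdentifier here is a simplified stand-in for the Cabal type, the root directory is an arbitrary assumption, and (</>) from the filepath package takes the place of joinFilePath:

```haskell
import Data.List (intercalate)
import System.FilePath.Posix ((</>))

-- | Simplified stand-in for Distribution.Package.PackageIdentifier;
-- the real Cabal type is not reproduced here.
data PackageIdentifier = PackageIdentifier
  { pkgName    :: String
  , pkgVersion :: [Int]
  }

-- | Render an identifier as name-version, e.g. "foo-1.2".
showPackageId :: PackageIdentifier -> String
showPackageId (PackageIdentifier name version) =
  name ++ "-" ++ intercalate "." (map show version)

-- | Assumed root for illustration only; the thread does not fix one.
dataFileDirRoot :: FilePath
dataFileDirRoot = "/usr/local/share"

-- | One directory per package and version, so that packages (and
-- versions) cannot overwrite each other's data files.
dataFileDir :: PackageIdentifier -> FilePath
dataFileDir ident = dataFileDirRoot </> showPackageId ident
```

Keying the directory on the full identifier covers the three layout goals listed above, at the cost of requiring the program to know its own package name and version at run time.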
BTW, what exactly is a "source distribution?" I think a very reasonable definition is "A source distribution can be built with nothing more than a Haskell compiler/interpreter that supports the required language features and that has the correct packages and libraries installed" In particular, a source distribution would never require any preprocessors.
That wouldn't do -- a source distribution must be system- and implementation-independent. (snip)
I definitely agree with Ross. However, I already have a TODO item for adding a flag to "setup sdist" to handle this in a somewhat more fine-grained manner:

** if there's a flag, --include-preprocessed-sources (or something better) run the preprocessing phase and include both the unpreprocessed and the preprocessed sources in the source tarball?

But really, there are two kinds of preprocessors, as Ross points out: the kind that produce OS-independent code, and the kind that produce OS-dependent code. Perhaps this concept should be added to the PreProcessor type, and we could have two flags to sdist:

--include-standalone-preprocessed-sources
Which would generate the OS-independent sources from tools like Alex and Happy...

--include-all-preprocessed-sources
Which just includes all of the preprocessed sources as above.

A downside to this is in how it interacts with another proposal to add tool dependencies. If a package tool-depends on "alex", and then a source tarball is created with --include-standalone-preprocessed-sources, then it actually no longer tool-depends on alex, so we should regenerate the .cabal file. I guess that's no big deal.

What do folks think of this? Better ideas for the names of the flags?

peace,
isaac

On Mon, Aug 08, 2005 at 06:59:04PM -0700, Isaac Jones wrote:
I already have a TODO item for adding a flag to "setup sdist" to handle this in a somewhat more fine-grained manner:
** if there's a flag, --include-preprocessed-sources (or something better) run the preprocessing phase and include both the unpreprocessed and the preprocessed sources in the source tarball?
But really, there are two kinds of preprocessors, as Ross points out: the kind that produce OS-independent code, and the kind that produce OS-dependent code. Perhaps this concept should be added to the PreProcessor type, and we could have two flags to sdist:
--include-standalone-preprocessed-sources
Which would generate the OS-independent sources from tools like Alex and Happy...
--include-all-preprocessed-sources
Which just includes all of the preprocessed sources as above.
Each additional option complicates the interface. It's far better to make the tool smart enough to choose the right thing, IMO. So instead of --include-standalone-preprocessed-sources, I propose:

* preprocessors are annotated with a flag saying whether they are able to produce system- and implementation-independent output (e.g. happy, alex).
* for such preprocessors, and if the preprocessor is present, sdist includes the preprocessed output for a CompilerFlavor of Unknown (e.g. happy with no options), in addition to the input.
* building a module for which preprocessor input (e.g. Foo.y) is present: if the preprocessor is available, use it with the specific CompilerFlavor (e.g. happy -g). If not, but preprocessed output is available, use that. Otherwise, fail (as now).

This will be awkward with the current scheme of leaving preprocessed output in the source directory, but it was already desirable to change that to keep source directories tidy and simplify setup clean.
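The three rules above could be modelled roughly as follows. All names in this sketch (PreProcessorInfo, sdistIncludesOutput, BuildPlan, planModule) are hypothetical and do not correspond to Cabal's PreProcessor type. Availability is reduced to plain Booleans to keep the decision logic visible.

```haskell
-- | Hypothetical annotation on a preprocessor (not Cabal's PreProcessor).
data PreProcessorInfo = PreProcessorInfo
  { ppName           :: String
  , ppPortableOutput :: Bool  -- ^ can emit system- and implementation-independent output
  }

-- | Rule 2: sdist ships preprocessed output only for preprocessors that
-- can produce portable output and that are present when sdist runs.
sdistIncludesOutput :: PreProcessorInfo -> Bool -> Bool
sdistIncludesOutput pp presentAtSdist = ppPortableOutput pp && presentAtSdist

-- | What to do for a module whose preprocessor input is present.
data BuildPlan
  = RunPreprocessor    -- ^ run it with compiler-specific options
  | UsePackagedOutput  -- ^ fall back to the output shipped in the tarball
  | CannotBuild        -- ^ neither is available: fail, as now
  deriving (Eq, Show)

-- | Rule 3: prefer the preprocessor, then packaged output, else fail.
planModule :: Bool -> Bool -> BuildPlan
planModule preprocessorAvailable packagedOutputPresent
  | preprocessorAvailable = RunPreprocessor
  | packagedOutputPresent = UsePackagedOutput
  | otherwise             = CannotBuild
```

The ordering in planModule is what removes the need for an extra sdist flag: the user never chooses, and the packaged output is only a fallback.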

On 8/3/05, Ross Paterson wrote:
Proposals:
1) New field extra-tmp-files, a list of extra files to be removed by setup clean, beyond those that can be deduced.
I can't see why this field is required.
2) Rename other-files as extra-source-files for consistency and clarity.
OK
3) New field data-files, a list of files to be copied to a place where an executable can find them (e.g. template-hsc.h for hsc2hs):
   Hugs: the directory containing the Main module
   GHC/Windows: the directory containing the executable
   GHC/Unix: /usr/local/share/<exename>
plus a new function in System.Directory to return the name of this directory. That would address Dimitry's requirements in
I would like to propose a more Unix-like directory layout for Windows, i.e.:

   executables: %ProgramFiles%\<pkg-name>\bin
   libraries:   %ProgramFiles%\<pkg-name>\lib
   data files:  %ProgramFiles%\<pkg-name>\data

The advantage is that if you have a lot of libraries and data files then it is easier to find the executable files. Happy, Alex and Haddock already use a layout like the one in Ross's proposal, but I found it very inconvenient because I usually build them together with ghc. The problem is that when I do make install in the fptools directory, ghc is installed in ${prefix}/bin but the above tools are installed in ${prefix}. Usually the ${prefix} directory isn't included in the PATH, and this makes them inaccessible. I can move them to ${prefix}/bin, but then I have to move all the other template files too. It would be nice if we had a more consistent directory layout.

Cheers,
Krasimir
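As an illustration of the proposed Windows layout, the following sketch computes the three directories from a program-files root and a package name. The InstallDirs record and windowsLayout function are hypothetical names for this example. System.FilePath.Windows is imported explicitly so that backslash separators are produced on any platform.

```haskell
import System.FilePath.Windows ((</>))

-- | Hypothetical record of install locations for one package.
data InstallDirs = InstallDirs
  { binDir  :: FilePath
  , libDir  :: FilePath
  , dataDir :: FilePath
  } deriving (Eq, Show)

-- | Build the bin/lib/data layout under <programFiles>\<pkg-name>.
-- 'programFiles' is the already-expanded value of %ProgramFiles%.
windowsLayout :: FilePath -> String -> InstallDirs
windowsLayout programFiles pkg = InstallDirs
  { binDir  = root </> "bin"
  , libDir  = root </> "lib"
  , dataDir = root </> "data"
  }
  where
    root = programFiles </> pkg
```

With this layout only the bin directory needs to be on the PATH, and template files stay in data next to it.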
participants (4)
- Brian Smith
- Isaac Jones
- Krasimir Angelov
- Ross Paterson