Re: [arch-haskell] On datafiles in libraries

7 Jan 2011

Peter Simons wrote:
...
...
>> I think that we should remove the data files completely from the
>> packages and only host them at a well-known URL (on kiwilight.com or
>> haskell.org).
>
> The notion that those files reside on a remote server concerns me a
> little, because it means that reproducibility goes, basically, straight
> out of the window. If you run cabal2arch to generate a PKGBUILD, and
> then run the exact same command just a few seconds later, it might
> generate a different PKGBUILD just because some invisible file has
> changed on a remote server at the other side of the earth. That property
> feels like it's bound to create surprises for people. Also, this change
> would reduce the amount of information expressed by cabal2arch's version
> number even further into the direction of "zero" than it already is.

I agree with these concerns. The files are essentially configuration files for
cabal2arch and they should be included in the cabal2arch package. They should
also be available online for direct download.

I think the best way to do this is to host them separately online, as already
suggested, but to version them. The versioned files could then be listed in the
"source" array of the cabal2arch PKGBUILD.

This would ensure reproducibility in any given version of cabal2arch while
still making those files available for general use.

The versions of both files could be made to coincide with cabal2arch versions
to keep the cabal2arch PKGBUILD simple, e.g.:

source=(..., "http://example.com/data/ghc-provides.${pkgver}.txt",...)

(I'm assuming that the server would have compression enabled; otherwise I would
use an archive.)
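
To make the idea concrete, here is a minimal sketch of how the relevant part of
the cabal2arch PKGBUILD might look. The version number and the _dataurl
variable are assumptions for illustration only, and example.com stands in for
whatever host is eventually chosen.

```shell
# Sketch of a cabal2arch PKGBUILD fragment (all values are illustrative).
pkgname=cabal2arch
pkgver=1.0                           # hypothetical version number
_dataurl="http://example.com/data"   # placeholder host, not a real location

# The data file is fetched at build time and pinned to this cabal2arch
# version. The second data file would be listed in the same way.
source=("${_dataurl}/ghc-provides.${pkgver}.txt")

# Checksums generated with `makepkg -g` would additionally pin the exact
# contents of the downloaded file.
```

Because the URL is derived from pkgver, bumping the cabal2arch version is the
only change needed to pick up a new revision of the data file.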

Alternatively, if we expect to use those files in other packages, then perhaps
they should receive their own package (similar to the pacman-mirrorlist
package). cabal2arch could still specify a version of that package as a
dependency, if needed.
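
As a rough sketch of that alternative, a standalone data package could be
built from a PKGBUILD along these lines. The package name cabal2arch-data, the
version number, and the install path are all hypothetical, and example.com is
again a placeholder host.

```shell
# Hypothetical PKGBUILD for a standalone data package (names are illustrative).
pkgname=cabal2arch-data   # hypothetical package name
pkgver=1.0                # hypothetical version number
pkgrel=1
arch=('any')
source=("http://example.com/data/ghc-provides.${pkgver}.txt")

package() {
  # Install the data file under /usr/share so that other packages can read it.
  install -Dm644 "${srcdir}/ghc-provides.${pkgver}.txt" \
    "${pkgdir}/usr/share/${pkgname}/ghc-provides.txt"
}

# In the cabal2arch PKGBUILD, a pinned dependency could then be written as:
#   depends=('cabal2arch-data=1.0')
```

Pinning the dependency to an exact version keeps the reproducibility property,
while a ">=" constraint would trade that for easier updates.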

Regards,
Xyne