[Haskell-cafe] Standard package file format

(Resending from right address)
We're talking about *three* options:
1. syntax for pure Haskell values, which I'll call HSON (Haskell
jSON). That's just an alternative to YAML/TOML/... That would need
extensions to allow omitting optional fields entirely.
2. a pure Haskell embedded domain-specific language (EDSL) that simply
generates cabal description records (GenericPackageDescription
values). That would allow abstraction over some patterns but not much
more. But that alone is already an argument for EDSLs—the one Harendra
already presented.
3. a Haskell embedded domain-specific language (EDSL) designed for an
extensible build tool, like Clojure's (apparently), SBT for Scala or
many others. That would potentially be a rabbit hole leading to a
rather *different* tool—with a different package format to boot. That
can't work as long as all libraries have to be built using the same
tool. But stack and cabal are really about how to manage package
databases/GHC/external environments, while extensible build tools are
about (a more powerful form of) writing custom setup scripts. I
suspect some extensions might be easier if more of the actual building
was done by the setup script, but I'm not sure.
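To make the difference between options 1 and 2 concrete, here is a minimal sketch. The Package and Component records are hypothetical, simplified stand-ins invented for illustration; they are not Cabal's GenericPackageDescription. A defaults record is one possible way to let a description omit optional fields, and an ordinary function is the kind of abstraction option 2 would add.

```haskell
module Main where

-- Hypothetical, simplified stand-ins for a package description.
-- These are NOT Cabal's GenericPackageDescription types.
data Component = Component
  { compName     :: String
  , compDeps     :: [String]
  , compGhcFlags :: [String]
  } deriving (Show, Eq)

data Package = Package
  { pkgName       :: String
  , pkgVersion    :: [Int]
  , pkgSynopsis   :: Maybe String   -- an optional field
  , pkgComponents :: [Component]
  } deriving (Show, Eq)

-- Defaults let a description leave optional fields out via record update.
defaultPackage :: Package
defaultPackage = Package
  { pkgName = "", pkgVersion = [0], pkgSynopsis = Nothing, pkgComponents = [] }

-- Option 1 flavour: a plain value, with the repetition written out.
plain :: Package
plain = defaultPackage
  { pkgName = "demo"
  , pkgVersion = [0, 1]
  , pkgComponents =
      [ Component "demo-lib" ["base"] ["-Wall"]
      , Component "demo-exe" ["base", "demo-lib"] ["-Wall"]
      ]
  }

-- Option 2 flavour: a function abstracts the repeated pattern.
component :: String -> [String] -> Component
component name extraDeps = Component name ("base" : extraDeps) ["-Wall"]

viaEdsl :: Package
viaEdsl = defaultPackage
  { pkgName = "demo"
  , pkgVersion = [0, 1]
  , pkgComponents = [component "demo-lib" [], component "demo-exe" ["demo-lib"]]
  }

main :: IO ()
main = print (plain == viaEdsl)
```

Both values describe the same package; only the second can share the common settings through a helper.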
On 16 September 2016 at 10:57, yogsototh wrote:
I guess the overriding question I have here is: what is the PROBLEM being solved?
Let me share my experience with Clojure and lein. They use a Clojure hash-map for their configuration. So yes, arbitrary code can be executed, and I believe this is a _very good thing_.
Why? Because it makes it very easy to add sub-configuration that can be used by third-party plugins. For example:
- a plugin that helps with using environment variables (lein-environ), which is really helpful for application development (not so much for library development)
- a plugin that uses S3 for our private dependencies (not supported by default by lein)
For deployment: we were able to add a request to our API server that provides not only the written version but also the git commit hash, so we could be certain of the version of the server. Too many times there were sysadmin deployment errors. And that could only be achieved because we were able to run arbitrary commands in the project description file.
I am certainly forgetting many other advantages of having a package description format which is simply a data structure in the host language. But this is by far my preference.
- cabal is OK, but very imperfect: I generally need a lot of copy/paste, and I need to change it very often while writing an application with many dependencies
- JSON/YAML/TOML are simply not powerful enough to match all the semantics we might need to configure a project. For example, we might want to have a Set instead of a List for some properties. Or, I don't know, maybe ternary tree structures.
The point is: we pay a price by adding a step between the semantics and the syntax. If our configuration format were Haskell, we could express the semantics more directly.
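As a small illustration of the "Set instead of List" point, here is a sketch of a description whose dependency field is a real set. The Project record and its field names are assumptions made up for this example; Data.Set comes from the containers package.

```haskell
module Main where

import qualified Data.Set as Set

-- Hypothetical project description; the field names are illustrative only.
data Project = Project
  { projName :: String
  , projDeps :: Set.Set String  -- a set: duplicates and ordering do not matter
  } deriving (Show, Eq)

-- Merging shared and project-specific dependencies is plain set union.
withShared :: Set.Set String -> Project -> Project
withShared shared p = p { projDeps = Set.union shared (projDeps p) }

sharedDeps :: Set.Set String
sharedDeps = Set.fromList ["base", "text"]

server :: Project
server = withShared sharedDeps
  (Project "server" (Set.fromList ["text", "warp", "base"]))

main :: IO ()
main = mapM_ putStrLn (Set.toAscList (projDeps server))
```

With a list-only format the duplicate "base" and "text" entries would have to be removed by the tool or by hand; here the type already says what is meant.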
_______________________________________________ Haskell-community mailing list Haskell-community@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-community
-- Paolo G. Giarrusso - Ph.D. Student, Tübingen University http://ps.informatik.uni-tuebingen.de/team/giarrusso/

This may be one of the 3 points on Paolo's list. In case it is not, here is another option (4?):
- define .hs data records for project config and package configs
- write export tools to export config records to existing formats: cabal, stack yaml, ...
This way, there is no need to revise the current workflow or modify tools. However, we define a common standard content structure, so most users do not need to worry about .cabal or .yaml syntax.
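A rough sketch of what such records and export tools could look like follows. PackageConfig, ProjectConfig, toCabal and toStackYaml are hypothetical names, and the exporters cover only a tiny subset of each format; treating the package name as its directory in the stack output is also just an assumption of this example.

```haskell
module Main where

import Data.List (intercalate)

-- Hypothetical common structure; not an existing library type.
data PackageConfig = PackageConfig
  { name    :: String
  , version :: String
  , deps    :: [String]
  }

data ProjectConfig = ProjectConfig
  { resolver :: String
  , packages :: [PackageConfig]
  }

-- Export one package to a small subset of .cabal syntax.
toCabal :: PackageConfig -> String
toCabal p = unlines
  [ "name:          " ++ name p
  , "version:       " ++ version p
  , "build-type:    Simple"
  , "cabal-version: >=1.10"
  , ""
  , "library"
  , "  build-depends:    " ++ intercalate ", " (deps p)
  , "  default-language: Haskell2010"
  ]

-- Export the project to a small subset of stack.yaml syntax.
-- Assumption: each package lives in a directory named after it.
toStackYaml :: ProjectConfig -> String
toStackYaml pr = unlines $
  [ "resolver: " ++ resolver pr, "packages:" ]
  ++ [ "- " ++ name p | p <- packages pr ]

demo :: PackageConfig
demo = PackageConfig "demo" "0.1.0.0" ["base", "text"]

main :: IO ()
main = do
  putStr (toCabal demo)
  putStr (toStackYaml (ProjectConfig "lts-7.0" [demo]))
```

The existing tools would keep reading the generated files, so nothing in the current workflow has to change.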

Options 2 and 3 both require running Haskell code at build time. This presents problems in a couple of cases:
Cross compilation: There are already a couple of cases where we need to run Haskell code at build time: Template Haskell, and custom Setup.hs. Neither of these is supported in cross-compilation. (The former is a ghc issue, while the latter is a Cabal issue.) So I'm assuming that the new Haskell-based EDSL wouldn't work in cross-compilation, either. The difference is that Template Haskell and custom Setup.hs are only used by some packages. But if all packages had to use the new EDSL, then cross-compilation would essentially become impossible.
Platforms without ghci: Even when not cross-compiling, some platforms don't support ghci. (These are usually the less popular platforms. Not too long ago, they even included ARM.) ghci support is necessary for Template Haskell, and I assume the EDSL would work the same way. So less popular platforms would be left out in the cold.
--Patrick

On 16 September 2016 at 12:13, Patrick Pelletier wrote:
Options 2 and 3 both require running Haskell code at build time.
But if all packages had to use the new EDSL, then cross-compilation would essentially become impossible.
"All packages migrate to new format" doesn't really seem a plausible option, as I already hinted in the text you quote. There are multiple JVM build tools because they're interoperable (like cabal-install and Stack): each library picks its own build tool, but they can still be linked together. Hpack generates cabal files; stack reuses cabal or hpack files.
In principle, option 2 just needs a non-cross-compiled program to produce a package description, say by producing a cabal file. You just need to run it on the build host, either interpreted (runghc/ghci) or by compiling and running a binary. Option 3 can be trickier depending on details, but as long as you account for cross-compilation in the design it should be doable. For Template Haskell the problem is deeper (see http://blog.ezyang.com/2016/07/what-template-haskell-gets-wrong-and-racket-gets-right/), so let's *not* use it here.
-- Paolo G. Giarrusso - Ph.D. Student, Tübingen University http://ps.informatik.uni-tuebingen.de/team/giarrusso/
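One possible reading of "account for cross-compilation in the design" is that the generator never inspects the machine it runs on and instead receives the target as plain data. The sketch below assumes that reading; TargetOS, parseTarget, depsFor and render are hypothetical names, the output is only a fragment of a description, and no Template Haskell is involved.

```haskell
module Main where

import Data.List (intercalate)
import System.Environment (getArgs)

-- Hypothetical target description, passed in explicitly instead of being
-- detected from the machine that runs the generator.
data TargetOS = Linux | Windows deriving (Show, Eq)

parseTarget :: String -> Maybe TargetOS
parseTarget "linux"   = Just Linux
parseTarget "windows" = Just Windows
parseTarget _         = Nothing

-- Pure function from the target to the dependencies for that target.
depsFor :: TargetOS -> [String]
depsFor Linux   = ["base", "unix"]
depsFor Windows = ["base", "Win32"]

-- Render a fragment of a package description for the given target.
render :: TargetOS -> String
render t = unlines
  [ "name: demo"
  , "build-depends: " ++ intercalate ", " (depsFor t)
  ]

-- The generator itself is an ordinary program built for the build host.
main :: IO ()
main = do
  args <- getArgs
  case args of
    [t] | Just target <- parseTarget t -> putStr (render target)
    _ -> putStrLn "usage: gen (linux|windows)"
```

Saved as, say, gen.hs, it could be run as `runghc gen.hs windows` on a Linux build host, since nothing in it depends on where it executes.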

This seems to have gone in a different direction. The original point was about the package specification format, not about expressing a full-fledged build system. That is an entirely different ballgame. The main point of the thread was whether it makes sense to use a single specification format for both stack and cabal-install (YAML vs .cabal, and then TOML came into the picture). Haskell does not seem to be a choice for a package specification format unless we have a very different goal in mind.
-harendra

On 16 September 2016 at 13:05, Harendra Kumar wrote:
This seems to have gone in a different direction. The original point was about the package specification format, not about expressing a full-fledged build system. That is an entirely different ballgame. The main point of the thread was whether it makes sense to use a single specification format for both stack and cabal-install (YAML vs .cabal, and then TOML came into the picture). Haskell does not seem to be a choice for a package specification format unless we have a very different goal in mind.
I agree "full-fledged build system" is not a possible immediate goal. But an EDSL for expressing cabal projects (as they are today) would still be in scope of your proposal, and I thought you liked the idea (see quote below). Using the earlier options: option 3 is not in scope of this thread, but option 2 is, with the only danger being that the design space is so big as to present a challenge.
Quoting from Harendra Kumar's earlier mail:
Why not adopt (a subset of) .hs AST file format to structure both project and package files?
Aha, that's my preferred choice. If there is a way to restrict features and we can allow just a subset, we can have a nice configuration language which is a real language. In fact, I have been toying around with this. If we have to express not just a package specification but a sophisticated build configuration, we need a real language. Expressing conditionals, reuse, etc. becomes a compromise in a purely declarative language.
For example, make has so many built-in functions that it has become a full-fledged language by itself. Google's Bazel uses a Python-like language for its build configuration. Haskell would make a much better choice for such use cases. Pure declarative is a pain for such use cases.
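To illustrate the point about conditionals and reuse, here is a minimal sketch of how both could be ordinary Haskell values and functions. The Cond and Dep types and the helpers onlyOn and exceptOn are hypothetical, invented for this example, and do not correspond to any existing tool.

```haskell
module Main where

-- Hypothetical mini-EDSL for conditional dependencies.
data OS = Linux | Windows | OSX deriving (Show, Eq)

data Cond = Always | OnOS OS | Not Cond deriving (Show, Eq)

-- A dependency guarded by a condition.
data Dep = Dep { depCond :: Cond, depName :: String } deriving (Show, Eq)

-- Decide whether a condition holds on a given operating system.
holds :: OS -> Cond -> Bool
holds _  Always   = True
holds os (OnOS o) = os == o
holds os (Not c)  = not (holds os c)

-- Reuse: a shared block of dependencies is an ordinary value.
common :: [Dep]
common = [Dep Always "base", Dep Always "bytestring"]

-- Conditionals: helper functions instead of special syntax.
onlyOn :: OS -> [String] -> [Dep]
onlyOn os = map (Dep (OnOS os))

exceptOn :: OS -> [String] -> [Dep]
exceptOn os = map (Dep (Not (OnOS os)))

libraryDeps :: [Dep]
libraryDeps = common ++ onlyOn Windows ["Win32"] ++ exceptOn Windows ["unix"]

-- Select the dependencies that apply on a given operating system.
resolve :: OS -> [Dep] -> [String]
resolve os ds = [depName d | d <- ds, holds os (depCond d)]

main :: IO ()
main = do
  print (resolve Windows libraryDeps)
  print (resolve Linux libraryDeps)
```

Because the conditions stay as data, a tool could still inspect or export them rather than only evaluate them.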
-- Paolo G. Giarrusso - Ph.D. Student, Tübingen University http://ps.informatik.uni-tuebingen.de/team/giarrusso/

On 16 September 2016 at 16:51, Paolo Giarrusso wrote:
I agree "full-fledged build system" is not a possible immediate goal. But an EDSL for expressing cabal projects (as they are today) would still be in scope of your proposal, and I thought you liked the idea (see quote below). Using the earlier options: option 3 is not in scope of this thread, but option 2 is, with the only danger being that the design space is so big as to present a challenge.
Yeah, I like the idea of using Haskell for configs, but perhaps in a different problem space, e.g. in a build spec. See the quote from my earlier mail below; sorry for the confusion :-) Yes, option 2 might work for package specifications, but it sounds pretty hairy to explore for this use case alone, unless we have other motivations.
Quoting from Harendra Kumar's earlier mail:
If we have to express not just a package specification but a sophisticated build configuration, we need a real language. Expressing conditionals, reuse etc becomes a compromise in a purely declarative language.
-harendra
participants (4)
- Harendra Kumar
- Imants Cekusins
- Paolo Giarrusso
- Patrick Pelletier