
I am starting a new thread for the package file format related discussion.
From a developer's perspective, the major benefit of a standard and widely adopted format is that people can use knowledge acquired elsewhere: they do not have to work through the differently presented and incomplete documentation of each tool. The benefit of a common config specification is that developers can choose tools freely without worrying about learning the same concepts presented in different ways.
Multiple formats flying around also create a psychological impression of complexity in the ecosystem for newcomers. If we have consistency there are better chances of attracting more people to the language ecosystem.
I gather the following from the discussion till now:
* We have cabal, YAML and TOML as potential candidates for a common package format which can additionally incorporate the concept of snapshots/package collections and potentially more extensions useful across build tools.
* cabal has the benefit of incumbency and backward compatibility, it has shortcomings which are being addressed but it is still a format which is very specific to Haskell ecosystem. It is not a standard and not going to become one. We have to always deal with it ourselves and everyone coming to Haskell will have to learn it.
* YAML (http://yaml.org/spec/1.2/spec.html) is standard and popular. A significant chunk of developer community is already familiar with it. It is being used by stack and by hpack as an alternative to cabal format. The complaint against it is that the specification/implementation is overly complex.
* TOML (https://github.com/toml-lang/toml) is promising, simpler than YAML and is being used by a few important projects but is still evolving and is not completely stable. On a first glance it looks pretty simple and a lot of other tools use a similar config format. It is aiming to become a standard and aiming for a wider adoption.
As a next step we can perhaps do an hpack like experiment using the TOML format. That way we will have some experience with that as well and get to know if there are any potential problems expressing the existing cabal files.
More thoughts, opinions on the topic will help create a better understanding about it.
-harendra

Another factor in favor of YAML is that it is a superset of JSON, which
eases the learning curve even more (with JSON being a de facto lingua
franca for cross-platform untyped data structures), and offers some extra
possibilities, although I admit that I can't think of any practical uses.
The fact that both YAML and JSON can be represented as Aeson Values would
also make things (arguably) easier for tool writers.
_______________________________________________ Haskell-Cafe mailing list To (un)subscribe, modify options or view archives go to: http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe Only members subscribed via the mailman list are allowed to post.

Why not adopt (a subset of) .hs AST file format to structure both project and package files? This would simplify parsing config files as well as syncing code and config files in IDEs. To draw an analogy, JSON derives from JavaScript. Isn't this a precedent?

On 16 September 2016 at 12:35, Imants Cekusins wrote:
> Why not adopt (a subset of) .hs AST file format to structure both project and package files?
Aha, that's my preferred choice. If there is a way to restrict features and allow just a subset, we can have a nice configuration language which is a real language. In fact, I have been toying around with this. If we have to express not just a package specification but a sophisticated build configuration, we need a real language. Expressing conditionals, reuse etc. becomes a compromise in a purely declarative language.
For example, make has so many built-in functions that it has become a full-fledged language by itself. Google's Bazel build tool uses a Python-like language for build configuration. Haskell would make a much better choice for such use cases. Pure declarative is a pain for such use cases.
-harendra

The more power you put into the package file description, the harder it is
for the surrounding ecosystem to reason about it.
So if you can execute arbitrary code in a new-gen cabal file, apart from
the security aspects, it becomes difficult to be sure what is actually
being specified, if you do not reproduce the original environment when
evaluating the file.
Alan

Sbt seems to be doing rather well, using full Scala in configurations. I think package descriptions should be limited, but not syntactically. Using some specific monad might work OK.

> So if you can execute arbitrary code in a new-gen cabal file, apart from the security aspects, ...
Well, config files could use a different (not .hs) extension. They could use their own Prelude and not allow importing other modules.
The main benefit is to reuse existing parsers and simplify code-config sync.
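One way to picture a config language with "its own Prelude and no imports" is a closed vocabulary of constructors that is interpreted rather than executed. The following is only a sketch under that assumption: the names Config, Condition, Env, resolve and allDeps are invented for illustration and are not taken from Cabal or any other tool.

```haskell
module Main where

-- The only vocabulary a config file would be allowed to use (all hypothetical).
data Condition = OsIs String | FlagSet String | Not Condition
  deriving (Show, Eq)

data Config
  = Depends String
  | When Condition [Config]
  deriving (Show, Eq)

-- Facts about the environment the config is evaluated against.
data Env = Env { envOs :: String, envFlags :: [String] }

holds :: Env -> Condition -> Bool
holds env (OsIs os)   = envOs env == os
holds env (FlagSet f) = f `elem` envFlags env
holds env (Not c)     = not (holds env c)

-- Resolve the dependencies for one environment; a pure function, no IO.
resolve :: Env -> [Config] -> [String]
resolve env = concatMap step
  where
    step (Depends d) = [d]
    step (When c cs)
      | holds env c = resolve env cs
      | otherwise   = []

-- Tools can also list every possible dependency without choosing an environment.
allDeps :: [Config] -> [String]
allDeps = concatMap step
  where
    step (Depends d) = [d]
    step (When _ cs) = allDeps cs

config :: [Config]
config =
  [ Depends "base"
  , When (OsIs "windows") [Depends "Win32"]
  , When (Not (OsIs "windows")) [Depends "unix"]
  ]

main :: IO ()
main = do
  print (resolve (Env "linux" []) config)
  print (allDeps config)
```

Because the config is plain data, conditionals stay available while a tool can still inspect the whole description without running arbitrary code, which speaks to the concern about reasoning over the file.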

On 2016-09-16 09:30, MigMit wrote:
> Sbt seems to be doing rather well, using full Scala in configurations.
Sbt is a *build* description, *NOT* a package description format. Sbt uses ivy.xml files for the latter. (With interop for consuming Maven pom.xml files such that it can leverage the already-huge Maven repositories.)

On 16.09.2016 at 09:22, Alan & Kim Zimmerman wrote:
> The more power you put into the package file description, the harder it is for the surrounding ecosystem to reason about it.
> So if you can execute arbitrary code in a new-gen cabal file, apart from the security aspects, it becomes difficult to be sure what is actually being specified, if you do not reproduce the original environment when evaluating the file.
A little-hyped aspect of Gradle is that it has two strictly divided phases: phase 1 builds the dependency model, phase 2 executes it. Once phase 1 finishes, the dependency model becomes read-only; phase 2 is not allowed to modify it.
On the plus side, this makes it easy for tools to reason about the model: it's static and easy to reproduce (just run phase 1 on the config file, or even better, ask the Gradle daemon that's caching the model).
On the minus side, it's hard to make out which code in the config is phase-1 and which is phase-2: same syntax, no static types to guide the intuition; essentially, you have to know which parameters of what phase-1 library functions are closures to be executed in phase 2.
Haskell might be able to do better in this area, though I'm in no position to make any proposals for that.
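As a rough illustration of how types could make the two phases visible, here is a minimal sketch. It is not a proposal and not how Gradle or Cabal work; DependencyModel, buildModel and execute are hypothetical names. The idea is that phase 1 is a pure function and phase 2 is the only part with IO in its type.

```haskell
module Main where

-- Phase 1 result: a plain, immutable description of what depends on what.
data DependencyModel = DependencyModel
  { targets :: [(String, [String])]  -- target name and its direct dependencies
  } deriving (Show, Eq)

-- Phase 1 is a pure function: no IO, so a tool can evaluate it safely
-- and gets the same model for the same input every time.
buildModel :: [String] -> DependencyModel
buildModel names = DependencyModel [ (n, deps n) | n <- names ]
  where
    -- Hard-coded dependency table, standing in for a parsed config file.
    deps "app" = ["lib"]
    deps _     = []

-- Phase 2 only reads the model; the IO in the type marks it as execution.
execute :: DependencyModel -> IO ()
execute model = mapM_ build (targets model)
  where
    build (n, ds) = putStrLn ("building " ++ n ++ " after " ++ show ds)

main :: IO ()
main = execute (buildModel ["lib", "app"])
```

Since values are immutable, execute cannot alter the model it is given, and the type signatures alone tell a reader which code belongs to which phase.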

On 16/09/16 6:37 PM, Tobias Dammers wrote:
> Another factor in favor of YAML is that it is a superset of JSON,
Here is a simple string in JSON:
    "Where's the Golden Fleece?"
Here is the same string in YAML:
    --- Where's the Golden Fleece?
    ...
Superset?
I understand "language X is a superset of language Y" to mean that if I have a document in language Y it can be correctly processed by a language X processor.
If you mean that any data value that can be represented in JSON can be represented (differently!) in YAML, fine, but that's not the same thing.
There are many textual formats that generalise JSON. Heck, even GNUSTEP Property List format does *that*. (And no, I do not recommend adopting that for anything.)
For that matter, any JSON document can be transcoded with no loss of structural information into XML and vice versa. That doesn't mean that JSON is a superset of XML!
Familiarity with JSON semantics and syntax did not help me AT ALL when faced with YAML.
Here's another meta-format worthy of consideration. A *package* is a collection of resources with relationships between them and relationships linking them to other things like authors (think Dublin Core). Is there a standard (genuinely standard) notation specifically for describing resources and their relationships, with quite a few tools for not just reading it and writing it but actually reasoning with it? Why yes. It's called RDF.
http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/
The design of RDF is intended to meet the following goals:
* having a simple data model
* having formal semantics and provable inference
* using an extensible URI-based vocabulary
* using an XML-based syntax
* supporting use of XML schema datatypes
* allowing anyone to make statements about any resource
There is a human-friendly syntax interconvertible with the XML one, Turtle.
http://www.w3.org/TR/turtle/
Now RDF (whether XML or Turtle) is *not* designed for presenting single data values. But that's not really what a package format wants to do anyway.
Am I seriously recommending RDF (or possibly OWL-DL) as a good way to describe packages? I am certainly serious that it should be CONSIDERED. And I'm particularly serious about that for two reasons.
(1) JSON, XML, TOML, and YAML are all about serialising *data values*. That's all they do. Anything beyond that is up to you. RDF and OWL are all about describing *relationships* between *resources*. It's worth considering carefully what you want to say in a package file format. If you want to describe *relationships*, then something that deals with data values may not be the right *kind* of "language". Simply jarring people loose from the idea that a "single possibly structured data value" language is the ONLY kind of language is of value in itself.
(2) JSON, XML, TOML, and YAML are all about serialising *data values*. *Single* possibly structured data values. That's all they do. There is no sense in which there is any standard way to *combine* data in these forms. In contrast, RDF was *invented* to have a way of patching together multiple sets of facts from multiple sources. Given a collection of package descriptions in YAML, all you have is a bunch of text files; what you do with them is *entirely* up to you. Given a bunch of RDF/XML or RDF/Turtle files, there is a *standard* way to write a query (SPARQL) which integrates them. It becomes possible to write consistency-checking queries that can be processed by multiple tools. It becomes possible to ask "if I need these, what else do I need?" in a standard way.
Again, the idea here is to get people thinking that having a documented semantics that can be processed by existing description logic tools has value, so that something at a higher semantic level than YAML or XML might be worth thinking about.
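To make the "relationships, not data values" point concrete, here is a small sketch that models facts as subject/predicate/object triples in plain Haskell. It is only a stand-in for the idea: it uses no RDF or SPARQL tooling, and the package names and the dependsOn and author predicates are made up for the example.

```haskell
module Main where

import Data.List (union)

-- A fact is a (subject, predicate, object) triple, as in RDF.
type Triple = (String, String, String)

-- Two independently written sets of facts (all names are made up).
sourceA, sourceB :: [Triple]
sourceA = [ ("app", "dependsOn", "http-lib"), ("app", "author", "alice") ]
sourceB = [ ("http-lib", "dependsOn", "text-lib"), ("text-lib", "dependsOn", "base-lib") ]

-- Combining sources is just the union of their triples.
merged :: [Triple]
merged = sourceA `union` sourceB

-- "If I need these, what else do I need?": follow dependsOn transitively.
needs :: [Triple] -> [String] -> [String]
needs facts = go []
  where
    go seen [] = reverse seen
    go seen (p : ps)
      | p `elem` seen = go seen ps  -- already visited, so cycles terminate
      | otherwise     = go (p : seen) (ps ++ [ o | (s, "dependsOn", o) <- facts, s == p ])

main :: IO ()
main = print (needs merged ["app"])
```

The merge step needs no knowledge of what either source describes, which is the property being argued for; with real RDF the same question would be written once as a query and run by any conforming tool.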

While y'all are going 'round about this, an argument parser in Rust has its own blog, api docs, twitter account, github, and tutorial videos. https://clap.rs/ And it supports YAML in addition to plain old Rust code.

> While y'all are going 'round about this, an argument parser in Rust has its own blog, api docs, twitter account, github, and tutorial videos.
An *argument parser*? Visits web page incredulously. Great balls of fire, it's true. I really don't want to use an argument parser that requires that much documentation. Life is too short.

On 19.09.2016 at 02:12, Richard A. O'Keefe wrote:
> On 16/09/16 6:37 PM, Tobias Dammers wrote:
>> Another factor in favor of YAML is that it is a superset of JSON,
> Here is a simple string in JSON:
> "Where's the Golden Fleece?"
> Here is the same string in YAML:
> --- Where's the Golden Fleece?
> ...
> Superset?
Yes. The original string is also valid in YAML if used in the position where JSON allows a string.
> If you mean that any data value that can be represented in JSON can be represented (differently!) in YAML, fine, but that's not the same thing.
Sure, but any valid JSON is also valid YAML, modulo some exotic exceptions for valid-but-useless and valid-but-probably-not-what-the-sender-intended JSON.
> Familiarity with JSON semantics and syntax did not help me AT ALL when faced with YAML.
Sure, YAML is a massive superset. The advantage is more in interoperability: you can hook a YAML parser to JSON-outputting processes and expect that it will "just work", so you don't have to worry about syntax and you don't need separate frontends for YAML and JSON for your webservice.
> Am I seriously recommending RDF (or possibly OWL-DL) as a good way to describe packages? I am certainly serious that it should be CONSIDERED.
+1
> (1) JSON, XML, TOML, and YAML are all about serialising *data values*. That's all they do. Anything beyond that is up to you. RDF and OWL are all about describing *relationships* between *resources*. It's worth considering carefully what you want to say in a package file format. If you want to describe *relationships*, then something that deals with data values may not be the right *kind* of "language".
> Simply jarring people loose from the idea that a "single possibly structured data value" language is the ONLY kind of language is of value in itself.
It does have its advantages. That's why everybody is using XML these days, after all. Even though XML does have some pretty horrible properties (too much noise being the most prominent).
> (2) JSON, XML, TOML, and YAML are all about serialising *data values*. *Single* possibly structured data values. That's all they do. There is no sense in which there is any standard way to *combine* data in these forms.
Yes, that's supposed to live at the semantic level, i.e. in the types. For JSON and TOML that's a serious restriction. In XML and YAML, you can keep type information (better standardization for that in YAML than in XML), so you can stick user-defined semantics into the serialization format if you want to. I.e. you can achieve RDF in XML or YAML by writing types that handle combinability or anything else that you want; these things aren't tied into the language.
It is still possible that RDF is more convenient :-)

On 16.09.2016 at 08:20, Harendra Kumar wrote:
> * TOML (https://github.com/toml-lang/toml) is promising, simpler than YAML and is being used by a few important projects but is still evolving and is not completely stable. On a first glance it looks pretty simple and a lot of other tools use a similar config format. It is aiming to become a standard and aiming for a wider adoption.
TOML is limited in its data types: numbers, booleans, dates and strings for primitives, plus arrays and string-to-object maps. I'd consider that too limited to ever become a universal configuration format.

I guess the overriding question I have here is: what is the PROBLEM being
solved? I know of basically no beginners who were confused or intimidated
by the syntax of Cabal's file format. It's fairly commonplace for
beginners to be confused by the *semantics*: which fields are needed and
what they mean, how package version bounds work, what flags are and how
they interact with dependencies, the relationship between libraries and
executables defined in the same file, etc. But the syntax? It's just not
an issue. I'm not sure what it means to say that people have to "learn"
it, because in introducing dozens of people to building things in Haskell,
I've never seen that learning process even be noticeable, much less an
impediment.
With this in mind, a lot of the statements about these various languages
are not entirely convincing. That it's a superset of JSON? It's not clear
why this matters. A psychological impression of complexity? Just not
anything I've seen evidence of. Indeed, aside from the rather painful
many-years-long migration, the *cost* (though certainly not a prohibitive
one) of moving to something like YAML or TOML is that they have a bit
louder syntax, that demands more attention and feels more complex.
There is one substantial disadvantage I'd point out to the Cabal file
format as it stands, and that's that it's pretty non-obvious how to parse
it, so we will always struggle to interact with it from automated tools,
unless those tools are also written in Haskell and can use the Cabal
library. That's a real concern; pragmatic large-scale build environments
are not tied to specific languages, and include a variety of ad-hoc
third-party tooling that needs to be integrated, and Cabal remains opaque
to them. But that doesn't seem to be what's motivating this conversation.

> what is the PROBLEM being solved?
By making config files follow .hs syntax, the cabal file structure could be defined as a data record. This would make it clear which fields are compulsory and which are optional. Enums could be used for fields with a fixed set of values.
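A minimal sketch of what such a record might look like follows. The type and field names are hypothetical and do not mirror the Cabal library's actual PackageDescription type. Compulsory fields are strict, optional fields are wrapped in Maybe, and fixed choices are enumerations.

```haskell
module Main where

-- Hypothetical sketch: none of these names come from the Cabal library.

-- A closed set of choices becomes an enumeration.
data License = BSD3 | MIT | GPL3 | OtherLicense String
  deriving (Show, Eq)

data BuildType = Simple | Custom
  deriving (Show, Eq, Enum, Bounded)

-- Compulsory fields are strict (!); optional ones are wrapped in Maybe.
data PackageDescription = PackageDescription
  { name         :: !String
  , version      :: ![Int]
  , license      :: !License
  , buildType    :: !BuildType
  , synopsis     :: Maybe String
  , homepage     :: Maybe String
  , dependencies :: [String]
  } deriving (Show, Eq)

example :: PackageDescription
example = PackageDescription
  { name         = "my-package"
  , version      = [0, 1, 0]
  , license      = BSD3
  , buildType    = Simple
  , synopsis     = Just "An illustrative package"
  , homepage     = Nothing
  , dependencies = ["base", "text"]
  }

main :: IO ()
main = print example
```

With the strictness annotations, a record construction that leaves out a compulsory field is rejected at compile time, whereas for a lazy field GHC only emits a warning. A misspelled enumeration value is likewise a compile error instead of a silently accepted string.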

The discussion originated in an earlier thread from a question about the
possibility of using the same format across different tools, namely cabal
and stack, which currently use different file formats. If they are to use
the same format, the question is what that format should be.
_______________________________________________ Haskell-community mailing list Haskell-community@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-community

On 16.09.2016 at 10:24, Chris Smith wrote:
> With this in mind, a lot of the statements about these various languages are not entirely convincing. That it's a superset of JSON? It's not clear why this matters.
It does matter for people who already know JSON: they can skip over the config file syntax and dive right into the semantics. Given that a substantial fraction of programmers know JSON, using that syntax would create a lower entry barrier. The same argument can be made for YAML. This argument cannot be made for TOML at this time, maybe never if TOML's limitations prevent widespread adoption.
> A psychological impression of complexity? Just not anything I've seen evidence of. Indeed, aside from the rather painful many-years-long migration, the *cost* (though certainly not a prohibitive one) of moving to something like YAML or TOML is that they have a bit louder syntax, that demands more attention and feels more complex.
YAML's complexity is partly because it tries to cover everything, partly because it is pushing hard to be both human-readable and machine-readable. It's pretty good at this actually, though I guess 20/20 hindsight could lead to improvements, but not enough to make a new YAML version worth the effort.
> There is one substantial disadvantage I'd point out to the Cabal file format as it stands, and that's that it's pretty non-obvious how to parse it, so we will always struggle to interact with it from automated tools, unless those tools are also written in Haskell and can use the Cabal library. That's a real concern; pragmatic large-scale build environments are not tied to specific languages, and include a variety of ad-hoc third-party tooling that needs to be integrated, and Cabal remains opaque to them. But that doesn't seem to be what's motivating this conversation.
That's implicit in the "it would be nice to have a standard format" argument, even if it hasn't been explicitly voiced yet.

> I guess the overriding question I have here is: what is the PROBLEM being solved?
Let me share my experience with Clojure and lein. They use a Clojure hash-map for their configuration. So yes, arbitrary code can be executed, and I believe this is a _very good thing_.
Why? Because it makes it very easy to add sub-configuration that can be used by third-party plugins. For example:
- a plugin that helps with the use of environment variables (lein-environ), which is really helpful for application development (not so much for library development)
- a plugin that uses S3 for our private dependencies (not supported by default by lein)
For deployment: we were able to add a request to our API server that provides not only the written version but also the git commit hash, so we could be certain of the version of the server. Too many times there were sysadmin deployment errors. And that could only be achieved because we were able to run arbitrary commands in the project description file.
I am certainly forgetting many other advantages of having a package description format which is simply a data structure in the host language. But this is by far my preference.
- cabal is OK, but very imperfect: I generally need a lot of copy/paste, and I need to change it very often while writing an application with many dependencies
- JSON/YAML/TOML are simply not powerful enough to match all the semantics we might need to configure a project. For example, we might want to have a Set instead of a List for some properties. Or, I don't know, maybe ternary tree structures.
The point is: we pay a price by adding a step between the semantics and the syntax, whereas if our configuration format were Haskell we could express the semantics more directly.

.. for interop with other packagers / builders, .hs compatible config content could be transformed / exported to other formats. .hs -> YAML, JSON, ... is likely to be possible and easier than the other way around.

While I would personally love having a package description in haskell, I
don't think it is a good idea.
If you can't start or modify a package without already knowing haskell, it
is a huge barrier to entry. I remember trying to get started in scala and
having a lot of trouble with sbt because I didn't know their operators for
lists and arrays or hash tables or whatever it is that they use in their
files.
On Fri, Sep 16, 2016 at 4:57 AM, yogsototh wrote: [...]

David McBride writes:
While I would personally love having a package description in haskell, I don't think it is a good idea.
I think we all can agree that using the fully-fledged language for configuration is an extremely bad idea from many perspectives. The worst of all, IMO, is that it makes reasoning about the configuration equivalent to the halting problem. And god, does it hurt in practice! -- speaking as someone who had spent a non-trivial amount of time on doing exactly this stuff in another age and for another language.
However. This does not mean that we cannot find a subset of the language that would be a point of balance between the needs of expressivity, learnability and decidability. After all JSON was born in roughly this spirit, wasn't it?
The wins are obvious to me:
- the syntax is immediately obvious to the target audience
- minimum effort to get existing Haskell tools to work with the "new" format at the source level -- syntax highlighting, checking, etc. The only required additions would be restriction enforcement
- no third-party libraries need to be used as dependencies for our core tooling
If you can't start or modify a package without already knowing haskell, it is a huge barrier to entry.
I'm unconvinced that this problem cannot be resolved within the subsetting approach.
I remember trying to get started in scala and having a lot of trouble with sbt because I didn't know their operators for lists and arrays or hash tables or whatever it is that they use in their files.
That is because they committed to the sin of employing the whole of Scala for the thing. Bad for them. But also.. let's not commit the mistake of conflating the surface syntax and the semantics. The semantics are dictated by need -- whose sharpening effect on the learning curve is unavoidable. I'm willing to argue that a large part of your confusion came from the /semantics/ of sbt, not the syntax. The syntax differences, OTOH, can and ought to be trivialized. -- с уважениeм / respectfully, Косырев Сергей

Am 16.09.2016 um 15:37 schrieb Kosyrev Serge:
The worst of all, IMO, is that it makes reasoning about the configuration equivalent to the halting problem.
That's a solved problem: Generate an execution plan, which would need to be fully evaluated in Haskell; then execute it and don't feed anything back into it. It's easy to reason about the plan in that scenario. This is what Gradle does.
And god, does it hurt in practice! -- speaking as someone who had spent a non-trivial amount of time on doing exactly this stuff in another age and for another language.
Which language?
This does not mean that we cannot find a subset of the language that would be a point of balance between the needs of expressivity, learnability and decidability.
Subsetting makes it hard to know what works and what doesn't. A Haskell subset would have to be strict - which raises the question of what the point is in calling this a subset of Haskell (and even if there is a point, it will draw ridicule along the lines of "Haskell is unsuitable for describing its own configurations").
After all JSON was born in roughly this spirit, wasn't it?
JSON was/is a serialization format, first and foremost.
If you can't start or modify a package without already knowing haskell, it is a huge barrier to entry.
I'm unconvinced that this problem cannot be resolved within the subsetting approach.
Actually subsetting is making this worse: Things freshly learned for Haskell won't work in the config language, restrictions encountered in the config language will be unthinkingly transferred to Haskell. Having two subtly but fundamentally different languages is about the worst thing you can expose a learner to.
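The "generate a plan, then execute it without feeding anything back" idea from this message can be sketched in a few lines of Haskell. This is only an illustration under stated assumptions: the `Package` and `Step` types and the `plan`/`execute` functions are hypothetical names invented here, not part of Cabal or Gradle, and the "execution" merely prints what it would do.

```haskell
import Control.Exception (evaluate)

-- Hypothetical build actions; a real tool would have many more.
data Step
  = Compile FilePath
  | Link [FilePath] FilePath
  deriving (Eq, Show)

-- Hypothetical package description, written as an ordinary Haskell value.
data Package = Package
  { pkgName :: String
  , modules :: [FilePath]
  }

-- Pure planning: arbitrary Haskell may run here, but the result is plain data.
plan :: Package -> [Step]
plan pkg =
  map Compile (modules pkg)
    ++ [Link (map objectFile (modules pkg)) (pkgName pkg)]

-- Drop everything from the first '.' onwards and append ".o" (illustrative only).
objectFile :: FilePath -> FilePath
objectFile path = takeWhile (/= '.') path ++ ".o"

-- Execution only consumes the plan; nothing is fed back into planning.
execute :: Step -> IO ()
execute (Compile src) = putStrLn ("compile " ++ src)
execute (Link objs out) = putStrLn ("link " ++ unwords objs ++ " -> " ++ out)

main :: IO ()
main = do
  let steps = plan (Package "demo" ["Main.hs", "Util.hs"])
  -- Force the whole plan before running anything.
  _ <- evaluate (length (show steps))
  putStrLn ("plan has " ++ show (length steps) ++ " steps")
  mapM_ execute steps
```

Once `steps` has been forced, it is a finite list of plain values that a tool can inspect, print or compare without running any user code. Whether planning itself terminates is of course still up to the code in `plan`.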

On 2016-09-16 09:51 AM, Joachim Durchholz wrote:
This does not mean that we cannot find a subset of the language that would be a point of balance between the needs of expressivity, learnability and decidability.
Subsetting makes it hard to know what works and what doesn't. A Haskell subset would have to be strict - which raises the question of what the point is in calling this a subset of Haskell (and even if there is a point, it will draw ridicule along the lines of "Haskell is unsuitable for describing its own configurations").
Haskell is indeed unsuitable for describing the package configuration, IMO, but not because it's lazy. It's because it lacks any syntax for long and human-readable string literals (package description, anyone?). That also condemns every subset of Haskell.
After all JSON was born in roughly this spirit, wasn't it?
Yes, and JSON (and JavaScript) would suck for the very same reason. This deficiency of JSON was a major incentive for creating YAML.
I'm mildly in favour of supporting another package format in addition to .cabal, as long as compatibility is kept, and as long as the new format is actually superior. I think any subset of Haskell would be a setback from a usability perspective.
One major benefit of YAML that I haven't seen mentioned is that it could be used to replace the README.md file at the same time. Right now a package description consists of both .cabal and (optionally) Markdown. I suspect the latter language is actually harder for complete beginners.

On 2016-09-16 10:29 AM, Imants Cekusins wrote:
it lacks any syntax for long and human-readable string literals (package description, anyone?).
can {- comments -} be used for package description?
I suppose they could, but that would rather defeat the purpose of using a Haskell subset in the first place. Haskell ignores comments, package descriptions should not be ignored.

ok how about a pragma:
7.13.6.3. Annotating modules
You can annotate modules with the ANN pragma by using the module keyword. For example:
{-# ANN module (Just "A `Maybe String' annotation") #-}
if the topic is *Standard package file format*, why not agree on e.g. adopting *GenericPackageDescription* or another similar haskell type (rather than a text-based file) as the standard?
then any format (cabal, yaml, json, ...) may be used as long as a library exists and is maintained for each such format, which parses / produces the format from / to the standard type?
how about this?

On 2016-09-16 10:48 AM, Imants Cekusins wrote:
ok how about a pragma:
7.13.6.3. Annotating modules
You can annotate modules with the |ANN| pragma by using the |module| keyword. For example:
{-# ANN module (Just "A `Maybe String' annotation") #-}
I suppose this could do, but there are some downsides: - somewhat cumbersome syntax, - reliance on a GHC extension, and worst of all, - not a Haskell value. The last point implies that the package.hs with this kind of module annotation could not produce a proper GenericPackageDescription when executed as a Haskell program.
if the topic is _Standard package file format_, why not agree on e.g. adopting *GenericPackageDescription* or another similar haskell type (rather than a text-based file) as the standard?
then any format (cabal, yaml, json, ...) may be used as long as a library exists and is maintained for each such format, which parses / produces the format from / to the standard type?
This makes perfect sense to me. The devil may be in the details. Would cabal-install need to link in all these maintained libraries statically? Or would there be some plug-in mechanism to load them on demand?

Would cabal-install need to link in all these maintained libraries statically? Or would there be some plug-in mechanism to load them on demand?
well, the libraries would need to be official and come with the packager.
the formats would be perfectly interchangeable, i.e. cabal -> standard_type -> yaml -> standard_type -> json -> standard_type -> cabal would produce the same cabal file.
only 1 config file per package, to avoid confusion. however, if the user prefers working with format F, they can always convert the format which came with the package to F.
the file can always be validated by virtue of parsing and reproducing the original file without errors.
it comes at the price of duplicated effort, however it would give every choice one can wish for. If one must use yaml, they use yaml etc.
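The round trip described above (text -> standard type -> another format -> standard type -> text) can be written down as a checkable property. The following is a minimal sketch under loud assumptions: `Pkg` is a hypothetical, drastically simplified stand-in for a type like GenericPackageDescription, the "cabal-like" view handles exactly two fields, and the second view is simply Haskell's derived `Show`/`Read` syntax rather than YAML or JSON.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd, intercalate, stripPrefix)
import Text.Read (readMaybe)

-- Hypothetical model type shared by all views.
data Pkg = Pkg
  { name :: String
  , deps :: [String]
  } deriving (Eq, Show, Read)

-- View 1: a cabal-like "key: value" rendering in a fixed, canonical layout.
toCabalish :: Pkg -> String
toCabalish p = unlines
  [ "name: " ++ name p
  , "build-depends: " ++ intercalate ", " (deps p)
  ]

-- Parser for view 1; accepts only the two-line layout produced above.
fromCabalish :: String -> Maybe Pkg
fromCabalish txt = case lines txt of
  [l1, l2] -> do
    n <- stripPrefix "name: " l1
    d <- stripPrefix "build-depends: " l2
    Just (Pkg n (splitCommas d))
  _ -> Nothing

-- Split on commas and trim surrounding whitespace; empty input gives [].
splitCommas :: String -> [String]
splitCommas s = case break (== ',') s of
  (x, [])       -> [trim x | not (null (trim x))]
  (x, _ : rest) -> trim x : splitCommas rest

trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- The chain from the message: text -> model -> other view -> model -> text.
-- View 2 is the derived Show/Read syntax of the model type.
roundTrip :: String -> Maybe String
roundTrip txt = do
  pkg  <- fromCabalish txt
  pkg' <- readMaybe (show pkg)
  Just (toCabalish pkg')

main :: IO ()
main = do
  let original = toCabalish (Pkg "demo" ["base", "containers"])
  print (roundTrip original == Just original)
```

Note the limitation this sketch makes visible: the property holds for text that is already in the canonical layout. A hand-edited file with comments or different spacing would parse to the same model but would not be reproduced byte for byte, so "would produce the same cabal file" needs either a canonical formatter or a model that also records layout.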

On 2016-09-16 16:10, Mario Blažević wrote:
After all JSON was born in roughly this spirit, wasn't it?
Yes, and JSON (and JavaScript) would suck for the very same reason. This deficiency of JSON was a major incentive for creating YAML.
I'm mildly in favour of supporting another package format in addition to .cabal, as long as compatibility is kept, and as long as the new format is actually superior. I think any subset of Haskell would be a setback from usability perspective.
This may be somewhat heretical, but I don't actually think we need to have a human-editable format. (Of course it should probably be *reasonably* human-readable/editable just for debugging and such.) Just provide simple commands to view/manipulate whatever package settings there are. Helpfully said commands could also sanity check whatever you're trying to do and perhaps provide better error messages than a tool which only has the "final" package description to work with. For beginners a simple GUI could be provided and IDEs could do their own thing. Problem solved.

2016-09-16 19:14 GMT+02:00 Bardur Arantsson
This may be somewhat heretical, but I don't actually think we need to have a human-editable format. [...]
Coming back to the central question (see Chris' mail): What problem do we solve by doing that? Replacing a relatively easy to read format by something unreadable by humans? That's probably the opposite of what we want...
[...] For beginners a simple GUI could be provided and IDEs could do their own thing.
If somebody thinks a GUI is a good idea, we don't need to change something at all: Just write a GUI for reading/editing .cabal files.
Problem solved.
Which problem? :-) Unless we really define what we want to improve and why, the whole discussion is pointless. Is it readability by humans? Being "standard" (whatever that means)? Being easily parsable, probably by a separate library? Being more flexible by what one can express? Having more abstraction facilities in the description? I have the impression that different people in this discussion try to solve different problems. Cheers, S.

[...]
1. Adopt common standard for different package tools.
What are these tools? AFAICT we are talking about cabal and stack only, and from the recent discussion it seems that stack has slightly different goals: One stack.yaml can reference various cabal package descriptions, something I've never used until now, because I wasn't even aware that it is
2016-09-16 22:02 GMT+02:00 Imants Cekusins
1. Give users and packager devs a choice of config file formats / representations.
Why is this even a goal? On the contrary, I see this as an anti-goal,
because it leads to useless creativity and fragmentation.
1. Explore ways to simplify manual package configuration.
This is a worthwhile goal IMHO, but we need to be more concrete, e.g. how
can repetitive stuff like the tons of almost-copy-n-paste in https://github.com/haskell-opengl/GLUT/blob/master/GLUT.cabal be avoided? This has nothing to do with syntax, more with abstraction facilities and semantics: If we just switch to JSON or YAML, GLUT.cabal would be as repetitive as before, only in a different surface syntax.
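To make the "abstraction facilities, not syntax" point concrete, here is a small sketch of what factoring out repetition could look like if stanzas were generated from a shared template. Everything in it is assumed for illustration: the `Exe` type, the executable names, the dependency list and the options are hypothetical and are not taken from the actual GLUT.cabal.

```haskell
import Data.List (intercalate)

-- Hypothetical description of one example executable.
data Exe = Exe
  { exeName :: String
  , mainIs  :: FilePath
  }

-- Settings shared by every stanza, written once.
commonDeps :: [String]
commonDeps = ["base", "GLUT"]

commonOptions :: String
commonOptions = "-Wall"

-- Render one cabal-style executable stanza from the shared settings.
stanza :: Exe -> String
stanza exe = unlines
  [ "executable " ++ exeName exe
  , "  main-is: " ++ mainIs exe
  , "  build-depends: " ++ intercalate ", " commonDeps
  , "  ghc-options: " ++ commonOptions
  ]

main :: IO ()
main = putStr (intercalate "\n" (map stanza exes))
  where
    exes = [Exe "Example1" "Example1.hs", Exe "Example2" "Example2.hs"]
```

The saving comes from `commonDeps` and `commonOptions` being stated once. The same two-level structure could be expressed in any surface syntax that has some notion of a shared definition, which is why switching to JSON or YAML alone would not help.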

On 16 Sep 2016, at 09:24, Chris Smith
wrote: I guess the overriding question I have here is: what is the PROBLEM being solved? I know of basically no beginners who were confused or intimidated by the syntax of Cabal's file format.
As a "beginner"(*), I fully agree. However having more than one language in the mix can be confusing and complicating...
It's fairly commonplace for beginners to be confused by the *semantics*: which fields are needed and what they mean, how package version bounds work, what flags are and how they interact with dependencies, the relationship between libraries and executables defined in the same file, etc.
It's all about the semantics - it should preferably be formalised, and ideally the relevant library/package system should be able to check/enforce rules.
But the syntax? It's just not an issue. I'm not sure what it means to say that people have to "learn" it, because in introducing dozens of people to building things in Haskell, I've never seen that learning process even be noticeable, much less an impediment.
I quite agree
Andrew Butterfield School of Computer Science & Statistics Trinity College Dublin 2, Ireland (*) I've only started to use cabal recently, because a TA of mine built a cabal-based coursework grading system for me - I generally do application devpt in Haskell and the only build command I need is ghc --make.... Currently moving quickly onto stack this year....

On 2016-09-16 at 08:20:15 +0200, Harendra Kumar wrote: [...]
* YAML (http://yaml.org/spec/1.2/spec.html) is standard and popular. A significant chunk of developer community is already familiar with it. It is being used by stack and by hpack as an alternative to cabal format. The complaint against it is that the specification/implementation is overly complex.
I'm not sure if this has been pointed out already, but beyond turning a proper grammar into a stringly-typed one, shoehorning some features of .cabal files into YAML syntax really appear like a case of the "Genius Tailor"[1], e.g. consider the `hpack` example
    when:
      - condition: flag(fast)
        then:
          ghc-options: -O2
        else:
          ghc-options: -O0
besides looking quite awkward IMHO (just as an exercise, try inserting a nested if/then/else in that example above), the prospect that a standard format like YAML would allow to reuse standard tooling/libraries for YAML seems quite weak to me; if, for instance, you run the above through a YAML pretty-printer, you easily end up with something like
    when:
      - else:
          ghc-options: -O0
        then:
          ghc-options: -O2
        condition: flag(fast)
or any other ordering depending on how the keys are sorted/hashed.
Besides, many YAML (& JSON) parsers silently drop duplicate keys, so if by accident you place a 2nd `else:` branch somewhere, you end up with an ambiguous .yaml file which may either result in an error, in the first key getting dropped (most likely variant), or in the 2nd key getting dropped. Which one you get depends on the YAML parser implementation.
I really don't understand the appeal of applying the golden hammer of YAML, if `.cabal`'s grammar is already self-evident and concise with its syntax:
    if flag(fast)
      ghc-options: -O2
    else
      ghc-options: -O0
where this if/then/else construct is encoded in the grammar proper rather than being merely a semantic interpretation after decoding a general grammar designed for simpler typed data-representations which isn't even accurate enough (since it has additional symmetries/freedoms) to capture the desired grammar faithfully, which make YAML quite error-prone for this specific application.
[1]: The "Genius Tailor" was mentioned recently in a related discussion here: https://mail.haskell.org/pipermail/haskell-cafe/2016-September/124868.html
-- hvr
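The two failure modes described in this message, lost key order and silently dropped duplicate keys, can be modelled without any YAML library. The sketch below is an assumption-laden illustration: it uses `Data.Map` from the containers package as a stand-in for the in-memory mapping a generic parser might decode into, and the `written`, `withDuplicate` and `duplicates` names are invented here. Real YAML and JSON libraries differ in which of these behaviours they exhibit.

```haskell
import qualified Data.Map as Map

-- Key/value pairs in the order they appear in the hpack example above.
written :: [(String, String)]
written =
  [ ("condition", "flag(fast)")
  , ("then", "ghc-options: -O2")
  , ("else", "ghc-options: -O0")
  ]

-- The same mapping with an accidental second "else" branch.
withDuplicate :: [(String, String)]
withDuplicate = written ++ [("else", "ghc-options: -O1")]

-- Keys that occur more than once in the raw list of pairs.
duplicates :: [(String, a)] -> [String]
duplicates kvs = Map.keys (Map.filter (> 1) counts)
  where
    counts :: Map.Map String Int
    counts = Map.fromListWith (+) [(k, 1) | (k, _) <- kvs]

main :: IO ()
main = do
  -- A map forgets the written order; keys come back sorted.
  print (Map.keys (Map.fromList written))
  -- fromList keeps the last value for a repeated key, without any warning.
  print (Map.lookup "else" (Map.fromList withDuplicate))
  -- Detecting the duplicate requires looking at the pairs before they
  -- are collapsed into a map.
  print (duplicates withDuplicate)
```

The design point is the last step: a duplicate can only be reported by a parser that still has the raw sequence of pairs in hand. Once the data has been collapsed into a generic mapping, both the order of `then`/`else` and the existence of the second `else` are gone.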

On 2016-09-16 23:57, Herbert Valerio Riedel wrote:
Besides, many YAML (& JSON) parsers silently drop duplicate keys, so if by accident you place a 2nd `else:` branch somewhere, you end up with an ambiguous .yaml file which may either result in an error, in the first key getting dropped (most likely variant), or in the 2nd key getting dropped. Which one you get depends on the YAML parser implementation.
I was actually curious about this, and it's interesting to note that even JSON which was supposed to have *ONE STANDARD* now apparently has two, an ECMA one and an IETF RFC (seems to be more recent). So I'd say JSON technically _allows_ duplicate keys, but that you cannot reasonably expect any type of sane behavior in practice if you do that. Source: http://stackoverflow.com/a/23195243 (Didn't check up on what the situation is in YAML. YAML is too awful to contemplate regardless.) Regards,

On 2016-09-17 at 06:47:52 +0200, Bardur Arantsson wrote: [...]
I was actually curious about this, and it's interesting to note that even JSON which was supposed to have *ONE STANDARD* now apparently has two, an ECMA one and an IETF RFC (seems to be more recent).
Btw, that's partly because ECMA and IETF weren't able to agree who "owns" JSON, for more details see https://www.tbray.org/ongoing/When/201x/2014/03/05/RFC7159-JSON -- hvr

On 17/09/16 4:47 PM, Bardur Arantsson wrote:
I was actually curious about this, and it's interesting to note that even JSON which was supposed to have *ONE STANDARD* now apparently has two, an ECMA one and an IETF RFC (seems to be more recent).
It's a long sad story. The ECMA standard exists for largely political reasons. The RFC is the "active" one. JSON is a textbook example of "syntax-first".


On 17 September 2016 at 03:43, Herbert Valerio Riedel
I'm not sure if this has been pointed out already, but beyond turning a proper grammar into a stringly-typed one, shoehorning some features of .cabal files into YAML syntax really appear like a case of the "Genius Tailor"[1], e.g. consider the `hpack` example
    when:
      - condition: flag(fast)
        then:
          ghc-options: -O2
        else:
          ghc-options: -O0
I agree. Supporting conditionals with YAML looks hacky! -harendra

Am 17.09.2016 um 01:53 schrieb Harendra Kumar:
I agree. Supporting conditionals with YAML looks hacky!
All I have seen was a direct translation and the conclusion that it doesn't work. I haven't seen any attempts at making it look good. Also, while aesthetics isn't irrelevant, it's a pretty weak argument.

On Sat, Sep 17, 2016 at 7:27 AM, Joachim Durchholz
Am 17.09.2016 um 01:53 schrieb Harendra Kumar:
I agree. Supporting conditionals with YAML looks hacky!
All I have seen was a direct translation and the conclusion that it doesn't work. I haven't seen any attempts at making it look good.
Also, while aesthetics isn't irrelevant, it's a pretty weak argument.
Read the next paragraph in hvr's email: he was very much not talking about aesthetics.

Am 17.09.2016 um 00:13 schrieb Herbert Valerio Riedel:
the prospect that a standard format like YAML would allow to reuse standard tooling/libraries for YAML seems quite weak to me;
It's not about standard tooling, it's about tools written by third parties. Tools that you didn't have the time or interest to write yourself, but which still help make your ecosystem more useful to others.
if, for instance, you run the above through a YAML pretty-printer, you easily end up with something like
    when:
      - else:
          ghc-options: -O0
        then:
          ghc-options: -O2
        condition: flag(fast)
or any other ordering depending on how the keys are sorted/hashed.
Only if you use a bad pretty-printer that parses the YAML, then writes it in prettified form. Such a pretty-printer would also lose comments. In other words: I'd be surprised to find a pretty-printer in actual use that works that way.
Besides, many YAML (& JSON) parsers silently drop duplicate keys,
That's indeed a common bug/misfeature due to historical accidents. It's easy to fix though, and libraries have started to acquire options to get that reported as an error.
I really don't understand the appeal of applying the golden hammer of YAML, if `.cabal`'s grammar is already self-evident and concise with its syntax:
    if flag(fast)
      ghc-options: -O2
    else
      ghc-options: -O0
where this if/then/else construct is encoded in the grammar proper rather than being merely a semantic interpretation after decoding a general grammar designed for simpler typed data-representations which isn't even accurate enough (since it has additional symmetries/freedoms) to capture the desired grammar faithfully, which make YAML quite error-prone for this specific application.
Yeah it isn't nice. Changing the grammar always produces that kind of awkwardness. However, for a fair comparison, you need to actively look for things that work better with the alternate grammar before you conclude it's worse.

Give users and packager devs a choice of config file formats / representations.
Why is this even a goal? On the contrary, I see this as an anti-goal, because it leads to useless creativity and fragmentation.
such creativity and fragmentation may actually give benefits.
can MVC [1] be relevant here? currently both config content (let's call it a *model*) and representation (*view*: specific config file type) are bundled. if a common *model* is agreed on, package tool and IDE devs could pick any *view* (format) that best suits their / users' needs. such fragmentation would not break the workflow. If someone thinks of a convenient format and believes it worth their time to write a *controller* for it, why not?
[1] mvc https://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller

On Sat, Sep 17, 2016 at 2:54 AM, Imants Cekusins
currently both config content (let's call it a *model*) and representation (*view*: specific config file type) are bundled.
if a common *model* is agreed on, package tool and IDE devs could pick any *view* (format) that best suits their / users needs.
such fragmentation would not break the workflow. If someone thinks of a convenient format and believe it worth their time to write a *controller* for it, why not?
Do I have to obtain whatever whizzy new controller you've come up with in order to work with your packages? Do I have to do this when everyone has come up with their own whizzy new controller and I need to fit their packages into whatever I am trying to write? -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

Do I have to obtain whatever whizzy new controller you've come up with in order to work with your packages? Do I have to do this when everyone has come up with their own whizzy new controller and I need to fit their packages into whatever I am trying to write?
that's the whole point. If we could agree on a standard serializable model, each controller would ensure the link between the *view* and the *model*.
user could open a package in any IDE / environment. The environment's controller would display the model in its own / user preferred view.

.. the model would be shipped with packages. pretty printing the config model to formatted yet non-editable config view (like the docs) may be made part of build process.

On Sat, Sep 17, 2016 at 3:06 AM, Imants Cekusins
that's the whole point. If we could agree on a standard serializable model,
That seems like a big "if". Especially since many dev tools exist to extend the model, and quite aside from "so where's the 'standard' now", conflicts you can currently control (mostly) suddenly become problematic. (I'm tempted to point to how gtk2hs's configuration phase works. pTk may be an even more severe example, although non-Haskell.) -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

On Fri, Sep 16, 2016 at 4:35 PM, Imants Cekusins
Would cabal-install need to link in all these maintained libraries statically? Or would there be some plug-in mechanism to load them on demand?
well the libraries would need to be official and come with the packager.
the formats would be perfectly interchangeable i.e. cabal -> standard_type -> yaml -> standard_type -> json -> standard_type -> cabal would produce the same cabal file
only 1 config file per package to avoid confusion
however if the user prefers working with format F, they can always convert the format which came with the package, to F
Even just looking at the set of features which is 1:1 betw. YAML and JSON,
we're essentially just talking about key-value pairs with a couple of
common types for the values. This isn't all .cabal files contain (e.g. see
hvr's points about conditionals), but if it were true, is it really worth
changing how Cabal works for a different color bikeshed?
On Sat, Sep 17, 2016 at 8:06 AM, Imants Cekusins
Do I have to obtain whatever whizzy new controller you've come up with in order to work with your packages? Do I have to do this when everyone has come up with their own whizzy new controller and I need to fit their packages into whatever I am trying to write?
that's the whole point. If we could agree on a standard serializable model, each controller would ensure the link between the *view* and the *model*.
user could open a package in any IDE / environment. The environment's controller would display the model in its own / user preferred view.
Why not have .cabal files be the standard model, and anyone can write tools on top to translate to/from .cabal if users really want to use something else? In general, though, I don't think the fragmentation is worth it. Tom

here are some charts https://github.com/ciez/tmp to highlight differences between - currently used text based config and - suggested model based config

Why not have .cabal files be the standard model, and anyone can write tools on top to translate to/from .cabal if users really want to use something else?
.cabal file is representation rather than a model. It is parsed to model. Being a distinct file type with its own AST, it needs quite a bit of attention. It needs to be parsed, updated, validated, formatted.
Another config format emerged. More problems (distinct file type etc). More formats may follow. Is there hope to agree on common format (as per thread title), if common content can not be agreed on? Isn't config first of all about content? Is the common format going to contain incompatible / conflicting data items? With common content, display format will not matter at all, neither will package tool nor IDE used to work on a project. Config being a Haskell type, it would be well formed. The options would be well known. Users and IDE devs will not need to worry about indenting, commas, line breaks and other *goodies*.

2016-09-18 17:40 GMT+02:00 Imants Cekusins
.cabal file is representation rather than a model. It is parsed to model.
Well, that's the case for basically everything you give to a program, so I don't see the point here. A .hs file is e.g. just a textual representation of the more abstract notion of a Haskell program/module, too. A .cabal file is just a textual representation of the abstract notion of a Haskell package description.
Being a distinct file type with its own AST,
Distinct from what?
it needs quite a bit of attention.
From whom?
It needs to be parsed, updated, validated, formatted.
This will be the case for whatever is being used, so again: What's the point? It doesn't matter if it's in its own .cabal syntax, in some Haskell-like syntax, JSON, YAML, or even some graphical representation.
Another config format emerged.
I'm not sure what config format is meant here. If it's stack.yaml, it *must* be somehow different (even if we ignore the surface syntax), because it describes a project, not a single package.
More problems (distinct file type etc).
What are the actual problems here?
More formats may follow.
If they are for different purposes, that's OK and is to be expected.
Is there hope to agree on common format (as per thread title), if common content can not be agreed on? Isn't config first of all about content? Is the common format going to contain incompatible / conflicting data items?
.cabal files describe "what a package looks like" and a stack.yaml describes "how to build a project in a reproducible way", which are different (although related) things. What should "common" mean here?
With common content, display format will not matter at all, neither will package tool nor IDE used to work on a project.
Config being a Haskell type, it would be well formed. The options would be well known.
Users and IDE devs will not need to worry about indenting, commas, line breaks and other *goodies*.
Somehow you will always need a concrete representation of abstract notions (call them "models", "ASTs", etc.), otherwise you won't be able to process them. So you will always need to care about some kind of syntax etc., I can't see how using a "Haskell type" will help here. And you will need some semantics for the representation. Even if we used e.g. JSON (or whatever is en vogue at the moment), IDEs will not magically start understanding and supporting Haskell projects. Again: What is the actual problem we're trying to solve? I still haven't seen a concrete use case which is hard/impossible with the current state of affairs. Personally, I would e.g. like to see some abstraction facilities to avoid repetition in .cabal files with lots of executables, but I don't care about the concrete syntax (and Cabal's internal model/AST wouldn't be affected, either).

Well, that's the case for basically everything you give to a program, so I don't see the point here. A .hs file is e.g. just a textual representation of the more abstract notion of a Haskell program/module, too. A .cabal file is just a textual representation of the abstract notion of a Haskell package description.
yes, .hs AST etc must be implemented. However implementing cabal in addition to that is more work.
Distinct from what? from .hs.
[attention] From whom? IDE devs
It needs to be parsed, updated, validated, formatted. This will be the case for whatever is being used, so again: What's the point? It doesn't matter if it's in its own .cabal syntax, in some Haskell-like syntax, JSON, YAML, or even some graphical representation.
if serialized model is used, then parsing, update, validation, formatting are no longer necessary
I'm not sure what config format is meant here. If it's stack.yaml, it *must* be somehow different (even if we ignore the surface syntax), because it describes a project, not a single package.
What standard package format are we trying to agree then?
More problems (distinct file type etc).
What are the actual problems here?
implementing each new file type in IDE is a lot of work. That is, if IDE is trying to do anything with contents of that file. Such as support syncing renamed file to config.
More formats may follow.
If they are for different purposes, that's OK and is to be expected.
Each new format would need to be implemented. Time spent on implementing new formats is time not spent on implementing any other features. It may take nearly as long as implementing .hs support itself. Is this even thought about? If this may be avoided, why not at least consider this as an option?
.cabal files describe "what a package looks like" and a stack.yaml describes "how to build a project in a reproducible way", which are different (although related) things. What should "common" mean here?
Standard package file format (as the thread is called). Isn't it about cabal and yaml? Anyway, can not a common config file be used for both purposes? If not, can common file type / model be used for both purposes - sharing the common parts of the type structure?
Somehow you will always need a concrete representation of abstract notions (call them "models", "ASTs", etc.), otherwise you won't be able to process them. So you will always need to care about some kind of syntax etc., I can't see how using a "Haskell type" will help here. And you will need some semantics for the representation. Even if we used e.g. JSON (or whatever is en vogue at the moment), IDEs will not magically start understanding and supporting Haskell projects.
well if config is expressed in terms of Haskell syntax, implemented .hs support will be enough to support editing these config files. Each file type (including .cabal) takes time to implement.
Again: What is the actual problem we're trying to solve? I still haven't seen a concrete use case which is hard/impossible with the current state of affairs. Personally, I would e.g. like to see some abstraction facilities to avoid repetition in .cabal files with lots of executables, but I don't care about the concrete syntax (and Cabal's internal model/AST wouldn't be affected, either).
Adopting a standard package file format, which could be addressed even better by adopting typed standard config content.
The problems as I see them are:
- users need to learn .cabal (.yaml, ...) syntax in addition to .hs syntax
- IDEs need to implement each such syntax on top of .hs. That is, if support / sync of these configs to code files is expected.
Am I the only one who sees these as issues that need to be / can be solved?
Also maybe let's be more specific: what is this thread - *Standard package file format* - all about?
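A minimal Haskell sketch of the "typed config content" idea may make the proposal easier to discuss. Everything here is an assumption for illustration: the type and field names (PackageConfig, Dependency, serialise, deserialise, renameModule) are hypothetical and do not mirror Cabal's real PackageDescription, and the derived Show/Read instances merely stand in for whatever serialisation would actually be chosen. Note that the text still has to be parsed; the difference is only that the parser comes for free from the type.

```haskell
-- Hypothetical, heavily simplified package description as a plain Haskell type.
data Dependency = Dependency
  { depName    :: String
  , depVersion :: String   -- version constraint kept as opaque text in this sketch
  } deriving (Show, Read, Eq)

data PackageConfig = PackageConfig
  { pkgName        :: String
  , pkgVersion     :: [Int]
  , exposedModules :: [String]
  , dependencies   :: [Dependency]
  } deriving (Show, Read, Eq)

example :: PackageConfig
example = PackageConfig
  { pkgName        = "demo"
  , pkgVersion     = [0, 1, 0]
  , exposedModules = ["Demo.Core"]
  , dependencies   = [Dependency "base" ">=4.8 && <5"]
  }

-- Serialise with the derived Show instance.
serialise :: PackageConfig -> String
serialise = show

-- Parse with the derived Read instance; Nothing on malformed input.
deserialise :: String -> Maybe PackageConfig
deserialise s = case reads s of
  [(cfg, rest)] | all (`elem` " \n\t") rest -> Just cfg
  _                                          -> Nothing

-- A tool that syncs a renamed module edits the value, not the text.
renameModule :: String -> String -> PackageConfig -> PackageConfig
renameModule old new cfg =
  cfg { exposedModules = map rename (exposedModules cfg) }
  where
    rename m = if m == old then new else m
```

Loaded in GHCi, deserialise (serialise example) should give back the original value, which is the round trip a tool would rely on. Layout and comments in the file are lost on such a round trip, which is one of the trade-offs raised elsewhere in this thread.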

2016-09-18 19:38 GMT+02:00 Imants Cekusins
yes, .hs AST etc must be implemented.
If we're talking about a Haskell tool, it *is* already implemented: Just look into the Cabal project on github. If we're not talking about a Haskell tool and something outside the cabal/hackage/stackage ecosystem, writing an AST and a parser for it will be the least of your problems: The main problem will be how to map cabal's/stack's view of what a package/project is to your tool's view. I don't think there's a universally agreed upon notion of what is a package or project, almost every IDE out there has its own view of what those mean, each with its pros and cons (which may be heavily influenced by the package/project programming language, not the language in which the IDE is written).
However implementing cabal in addition to that is more work.
Are we talking about parsing/printing here? If yes, there's already work in that direction (making the frontend, i.e. parser/printer/AST, a separate library), at least that's what I understood so far.
Distinct from what? from .hs.
And that's perfectly fine: A project/package description is something fundamentally different from a Turing-complete general-purpose programming language. A project/package description should be a mostly static, declarative thing, perhaps with a few conditionals and/or (hygienic) macros or such for convenience/brevity, but not something which can calculate Fibonacci numbers or solve differential equations.
[attention] From whom? IDE devs
Hmmm, so parsing some package/project description is a problem when writing an IDE? I highly doubt that this is relevant compared to the amount of work needed for an average IDE.
It needs to be parsed, updated, validated, formatted. This will be the case for whatever is being used, so again: What's the point? It doesn't matter if it's in its own .cabal syntax, in some Haskell-like syntax, JSON, YAML, or even some graphical representation.
if serialized model is used, then parsing, update, validation, formatting are no longer necessary
Huh? What's a "serialized model" then? Whatever you do, you have to parse/validate/... any description. Even if you choose some subset of Haskell (which is probably a bad idea IMHO because it's either too general or not really Haskell anymore), there has to be *some* parser etc. Where should that come from? Neither Emacs nor VIM can e.g. parse/print Haskell out of the box, VS probably can't either.
I'm not sure what config format is meant here. If it's stack.yaml, it
*must* be somehow different (even if we ignore the surface syntax), because it describes a project, not a single package.
What standard package format are we trying to agree then?
stack.yaml is not a "package format", so there is nothing to agree on.
More problems (distinct file type etc).
What are the actual problems here?
implementing each new file type in IDE is a lot of work. That is, if IDE is trying to do anything with contents of that file. Such as support syncing renamed file to config.
More formats may follow.
If they are for different purposes, that's OK and is to be expected.
Each new format would need to be implemented. Time spent on implementing new formats is time not spent on implementing any other features. It may take nearly as long as implementing .hs support itself. Is this even thought about?
stack.yaml is not .cabal in a new syntax, there is new functionality. Even if both were e.g. written in YAML, your shiny hypothetical IDE wouldn't suddenly support reproducible multi-package builds out of the box if it couldn't do so before.
If this may be avoided, why not at least consider this as an option?
.cabal files describe "what a package looks like" and a stack.yaml describes "how to build a project in a reproducible way", which are different (although related) things. What should "common" mean here?
Standard package file format (as the thread is called). Isn't it about cabal and yaml?
If we are really only talking about a *package* format, there is currently only .cabal and a single format is by definition "standard". :-)
well if config is expressed in terms of Haskell syntax, implemented .hs support will be enough to support editing these config files. Each file type (including .cabal) takes time to implement.
Again: Where is that ominous ".hs support" coming from?
the problems as I see them are:
- users need to learn .cabal (.yaml, ...) syntax in addition to .hs syntax
As has already been mentioned by others, I *highly* doubt that the .cabal syntax itself poses the slightest problem for anyone. The semantics are a different story, but you have to learn them anyway.
- IDE need to implement each such syntax on top of .hs. That is, if support / sync of these configs to code files is expected.
You just update your internal view of the package/project and write out the changed part. With library support for .cabal and YAML files that's trivial.
Am I the only one who sees these as issues that need / can be solved?
Also maybe let's be more specific: what is this thread - *Standard package file format* - all about?
That's the central question IMHO. :-) The current discussion seems to drift towards: Do we need the current package/project dichotomy or can we throw everything together? (Note that e.g. Visual Studio distinguishes projects and solutions, too, perhaps there's a reason for that?)
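To make the package/project distinction concrete, here is a small Haskell sketch under stated assumptions. The types Package and Project and the function unpinned are hypothetical and far simpler than what Cabal or stack actually use; the only point is that a project refers to packages and adds information (pinned versions) that no single package description carries.

```haskell
import Data.List (nub)

-- What one package looks like (illustrative fields only).
data Package = Package
  { packageName  :: String
  , buildDepends :: [String]   -- dependency names; version ranges omitted here
  } deriving (Show, Eq)

-- How a set of packages is built reproducibly (illustrative fields only).
data Project = Project
  { snapshotPackages :: [(String, String)]  -- (name, version) pairs pinned by a package collection
  , localPackages    :: [Package]           -- the packages being developed
  , extraPins        :: [(String, String)]  -- pins for packages outside the collection
  } deriving (Show, Eq)

-- Dependencies of the local packages that the project pins nowhere.
-- A reproducible build would want this list to be empty.
unpinned :: Project -> [String]
unpinned prj = nub
  [ d
  | pkg <- localPackages prj
  , d   <- buildDepends pkg
  , d `notElem` map packageName (localPackages prj)
  , d `notElem` map fst (snapshotPackages prj ++ extraPins prj)
  ]
```

In this sketch the shared part between the two descriptions is just Package, which suggests that "common" could mean a common model for the overlapping part rather than a single file.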

Am 18.09.2016 um 14:03 schrieb Tom Murphy:
Even just looking at the set of features which is 1:1 betw. YAML and JSON, we're essentially just talking about key-value pairs with a couple of common types for the values.
This is just as correct as saying that Haskell is about functions - i.e. superficially correct but mostly beside the point. For JSON, it's string-to-whatever maps, arrays, and primitive types. For YAML, it's string-to-whatever maps, arrays, primitive types, references (so you can have shared and circular data structures), and arbitrary types (it will use constructors to deserialize).
This isn't all .cabal files contain (e.g. see hvr's points about conditionals), but if it were true, is it really worth changing how Cabal works for a different color bikeshed?
It's bikeshedding if and only if interoperability is irrelevant. However, in today's world, rejecting interoperability is insanity. So: no bikeshedding, there are real issues. It's still quite possible that it's simply not worth it; the cons associated with changing the buildfile format are pretty weighty after all, and if the Cabal people say they can fix the known problems with that format, it's probably a better idea to see what comes of that before pursuing alternate formats.

On Sat, Sep 17, 2016 at 2:41 AM, Joachim Durchholz
Changing the grammar always produces that kind of awkwardnesses. However, for a fair comparison, you need to actively look for things that work better with the alternate grammar before you conclude it's worse.
The burden is on you to prove that the massive upheaval of a switch is justified, not on others to prove that your preference won't work.

Am 17.09.2016 um 08:57 schrieb Brandon Allbery:
On Sat, Sep 17, 2016 at 2:41 AM, Joachim Durchholz
wrote: Changing the grammar always produces that kind of awkwardnesses. However, for a fair comparison, you need to actively look for things that work better with the alternate grammar before you conclude it's worse.
The burden is on you to prove that the massive upheaval of a switch is justified, not on others to prove that your preference won't work.
I do like YAML, but I know far too little about the various use cases to justify any preference; it's quite possible that it's not a good fit, but I can't really decide it. All I can do is provide knowledge about YAML, which in some cases was really necessary, and point out one-sided arguments such as Herbert's; doing a review of Cabal config use cases and seeing how well they map to YAML is, sadly, beyond my capabilities. Contributing the best I can and all that.

Hello, On 2016-09-17 at 08:41:37 +0200, Joachim Durchholz wrote:
Am 17.09.2016 um 00:13 schrieb Herbert Valerio Riedel:
the prospect that a standard format like YAML would allow to reuse standard tooling/libraries for YAML seems quite weak to me;
It's not about standard tooling, it's about tools written by third parties. Tools that you didn't have the time or interest to write yourself, but which still help make your ecosystem more useful to others.
Sure, but we don't need to throw out the baby with the bathwater to accomplish that! Oleg is currently working on a new parser for cabal.config, cabal.project & ${pkg}.cabal grammar (NB: cabal already uses one standard unified syntax for all its configuration/description files) which lends itself better to providing an equivalent of ghc-exactprint (i.e. perfect roundtripping, allowing for faithful refactoring tooling). 3rd parties can then use this new parser as a library. [..]
I really don't understand the appeal of applying the golden hammer of YAML, if `.cabal`'s grammar is already self-evident and concise with its syntax:
    if flag(fast)
      ghc-options: -O2
    else
      ghc-options: -O0
where this if/then/else construct is encoded in the grammar proper rather than being merely a semantic interpretation after decoding a general grammar designed for simpler typed data-representations which isn't even accurate enough (since it has additional symmetries/freedoms) to capture the desired grammar faithfully, which makes YAML quite error-prone for this specific application.
Yeah it isn't nice. Changing the grammar always produces that kind of awkwardnesses. However, for a fair comparison, you need to actively look for things that work better with the alternate grammar before you conclude it's worse.
Well, that burden of proof lies with those who argue YAML to be superior to .cabal syntax, doesn't it? The if/then/else awkwardness is just one aspect I pointed out explicitly. I hinted at other issues which result from first parsing into an inappropriate data-model just for the sake of using YAML, and then having to re-parse that interim lossy data-model for real into the actual data-model we're interested in (and hoping we didn't lose some of the essential information). But I see no need to invest time to spell those problems out until I see a compelling argument that e.g. YAML syntax is really preferable (to justify the costs incurred) to the status quo in the first place. -- hvr
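The two-stage parsing concern described above can be sketched in Haskell. This is an illustration under assumptions, not hpack's or Cabal's actual code: Value stands in for the generic tree a YAML or JSON decoder returns, the "if"/"then"/"else" key encoding is hypothetical, and Cond, Field, toCond and toFields are made-up names. The generic tree can hold shapes that are well-formed data but not valid conditionals, so a second pass has to reject them.

```haskell
-- Generic tree, roughly what a YAML or JSON decoder hands back.
data Value
  = Str String
  | List [Value]
  | Obj [(String, Value)]   -- an association list, so even duplicate keys are representable
  deriving (Show, Eq)

-- Dedicated AST: a conditional always has a condition and two field lists.
data Field = Field String String
  deriving (Show, Eq)

data Cond = Cond
  { condition  :: String
  , thenFields :: [Field]
  , elseFields :: [Field]   -- empty when there is no else branch
  } deriving (Show, Eq)

-- Second pass: re-interpret the generic tree as a conditional.
toCond :: Value -> Either String Cond
toCond (Obj kvs) = do
  cond <- case lookup "if" kvs of
    Just (Str c) -> Right c
    Just _       -> Left "'if' must be a string"
    Nothing      -> Left "missing 'if'"
  th <- maybe (Left "missing 'then'") toFields (lookup "then" kvs)
  el <- maybe (Right []) toFields (lookup "else" kvs)
  case filter (`notElem` ["if", "then", "else"]) (map fst kvs) of
    []    -> Right (Cond cond th el)
    extra -> Left ("unexpected keys: " ++ unwords extra)
toCond _ = Left "a conditional must be a mapping"

toFields :: Value -> Either String [Field]
toFields (Obj kvs) = mapM field kvs
  where
    field (k, Str v) = Right (Field k v)
    field (k, _)     = Left ("field '" ++ k ++ "' must be a string")
toFields _ = Left "a branch must be a mapping"

-- A tree that maps cleanly onto the dedicated AST.
wellFormed :: Value
wellFormed = Obj
  [ ("if",   Str "flag(fast)")
  , ("then", Obj [("ghc-options", Str "-O2")])
  , ("else", Obj [("ghc-options", Str "-O0")])
  ]

-- Perfectly valid generic data, yet not a valid conditional.
missingThen :: Value
missingThen = Obj
  [ ("if",   Str "flag(fast)")
  , ("else", Obj [("ghc-options", Str "-O0")])
  ]
```

With a grammar that has the conditional built in, the missingThen case would be a syntax error at parse time; here it only surfaces in the second pass, and every tool reading the file has to repeat that pass.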

On 2016-09-17 09:25, Herbert Valerio Riedel wrote:
Hello,
On 2016-09-17 at 08:41:37 +0200, Joachim Durchholz wrote:
Am 17.09.2016 um 00:13 schrieb Herbert Valerio Riedel:
the prospect that a standard format like YAML would allow to reuse standard tooling/libraries for YAML seems quite weak to me;
It's not about standard tooling, it's about tools written by third parties. Tools that you didn't have the time or interest to write yourself, but which still help make your ecosystem more useful to others.
Sure, but we don't need to throw out the baby with the bathwater to accomplish that!
Oleg is currently working on a new parser for cabal.config, cabal.project & ${pkg}.cabal grammar (NB: cabal already uses one standard unified syntax for all its configuration/description files) which lends itself better to providing an equivalent of ghc-exactprint (i.e. perfect roundtripping, allowing for faithful refactoring tooling). 3rd parties can then use this new parser as a library.
I didn't see anything in the PR about exporting that parser as a library. Do you have a reference for that? Regardless: It will only help third party code written in Haskell. Much as I like most userland software to be written in Haskell it won't help e.g. IntelliJ IDEA one whit. Regards,

Am 17.09.2016 um 09:51 schrieb Bardur Arantsson:
Regardless: It will only help third party code written in Haskell. Much as I like most userland software to be written in Haskell it won't help e.g. IntelliJ IDEA one whit.
Unless Haskell runs on the JVM. Do you know whether Frege (https://github.com/Frege) is a viable option for that? At least at the surface, it qualifies, but I don't know whether the details (performance, Java library interoperability, stability, availability of Haskell language extensions) work out well enough for that.

On 2016-09-17 10:50, Joachim Durchholz wrote:
Am 17.09.2016 um 09:51 schrieb Bardur Arantsson:
Regardless: It will only help third party code written in Haskell. Much as I like most userland software to be written in Haskell it won't help e.g. IntelliJ IDEA one whit.
Unless Haskell runs on the JVM.
I think people have been wishing for that for a while... some people even worked on it, but so far nothing's come of it AFAIK.
Do you know whether Frege (https://github.com/Frege) is a viable option for that?
Not in the least last time I checked. It's missing far too many of the extensions that almost everybody uses as a matter of course. Maybe given a few more years, but I'm not holding my breath. Regards,

Am 17.09.2016 um 11:35 schrieb Bardur Arantsson:
On 2016-09-17 10:50, Joachim Durchholz wrote:
Do you know whether Frege (https://github.com/Frege) is a viable option for that?
Not in the least last time I checked. It's missing far too many of the extensions that almost everybody uses as a matter of course.
Pity. Any idea how hard it would be to make it compile ghc?

Hi Joachim,
Besides Frege, Haskell does indeed run on the JVM now via GHCVM [1] - it
was my HSoC project. I'll be doing a release in a couple days once I get a
couple issues sorted out and the installation is streamlined. I'm currently
working with Cary Robbins on getting the HaskForce Intellij Plugin working
for GHCVM. If all goes well, GHCVM 0.0.1 will ship with ghcvm, ghcvm-pkg,
cabalvm (a fork of cabal-install 1.22.9.0/Cabal 1.22.8.0 that supports
GHCVM), a working Intellij plugin, and will support all of GHC 7.10.3
extensions other than Template Haskell + interoperation with Java
libraries. You can join us on Gitter for live updates [2].
[1] http://github.org/rahulmut/ghcvm
[2] http://gitter.im/rahulmutt/ghcvm
Thanks,
Rahul
On Sat, Sep 17, 2016 at 2:20 PM, Joachim Durchholz
Am 17.09.2016 um 09:51 schrieb Bardur Arantsson:
Regardless: It will only help third party code written in Haskell. Much as I like most userland software to be written in Haskell it won't help e.g. IntelliJ IDEA one whit.
Unless Haskell runs on the JVM. Do you know whether Frege (https://github.com/Frege) is a viable option for that? At least at the surface, it qualifies, but I don't know whether the details (performance, Java library interoperability, stability, availability of Haskell language extensions) work out well enough for that.
-- Rahul Muttineni

On Sat, Sep 17, 2016 at 09:51:25AM +0200, Bardur Arantsson wrote:
Regardless: It will only help third party code written in Haskell. Much as I like most userland software to be written in Haskell it won't help e.g. IntelliJ IDEA one whit.
If you're talking about more IDEs supporting Haskell, then having a more standard package format really won't help that much. To get good and stable support, there's a need for tools that can be called by IDEs. When building a Haskell project, IDEs won't read the cabal file and call ghc; they just call cabal. The same is the case for e.g. auto completion or any other IDE operation that needs to consider the whole project, the configuration and all of its dependencies. Reimplementing cabal's logic in every IDE doesn't make that much sense, and in the end it won't work that well and it will easily break.

YAML and TOML are not, strictly speaking, package file formats. They are *meta-formats*. There is, by design, nothing about them that ties them in any way to any kind of package system. That means that other, even more popular, meta-formats should be considered. In particular, while XML and JSON are not by any means *wonderful*, they are far better known than TOML or even YAML.
On 16/09/16 6:20 PM, Harendra Kumar wrote:
From a developer's perspective, the major benefit of a standard and widely adopted format and is that people can utilize their knowledge acquired from elsewhere, they do not have to go through and learn differently looking and incomplete documentation of different tools. The benefit of a common config specification is that developers can choose tools freely without worrying about learning the same concepts presented in different ways.
If we are talking about *meta-formats*, this is only half true. No amount of knowledge about YAML per se will tell you how to use YAML to describe Haskell packages. Nor will it let you choose tools freely if what you want is tools that understand your *package file format* specifically. (For example, editors that can drop in handy templates, or validate a description.)
* YAML (http://yaml.org/spec/1.2/spec.html) is standard and popular. A significant chunk of developer community is already familiar with it. It is being used by stack and by hpack as an alternative to cabal format. The complaint against it is that the specification/implementation is overly complex.
It's not clear what "standard" means in this context. yaml.org *calls* it "standard", but as the joke puts it, "CALLING a tail a leg doesn't MAKE it a leg." XML is a standard: it's managed by a well-known body. JSON is both an ECMA standard and an Internet RFC.
There are other complaints:
- that there is no *other* reason for most Haskell programmers to be aware of YAML,
- that stack and hpack do not use "YAML" but an underspecified subset of YAML,
- that due to YAML's complexity different implementations tend to implement different subsets, meaning less interoperability than you'd expect,
- that the Ruby documentation for its YAML module http://ruby-doc.org/stdlib-1.9.3/libdoc/yaml/rdoc/YAML.html says "Do not use YAML to load untrusted data. Doing so is unsafe and could allow malicious input to execute arbitrary code inside your application." I must admit I'm surprised.
- ...
Could I respectfully suggest that the first step in a project like this is to describe the *semantics* of your package management information in a language-neutral way? I know a great language for describing abstract data types and giving them semantics. It's named for some logician, I think his surname was Curry. (:-)
Seriously, there seems to be an endemic problem with programmers racing to syntax without thinking over-much about semantics. It happened with XML. It happened again with RDF. Eventually the semantics gets patched up, after pointless pain and suffering. Having nutted out exactly what the issues are with the semantics, then you can experiment with syntax.

I haven't totally followed this whole thread, so apologies if this isn't entirely relevant, but I use shake for building, and cabal for dependencies. The shakefile has the list of packages and required versions, and generates the .cabal file, which is used with --only-dependencies to get dependencies. I think it works well. I can't do builds in cabal anyway since it can't handle anything complicated, but even if I had a simple build I'd prefer shake since it's so much nicer. Since it's in haskell, it's flexible but can't be analyzed, though I can't think of why you'd want to analyze it. Meanwhile, cabal is just fine at expressing packages and versions, and is basically just a way to tell cabal-install what to download and install. Since I generate it, I don't care much about the format, but the existing one seems perfectly adequate.
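The workflow described above (one Haskell source of truth that generates the .cabal file) can be sketched without any build-tool dependency. This is not the poster's shakefile: Dep, deps, renderDep and renderCabal are hypothetical names, the package list is made up, and a real setup would call something like this from a Shake rule and write the result to disk.

```haskell
import Data.List (intercalate)

-- Hypothetical single source of truth for dependencies.
data Dep = Dep
  { depPkg   :: String
  , depRange :: String   -- version range as text; empty means any version
  }

deps :: [Dep]
deps =
  [ Dep "base" ">=4.8 && <5"
  , Dep "containers" ">=0.5"
  , Dep "shake" ""
  ]

-- Render one build-depends entry.
renderDep :: Dep -> String
renderDep (Dep pkg range)
  | null range = pkg
  | otherwise  = pkg ++ " " ++ range

-- Render a minimal library-only .cabal file as text.
renderCabal :: String -> String -> [Dep] -> String
renderCabal name version ds = unlines
  [ "name:          " ++ name
  , "version:       " ++ version
  , "build-type:    Simple"
  , "cabal-version: >=1.10"
  , ""
  , "library"
  , "  default-language: Haskell2010"
  , "  build-depends:"
  , "      " ++ intercalate "\n    , " (map renderDep ds)
  ]
```

Writing the output with writeFile "demo.cabal" (renderCabal "demo" "0.1" deps) and then installing with --only-dependencies, as mentioned above, would keep the generated file as a pure output that nobody edits by hand.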
participants (25)
- Alan & Kim Zimmerman
- Andrew Butterfield
- Bardur Arantsson
- Brandon Allbery
- Chris Smith
- Christopher Allen
- Daniel Trstenjak
- David McBride
- Evan Laforge
- Harendra Kumar
- Herbert Valerio Riedel
- Herbert Valerio Riedel
- Imants Cekusins
- Joachim Durchholz
- Kosyrev Serge
- Mario Blažević
- MarLinn
- MigMit
- ok@cs.otago.ac.nz
- Rahul Muttineni
- Richard A. O'Keefe
- Sven Panne
- Tobias Dammers
- Tom Murphy
- yogsototh