
On 16/09/16 6:37 PM, Tobias Dammers wrote:
> Another factor in favor of YAML is that it is a superset of JSON,

Here is a simple string in JSON:

    "Where's the Golden Fleece?"

Here is the same string in YAML:

    --- Where's the Golden Fleece?
    ...

Superset? I understand "language X is a superset of language Y" to mean that if I have a document in language Y it can be correctly processed by a language X processor. If you mean that any data value that can be represented in JSON can be represented (differently!) in YAML, fine, but that's not the same thing.

There are many textual formats that generalise JSON. Heck, even GNUSTEP Property List format does *that*. (And no, I do not recommend adopting that for anything.) For that matter, any JSON document can be transcoded with no loss of structural information into XML and vice versa. That doesn't mean that JSON is a superset of XML!

Familiarity with JSON semantics and syntax did not help me AT ALL when faced with YAML.

Here's another meta-format worthy of consideration. A *package* is a collection of resources with relationships between them and relationships linking them to other things like authors (think Dublin Core). Is there a standard (genuinely standard) notation specifically for describing resources and their relationships, with quite a few tools for not just reading it and writing it but actually reasoning with it?

Why yes. It's called RDF.

    http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/

The design of RDF is intended to meet the following goals:

  * having a simple data model
  * having formal semantics and provable inference
  * using an extensible URI-based vocabulary
  * using an XML-based syntax
  * supporting use of XML schema datatypes
  * allowing anyone to make statements about any resource

There is a human-friendly syntax interconvertible with the XML one, Turtle.

    http://www.w3.org/TR/turtle/

Now RDF (whether XML or Turtle) is *not* designed for presenting single data values. But that's not really what a package format wants to do anyway.
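
To make "a simple data model" a little more concrete, here is a rough sketch of the idea behind it: every statement is a (subject, predicate, object) triple. This is plain Python, not an RDF library, and everything in it is an assumption for illustration. The package names and the `ex:` terms are made up; `dc:creator` merely borrows the spelling of the Dublin Core term. Real RDF would use full URIs and a proper parser.

```python
# Sketch only: RDF-style statements modelled as (subject, predicate, object)
# tuples. All identifiers below are hypothetical examples, not a real vocabulary.
package_facts = {
    ("ex:argonaut", "dc:creator", "ex:jason"),
    ("ex:argonaut", "ex:dependsOn", "ex:oars"),
    ("ex:argonaut", "ex:dependsOn", "ex:sails"),
    ("ex:oars", "dc:creator", "ex:argus"),
}


def objects(facts, subject, predicate):
    """Return, sorted, every o such that (subject, predicate, o) is asserted."""
    return sorted(o for s, p, o in facts if s == subject and p == predicate)


# What does the (made-up) package ex:argonaut depend on?
print(objects(package_facts, "ex:argonaut", "ex:dependsOn"))
```

Note that nothing here is a single nested data value; it is a flat set of statements about resources, which is the shift in viewpoint being argued for.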

Am I seriously recommending RDF (or possibly OWL-DL) as a good way to describe packages? I am certainly serious that it should be CONSIDERED. And I'm particularly serious about that for two reasons.

(1) JSON, XML, TOML, and YAML are all about serialising *data values*. That's all they do. Anything beyond that is up to you. RDF and OWL are all about describing *relationships* between *resources*.

    It's worth considering carefully what you want to say in a package file format. If you want to describe *relationships*, then something that deals with data values may not be the right *kind* of "language".

    Simply jarring people loose from the idea that a "single possibly structured data value" language is the ONLY kind of language is of value in itself.

(2) JSON, XML, TOML, and YAML are all about serialising *data values*. *Single* possibly structured data values. That's all they do. There is no sense in which there is any standard way to *combine* data in these forms. In contrast, RDF was *invented* to have a way of patching together multiple sets of facts from multiple sources.

    Given a collection of package descriptions in YAML, all you have is a bunch of text files; what you do with them is *entirely* up to you. Given a bunch of RDF/XML or RDF/Turtle files, there is a *standard* way to write a query (SPARQL) which integrates them. It becomes possible to write consistency-checking queries that can be processed by multiple tools. It becomes possible to ask "if I need these, what else do I need?" in a standard way.

    Again, the idea here is to get people thinking that having a documented semantics that can be processed by existing description logic tools has value, so that something at a higher semantic level than YAML or XML might be worth thinking about.
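
As a rough illustration of point (2), the sketch below merges facts from two independently written sources and then answers "if I need these, what else do I need?". It is plain Python standing in for what an RDF store and a SPARQL query would do; the package names and the `ex:dependsOn` predicate are hypothetical, and none of this is any particular tool's API.

```python
# Sketch only: two sets of (subject, predicate, object) facts written by
# different people. All identifiers are made up for illustration.
source_a = {
    ("ex:argonaut", "ex:dependsOn", "ex:oars"),
    ("ex:argonaut", "ex:dependsOn", "ex:sails"),
}
source_b = {
    ("ex:oars", "ex:dependsOn", "ex:timber"),
    ("ex:sails", "ex:dependsOn", "ex:linen"),
    ("ex:timber", "ex:dependsOn", "ex:axe"),
}


def merge(*sources):
    """Combining sets of statements is just set union; no schema surgery."""
    return set().union(*sources)


def needed_by(facts, wanted, predicate="ex:dependsOn"):
    """Everything reachable from `wanted` by following `predicate` edges."""
    needed = set()
    frontier = list(wanted)
    while frontier:
        current = frontier.pop()
        for s, p, o in facts:
            # The `not in needed` check also keeps cyclic data from looping.
            if s == current and p == predicate and o not in needed:
                needed.add(o)
                frontier.append(o)
    return needed


all_facts = merge(source_a, source_b)
print(sorted(needed_by(all_facts, ["ex:argonaut"])))
```

With only `source_a` loaded the answer stops at the direct dependencies; adding `source_b` extends it without either file having to know about the other. In SPARQL 1.1 the same question could be put as a property path over the merged graph (something like `ex:dependsOn+`), which is what makes it answerable by more than one tool.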