
2012/1/8 Greg Weber
2012/1/8 Gábor Lehel
Later on you write that the names of record fields are only accessible from the record's namespace and via record syntax, but not from the global scope. For Haskell I think it would make sense to reverse this decision. On the one hand, it would keep backwards compatibility; on the other hand, Haskell code is already written to avoid name clashes between record fields, so it wouldn't introduce new problems. Large gain, little pain. You could use the global-namespace function as you do now, at the risk of ambiguity, or you could use the new record syntax and avoid it. (If you were to also allow x.n syntax for arbitrary functions, this could lead to ambiguity again... you could solve it by preferring a record field belonging to the inferred type over a function if both are available, but (at least in my current state of ignorance) I would prefer to just not allow x.n for anything other than record fields.)
Perhaps you can give some example code for what you have in mind - we do need to figure out the preferred technique for interacting with old-style records. Keep in mind that for new records the entire point is that they must be name-spaced. A module could certainly export top-level functions equivalent to how records work now (we could have a helper that generates those functions).
Let's say you have a record.
data Record = Record { field :: String }
In existing Haskell, you refer to the accessor function as 'field' and to the contents of the field as 'field r', where 'r' is a value of type Record. With your proposal, you refer to the accessor function as 'Record.field' and to the contents of the field as either 'Record.field r' or 'r.field'. The point is that I see no conflict or drawback in allowing all of these at the same time. Writing 'field' or 'field r' would work exactly as it already does, and be ambiguous if there is more than one record field with the same name in scope. In practice, existing code is already written to avoid this ambiguity so it would continue to work. Or you could write 'Record.field r' or 'r.field', which would work as the proposal describes and remove the ambiguity, and work even in the presence of multiple record fields with the same name in scope.
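To make the existing behaviour concrete, here is a minimal sketch of the two non-namespaced uses described above. The value `r` and the strings are made up for illustration; the 'Record.field' and 'r.field' forms are the proposed syntax and are deliberately not shown, since they are not valid in current Haskell.

```haskell
data Record = Record { field :: String }

-- Hypothetical example value
r :: Record
r = Record { field = "hello" }

main :: IO ()
main = do
  -- 'field r' is the contents of the field
  putStrLn (field r)
  -- 'field' on its own is an ordinary function of type Record -> String
  print (map field [r, Record "world"])
```

Both uses rely on 'field' being a plain top-level name, which is why a second record with a field of the same name in scope makes them ambiguous.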
The point is that I see what you gain by allowing record fields to be referred to in a namespaced way, but I don't see what you gain by not allowing them to be referred to in a non-namespaced way. In theory you wouldn't care because the non-namespaced way is inferior anyways, but in practice because all existing Haskell code does it that way, it's significant.
My motivation for this entire change is simply to be able to use two records with field members of the same name. This requires *not* generating top-level functions to access record fields. I don't know if there is a valid use case for the old top-level functions once switched over to the new record system (other than your stated personal preference). We could certainly have a pragma or something similar that generates top-level functions even if the new record system is in use.
Oh, in a sense you're right. If the top-level accessor functions are treated as if they were defined by the module containing the record, and there is more than one with the same name, the compiler would see it as multiple definitions and indeed report an error. On the other hand if they are treated as imported names (conceptually, implicitly imported from the namespace of the record, say), then the compiler would only report an error when you actually try to use the ambiguous name. I had been assuming the latter case without realizing it. It corresponds to what you have now if you have multiple records imported with overlapping field names. Again, exporting the field accessors to global scope and deferring any errors from ambiguity or overlap to the point of their use would not in any way interfere with the use of those same field accessors with the namespaced syntax. If you only use the namespaced syntax, it would work exactly as in your proposal: the top-level accessors are never used so no ambiguity errors are reported. If you only use the top-level syntax, then it works almost exactly as Haskell currently does (except you can define multiple records with overlapping field names in the same module as long as you don't use them, which I had not considered). The set of well-formed programs if you allow top-level access would be almost a superset of the set of well-formed programs if you don't. (The exception is that top-level field accessors would conflict with non-accessor plain old functions of the same name, whereas if they weren't visible outside of the record's namespace they wouldn't, but I don't feel like that's a huge concern.)
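As a rough illustration of "overlapping field names are fine as long as the bare accessor is never used", here is a sketch that assumes GHC's DuplicateRecordFields extension. That extension is not part of this proposal and is only used here as a stand-in; the types and the helper names `personName` and `companyName` are hypothetical.

```haskell
{-# LANGUAGE DuplicateRecordFields #-}

-- Two records in one module sharing the field name 'name'
-- (accepted only because of the extension assumed above).
data Person  = Person  { name :: String }
data Company = Company { name :: String }

-- The constructor in each pattern says which 'name' is meant,
-- so nothing here is ambiguous.
personName :: Person -> String
personName Person { name = n } = n

companyName :: Company -> String
companyName Company { name = n } = n

main :: IO ()
main = do
  putStrLn (personName (Person { name = "Ada" }))
  putStrLn (companyName (Company { name = "Acme" }))
```

The definitions themselves compile; as far as I know it is only a bare use such as 'name p' that the compiler would reject as ambiguous, which is the deferred-error behaviour described above.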
All of that said, maybe having TDNR with bad syntax is preferable to not having TDNR at all. Can't it be extended to the existing syntax (of function application)? Or at least some better one, which is ideally right-to-left? I don't really know the technical details...
Generalized data-namespaces: I think I'm also opposed. This would import the problem from OO languages where functions written by the module (class) author get to have a distinguished syntax (be inside the namespace) over functions by anyone else (which don't).
Maybe you can show some example code? To me this is about controlling exports of namespaces, which is already possible - I think this is mostly a matter of convenience.
If I'm understanding correctly, you're suggesting we be able to write:
data Data = Data Int where
    twice (Data d) = 2 * d
    thrice (Data d) = 3 * d
    ...
and that if we write 'let x = Data 7 in x.thrice' it would evaluate to 21. I have two objections.
The first is the same as with the TDNR proposal: you would have both code that looks like 'data.firstFunction.secondFunction.thirdFunction', as well as the existing 'thirdFunction $ secondFunction $ firstFunction data' and 'thirdFunction . secondFunction . firstFunction $ data', and if you have both of them in the same expression (which you will) it becomes unpleasant to read because you have to read them in opposite directions.
This would not be possible because the functions can only be accessed from the namespace - you could only use the dot (or T.firstFunction). It is possible as per your complaint below:
Sorry, I was unclear here. The firstFunction, secondFunction, and thirdFunction in my examples are *not* referring to the very same firstFunction, secondFunction, and thirdFunction; they are all placeholders for arbitrary functions. My problem is that you could (and would have to, because the syntaxes aren't interchangeable) write things like this:
    foo . bar . (baz.quux.asdf) . wasd $ hjkl
Now what's the right order for reading the functions in this expression? The correct answer is:
    hjkl wasd baz quux asdf bar foo
or using numbers to denote their place:
    7 6 3 4 5 2 1
If you had written the equivalent using existing Haskell syntax it would be:
    foo . bar . (asdf $ quux baz) . wasd $ hjkl
and the right order for reading it is:
    hjkl wasd baz quux asdf bar foo
or with numbers:
    7 6 5 4 3 2 1
If you introduce heavy use of the a.b.c.d syntax you would frequently have to jump around and switch directions while you read an expression. If you restrict it to only field accessors I think it would be limited and tolerable; my quarrel is with allowing arbitrary functions (whether by TDNR or data-namespacing), in which case you would likely as not end up with half of the functions going one way and the other half going the other.
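The reading order can be checked mechanically with a small sketch. The assumptions here are mine: every placeholder just appends its own name to a list, and `(&)` from Data.Function (reverse application) stands in for the proposed dot, so `baz & quux & asdf` means `asdf (quux baz)`. The `Log` type and the `step` helper are hypothetical.

```haskell
import Data.Function ((&))

-- Each placeholder appends its name to a log, so the final list shows the
-- order in which the data flows through them.
type Log = [String]

step :: String -> Log -> Log
step name acc = acc ++ [name]

hjkl, baz :: Log
hjkl = ["hjkl"]
baz  = ["baz"]

foo, bar, wasd, quux :: Log -> Log
foo  = step "foo"
bar  = step "bar"
wasd = step "wasd"
quux = step "quux"

-- asdf takes the log of the inner chain and returns a function, so the
-- parenthesised sub-expression can sit inside a composition.
asdf :: Log -> Log -> Log
asdf inner acc = step "asdf" (acc ++ inner)

-- Mixed directions: (&) plays the role of the proposed dot.
mixed :: Log
mixed = foo . bar . (baz & quux & asdf) . wasd $ hjkl

-- The same expression in existing syntax.
existing :: Log
existing = foo . bar . (asdf $ quux baz) . wasd $ hjkl

main :: IO ()
main = do
  print mixed                -- ["hjkl","wasd","baz","quux","asdf","bar","foo"]
  print (mixed == existing)  -- True
```

Both expressions produce the same log; the only difference is how often the reader has to change direction to follow it in the source.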
The second is that only the author of the datatype could put functions into its namespace; the 'data.foo' notation would only be available for functions written by the datatype's author, while for every other function you would have to use 'foo data'. I dislike this special treatment in OO languages and I dislike it here.
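For comparison, a minimal sketch of the same datatype in existing Haskell, where the author's functions and anyone else's are called in exactly the same way. The function `quadruple` is a hypothetical addition standing in for code written outside the datatype's module.

```haskell
data Data = Data Int

-- Functions written by the datatype's author
twice, thrice :: Data -> Int
twice  (Data d) = 2 * d
thrice (Data d) = 3 * d

-- Hypothetical function written by somebody else; under data-namespacing
-- this one could not be called as 'x.quadruple'.
quadruple :: Data -> Int
quadruple (Data d) = 4 * d

main :: IO ()
main = do
  let x = Data 7
  print (thrice x)     -- 21
  print (quadruple x)  -- 28
```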
-- Work is punishment for failing to procrastinate effectively.