Re: Records in Haskell

8 Jan 2012


      2012/1/8 Gábor Lehel 
...
...
...
Later on you write that the names of record fields are only accessible
from the record's namespace and via record syntax, but not from the
global scope. For Haskell I think it would make sense to reverse this
decision. On the one hand, it would keep backwards compatibility; on
the other hand, Haskell code is already written to avoid name clashes
between record fields, so it wouldn't introduce new problems. Large
gain, little pain. You could use the global-namespace function as you
do now, at the risk of ambiguity, or you could use the new record
syntax and avoid it. (If you were to also allow x.n syntax for
arbitrary functions, this could lead to ambiguity again... you could
solve it by preferring a record field belonging to the inferred type
over a function if both are available, but (at least in my current
state of ignorance) I would prefer to just not allow x.n for anything
other than record fields.)
Perhaps you can give some example code for what you have in mind - we do
need to figure out the preferred technique for interacting with old-style
records. Keep in mind that for new records the entire point is that they
must be name-spaced. A module could certainly export top-level functions
equivalent to how records work now (we could have a helper that generates
those functions).
Let's say you have a record.
data Record = Record { field :: String }
In existing Haskell, you refer to the accessor function as 'field' and
to the contents of the field as 'field r', where 'r' is a value of
type Record. With your proposal, you refer to the accessor function as
'Record.field' and to the contents of the field as either
'Record.field r' or 'r.field'. The point is that I see no conflict or
drawback in allowing all of these at the same time. Writing 'field' or
'field r' would work exactly as it already does, and be ambiguous if
there is more than one record field with the same name in scope. In
practice, existing code is already written to avoid this ambiguity so
it would continue to work. Or you could write 'Record.field r' or
'r.field', which would work as the proposal describes and remove the
ambiguity, and work even in the presence of multiple record fields
with the same name in scope.
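To make the status quo concrete, here is a minimal sketch of how field accessors behave in Haskell today. The 'Person' and 'Company' types and their prefixed field names are hypothetical, chosen only to illustrate the usual hand-written convention for avoiding clashes. The 'Record.field r' and 'r.field' forms from the proposal are not valid Haskell 2010, so they appear only in comments.

```haskell
-- The record from the message above. Declaring it also generates a
-- top-level accessor function, field :: Record -> String.
data Record = Record { field :: String }

-- Hypothetical records. Both would naturally have a field called "name",
-- but two same-named fields in one module are rejected, so existing code
-- usually prefixes them by hand.
data Person  = Person  { personName  :: String }
data Company = Company { companyName :: String }

-- 'field' used as an ordinary function value.
fieldLengths :: [Record] -> [Int]
fieldLengths = map (length . field)

-- 'field r' used to read the contents of the field.
-- Under the proposal this could also be spelled Record.field r or r.field.
greeting :: Record -> String
greeting r = "hello, " ++ field r

-- The prefixes do by hand what a per-record namespace would do.
describe :: Person -> Company -> String
describe p c = personName p ++ " works at " ++ companyName c

main :: IO ()
main = do
  let r = Record { field = "some text" }
  putStrLn (greeting r)
  print (fieldLengths [r, Record "abc"])
  putStrLn (describe (Person "Ada") (Company "Initech"))
```

Nothing here depends on the proposal, which is the point being argued: code written this way would keep compiling if the global accessor stayed available alongside the namespaced forms.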
The point is that I see what you gain by allowing record fields to be
referred to in a namespaced way, but I don't see what you gain by not
allowing them to be referred to in a non-namespaced way. In theory you
wouldn't care because the non-namespaced way is inferior anyways, but
in practice because all existing Haskell code does it that way, it's
significant.
My motivation for this entire change is simply to be able to use two records
with fields of the same name. This requires *not* generating
top-level functions to access record fields. I don't know if there is a
valid use case for the old top-level functions once switched over to the
new record system (other than your stated personal preference). We could
certainly have a pragma or something similar that generates top-level
functions even if the new record system is in use.
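For comparison, one workaround that needs no language change is a type class that gives both records a shared accessor name. This is only a sketch under assumptions: the 'HasName' class and the 'Person' and 'Company' types are hypothetical and are not part of the proposal under discussion.

```haskell
-- Hypothetical records. The fields are prefixed because Haskell 2010
-- generates one top-level function per field and the names would clash.
data Person  = Person  { personName  :: String, personAge :: Int }
data Company = Company { companyName :: String }

-- Hypothetical class giving both records one shared accessor name.
class HasName a where
  name :: a -> String

instance HasName Person where
  name = personName

instance HasName Company where
  name = companyName

-- The instance is chosen from the argument's type, so 'name' is unambiguous.
summary :: Person -> Company -> String
summary p c = name p ++ " (" ++ show (personAge p) ++ ") at " ++ name c

main :: IO ()
main = putStrLn (summary (Person "Ada" 36) (Company "Initech"))
```

This covers reading a field only, not construction or update, and it needs a class and an instance per shared name, so it illustrates the problem more than it solves it.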
...
...
...
All of that said, maybe having TDNR with bad syntax is preferable to
not having TDNR at all. Can't it be extended to the existing syntax
(of function application)? Or at least some better one, which is ideally
right-to-left? I don't really know the technical details...
Generalized data-namespaces: I think I'm also opposed. This would import
the problem from OO languages where functions written by the module
(class) author get to have a distinguished syntax (be inside the
namespace) over functions by anyone else (which don't).
Maybe you can show some example code? To me this is about controlling
exports of namespaces, which is already possible - I think this is
mostly a matter of convenience.
If I'm understanding correctly, you're suggesting we be able to write:
data Data = Data Int where
   twice (Data d) = 2 * d
   thrice (Data d) = 3 * d
   ...
and that if we write 'let x = Data 7 in x.thrice' it would evaluate to
21. I have two objections.
The first is the same as with the TDNR proposal: you would have both
code that looks like
'data.firstFunction.secondFunction.thirdFunction', as well as the
existing 'thirdFunction $ secondFunction $ firstFunction data' and
'thirdFunction . secondFunction . firstFunction $ data', and if you
have both of them in the same expression (which you will) it becomes
unpleasant to read because you have to read them in opposite
directions.
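The reading-direction concern can be tried out in current Haskell. The sketch below makes two assumptions: the functions are written at top level, because the 'data ... where' form above is hypothetical syntax, and '(&)' from Data.Function (reverse application, in base 4.8 and later) stands in for the proposed dot. The 'rightToLeft', 'leftToRight' and 'mixed' names are made up for illustration.

```haskell
import Data.Function ((&))

data Data = Data Int

twice :: Data -> Int
twice (Data d) = 2 * d

thrice :: Data -> Int
thrice (Data d) = 3 * d

-- Existing style: read from right to left.
rightToLeft :: Data -> String
rightToLeft x = show $ negate $ thrice x

-- Stand-in for x.thrice.negate.show: read from left to right.
leftToRight :: Data -> String
leftToRight x = x & thrice & negate & show

-- Both styles in one expression: start in the middle at x, read right
-- to reach thrice, then jump back to the left for negate and show.
mixed :: Data -> String
mixed x = show . negate $ x & thrice

main :: IO ()
main = do
  let x = Data 7
  print (thrice x)
  mapM_ (putStrLn . ($ x)) [rightToLeft, leftToRight, mixed]
```

All three functions compute the same string; only the order in which the pipeline has to be read differs.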
This would not be possible because the functions can only be accessed from
the namespace - you could only use the dot (or T.firstFunction). It is
possible as per your complaint below:
...
The second is that only the author of the datatype could put functions
into its namespace; the 'data.foo' notation would only be available
for functions written by the datatype's author, while for every other
function you would have to use 'foo data'. I dislike this special
treatment in OO languages and I dislike it here.

Greg Weber