Re: Records in Haskell

25 Feb 2012

On Fri, Feb 24, 2012 at 11:40 PM, Johan Tibell wrote:
Hi Barney,
On Fri, Feb 24, 2012 at 2:00 PM, Barney Hilken wrote:
Every one of your messages about records stresses your dislike for polymorphic projections, and your insistence that the Has class should be hidden from the user. I've read all of your explanations, but I'm still totally unconvinced. All your arguments about the semantics of labels are based on the way you want to use them, not on what they are. They are projection functions! Semantically, the only difference between them is the types. Polymorphism makes perfect sense and is completely natural. There is nothing "untyped" about it.
I share Greg's concerns about polymorphic projections. For example,
given a function
   sort :: Ord a => ...
we don't allow any 'a' that happens to export an operator that's
spelled <= to be passed to 'sort'. We have the user explicitly create
an instance, thereby declaring that their <= is e.g. a strict weak
ordering and thus makes sense when used with 'sort'. This explicitness
is useful: it communicates the contract of the function to the reader
and lets us catch mistakes in a way that automatically polymorphic
projections don't.
Automatically polymorphic projections feel like Go's structural
polymorphism, C++'s templates or C's automatic numeric coercions, and
I'm worried they'll lead to problems when used at scale. They're not
required to solve the problem we're trying to solve, so let's hurry
slowly and not bake them in together with the namespacing problem.
At the very least use two different LANGUAGE pragmas so users can have
one without the other.
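To make the 'sort' analogy concrete, here is a minimal sketch in
present-day Haskell. The Version type and its ordering are hypothetical,
invented only for illustration; the point is that the instance
declaration is where the author explicitly signs up to the Ord contract.

```haskell
import Data.List (sort)

-- Hypothetical type, used only for illustration.
data Version = Version Int Int
  deriving (Eq, Show)

-- The explicit instance is where the author asserts that this
-- ordering is lawful and therefore suitable for 'sort'.
instance Ord Version where
  compare (Version a b) (Version c d) = compare (a, b) (c, d)

-- 'sort' accepts [Version] only because the instance above exists.
sortedVersions :: [Version]
sortedVersions = sort [Version 1 2, Version 0 9, Version 1 0]
```

Without the instance, the call to 'sort' is rejected at compile time,
no matter what comparison functions Version's module happens to export.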
I agree completely. This is what I like about DORF: the D stands for
"Declared", which is referring to the fact that the contracts are
explicit. Record fields aren't automatically polymorphic based on
their name and type, as with SORF, rather they are scoped and
disambiguated in the same way as classes. My only complaints are
that it requires top-level declarations for each record field, which
feels excessive, and that the polymorphism is opt-out: in other words,
if you declare a record with a given field and a field with that
name/type is already in scope, it is automatically considered to be an
instance of the same field. (Note that this is not the same as SORF,
because if more than one field with the same name/type is in scope you
can (and have to) use the usual explicit module qualification to
disambiguate them, and you can write a new top-level field declaration
to make it explicit that you're not re-using the field which was in
scope.)

I think both of these problems could be solved at the same time if (a)
instead of requiring explicit top-level declarations for fields,
declaring a record would also automatically declare its fields, as
distinct from any other fields which may have been in scope, and (b)
there would be some lightweight syntax you could use within record
declarations to specify that you do want to re-use the record field in
scope instead of declaring a new one. This has the added benefit that
record declarations as currently written would continue to have the
same meaning as they currently have. (For the record, I don't see any
harm in also allowing explicit top-level field declarations outside
of records; it's the requirement for them which seems onerous.)

So in DORF, if you want to declare a Contact record with a name, a
phone number, and an address, you would write:

fieldLabel name :: Text
fieldLabel phoneNumber :: PhoneNumber
fieldLabel address :: Address

data Contact = Contact { name :: Text, phoneNumber :: PhoneNumber,
                         address :: Address }
-- it's unclear whether the type annotations would belong in the
-- field declarations, in the record, or in both

Then if you also want to keep track of people as employees, you write:

fieldLabel position :: Position
fieldLabel salary :: Word

data Employee = Employee { name :: Text, position :: Position, salary :: Word }

And the name field would automatically be shared between them, and
could be used polymorphically with either record.
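
As a rough approximation in Haskell as it exists today (not DORF itself,
whose fieldLabel syntax is only proposed), a declared, shared field
behaves much like a class with one method. In the sketch below, HasName,
the prefixed selector names, the greeting function and the String-based
stand-in types are all assumptions made for illustration. One
difference: here each record opts in with an explicit instance, whereas
under DORF re-using a label that is in scope is automatic.

```haskell
-- Stand-ins for the types mentioned above (assumptions, not real libraries).
type Text = String
newtype PhoneNumber = PhoneNumber String
newtype Address = Address String
newtype Position = Position String

-- Plays the role of "fieldLabel name :: Text": one explicit, named
-- declaration that several records can share.
class HasName r where
  name :: r -> Text

-- Selectors are prefixed because plain Haskell 2010 does not allow two
-- records in one module to use the same field name.
data Contact = Contact
  { contactName        :: Text
  , contactPhoneNumber :: PhoneNumber
  , contactAddress     :: Address
  }

data Employee = Employee
  { employeeName     :: Text
  , employeePosition :: Position
  , employeeSalary   :: Word
  }

instance HasName Contact  where name = contactName
instance HasName Employee where name = employeeName

-- Works for any record that shares the declared field.
greeting :: HasName r => r -> String
greeting r = "Hello, " ++ name r
```

Here greeting accepts both a Contact and an Employee, which is the
polymorphic use of the shared name field described above.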

But then if you later write:

data City = City { name :: Text }

that would also automatically re-use the name field, but that would
clearly be wrong. It could be avoided by explicitly declaring a new
name field beforehand. (I suppose this aspect of the complaint might
be overblown, because as you can see, when you want a new field you always
write a fieldLabel declaration, and if you don't, you're implying that
you intend to use the existing one. But it's still very
verbose.)

In my variant of the proposal, declaring the Contact record would look
like this:

data Contact = Contact { name :: Text, phoneNumber :: PhoneNumber,
                         address :: Address }

This would automatically declare the name, phoneNumber, and address fields
with their associated types (equivalently to the fieldLabel
declarations from the previous example).

Then for Employee you would write:

data Employee = Employee { %name, position :: Position, salary :: Word }

where the just-invented-on-the-spot-don't-attach-any-importance-to-it
%name syntax would indicate that you want to re-use the name field in
scope, while position and salary would be newly declared.

And when you write

data City = City { name :: Text }

you would be declaring a new field.

Just like DORF, if you wanted to declare the Employee record in a
situation where you already had the name fields from both City and
Contact in scope, you would write:

data Employee = Employee { %Data.Contact.name, ... }

using normal module qualification to disambiguate it.
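
The distinction between re-using a field and declaring a new one can be
approximated in current GHC with a Has-style class indexed by a label
type. This is only a sketch under assumptions: the Has class, the get
method, the label types ContactName and CityName (standing in for
Data.Contact.name and a City module's name), and the simplified
positional records are all hypothetical, not the encoding DORF
specifies.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}
{-# LANGUAGE FlexibleInstances, FlexibleContexts #-}

-- Hypothetical Has class: record r has a field labelled l of type t.
class Has r l t | r l -> t where
  get :: l -> r -> t

-- Two distinct labels that would both be spelled "name" in source,
-- told apart by module qualification in the proposal.
data ContactName = ContactName
data CityName    = CityName

-- Simplified records: only the fields needed for the illustration.
data Contact  = Contact String
data Employee = Employee String Word
data City     = City String

-- Contact declares the field; Employee re-uses it (the "%name" case).
instance Has Contact  ContactName String where get _ (Contact n)    = n
instance Has Employee ContactName String where get _ (Employee n _) = n

-- City declares a new, unrelated field that is merely spelled the same.
instance Has City CityName String where get _ (City n) = n

-- Polymorphic over records sharing the Contact label; City is excluded.
personName :: Has r ContactName String => r -> String
personName = get ContactName
```

Applying personName to a City is a type error in this sketch, which is
the separation the City example is meant to preserve.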

One thing you couldn't do (with either proposal, I think) is declare
multiple fields in the same module and with the same name, but which
*aren't* meant to be shared. This is just like how you can't declare
two classes with the same name in the same module, either. (Some kind
of independently introduced submodules feature feels like it might be
the appropriate remedy here.)

Please correct me if I've misunderstood or mischaracterized any aspect of DORF.

Gábor Lehel