
On 13/01/2012, Simon Peyton-Jones
Thanks to Greg for leading the records debate. I apologise that I don't have enough bandwidth to make more than an occasional contribution. Greg's new wiki page, and the discussion so far has clarified my thinking, and this message tries to express that new clarity. I put a conclusion at the end.
Simon
Overview ~~~~~~~~ It has become clear that there are two elements to pretty much all the proposals we have on the table. Suppose we have two types, 'S' and 'T', both with a field 'f', and you want to select field 'f' from a record 'r'. Somehow you have to disambiguate which 'f' you mean.
(Plan A) Disambiguate using qualified names. To select field f, say (S.f r) or (T.f r) respectively.
(Plan B) Disambiguate using types. This approach usually implies dot-notation. If (r::S), then (r.f) uses the 'f' from 'S', and similarly if (r::T).
Note that
* The Frege-derived records proposal (FDR), uses both (A) and (B) http://hackage.haskell.org/trac/ghc/wiki/Records/NameSpacing
* The Simple Overloaded Record Fields (SORF) proposal uses only (B) http://hackage.haskell.org/trac/ghc/wiki/Records/OverloadedRecordFields
* The Type Directed Name Resolution proposal (TDNR) uses only (B)
http://hackage.haskell.org/trac/haskell-prime/wiki/TypeDirectedNameResolutio...
I know of no proposal that advocates only (A). It seems that we are agreed that we must make use of types to disambigute common cases.
Complexities of (Plan B) ~~~~~~~~~~~~~~~~~~~~~~~~ Proposal (Plan B) sounds innocent enough. But I promise you, it isn't. There has ben some mention of the "left-to-right" bias of Frege type inference engine; indeed the wohle explanation of which programs are accepted and which are rejected, inherently involves an understanding of the type inference algorithm. This is a Very Bad Thing when the type inference algorithm gets complicated, and GHC's is certainly complicated.
Here's an example:
type family F a b data instance F Int [a] = Mk { f :: Int }
g :: F Int b -> () h :: F a [Bool] -> ()
k x = (g x, x.f, h x)
Consider type inference on k. Initially we know nothing about the type of x. * From the application (g x) we learn that x's type has shape (F Int <something>). * From the application (h x) we learn that x's type has shape (F <something else> [Bool]) * Hence x's type must be (F Int [Bool]) * And hence, using the data family we can see which field f is intended.
Notice that a) Neither left to right nor right to left would suffice b) There is significant interaction with type/data families (and I can give you more examples with classes and GADTs) c) In passing we note that it is totally unclear how (Plan A) would deal with data families
This looks like a swamp. In a simple Hindley-Milner typed language you might get away with some informal heuristics, but Haskell is far too complicated.
Fortunately we know exactly what to do; it is described in some detail in our paper "Modular type inference with local assumptions" http://www.haskell.org/haskellwiki/Simonpj/Talk:OutsideIn
The trick is to *defer* all these decisions by generating *type constraints* and solving them later. We express it like this:
G, r:t1 |- r.f : t2, (Has t1 "f" t2)
This says that if r is in scope with type t1, then (r.f) has type t2, plus the constraint (Has t1 "f" t2), which we read as saying
Type t1 must have a field "f" of type t2
We gather up all the constraints and solve them. In solving them we may figure out t1 from some *other* constraint (to the left or right, it's immaterial. That allow us to solve *this* constraint.
So it's all quite simple, uniform, and beautiful. It'll fit right into GHC's type-constraint solver.
But note what has happened: we have simply re-invented SORF. So the conclusion is this:
the only sensible way to implement FDR is using SORF.
What about overloading? ~~~~~~~~~~~~~~~~~~~~~~~ A feature of SORF is that you can write functions like this
k :: Has r "f" Int => r -> Int k r = r.f + 1
Function 'k' works on any record that has a field 'f'. This may be cool, but it wasn't part of our original goal. And indeed neither FDR nor TDNR offer it.
But, the Has constraints MUST exist, in full glory, in the constraint solver. The only question is whether you can *abstract* over them. Imagine having a Num class that you could not abstract over. So you could write
k1 x = x + x :: Float k2 x = x + x :: Integer k3 x = x + x :: Int
using the same '+' every time, which generates a Num constraint. The type signature fixes the type to Float, Integer, Int respectively, and tells you which '+' to use. And that is exactly what ML does!
But Haskell doesn't. The Coolest Thing about Haskell is that you get to *abstract* over those Num constraints, so you can write
k :: Num a => a -> a k x = x + x
and now it works over *any* Num type.
On reflection, it would be absurd not to do ths same thing for Has constraints. If we are forced to have Has constraints internally, it woudl be criminal not to abstract over them. And that is precisely what SORF is.
Is (Plan A) worth it? ~~~~~~~~~~~~~~~~
Once you have (Plan B), and SORF in full glory, plus of course the existing ability to name fields T_f, S_f, if you want, I think it is highly questionable whether we need the additional complexities of (Plan A)?
And I do think (Plan A) has lots of additional complexities that we have not begun to explore yet. The data-family thing above is an example, and I can think of some others.
But even if it was simple, we still have to ask: does *any* additional complexity give enough payoff, if you already have SORF? I suspect not.
Extensions to SORF ~~~~~~~~~~~~~~~~~~ Frege lets you add "virtual fields" to a record type, using an extra RExtension mechanism that I do not yet understand. But SORF lets you do so with no additional mechanism. See "Virtual record selectors" on http://hackage.haskell.org/trac/ghc/wiki/Records/OverloadedRecordFields The point is that the existing type-class instance mechanisms do just what we want.
Syntax ~~~~~~ The debate on the mailing list has moved sharply towards discussing lexical syntax. I'm not going to engage in that discussion because while it is useful to air opinions, it's very hard to get agreement. But for the record:
* I don't mind having Unicode alternatives, but there must be ASCII syntax too
* I think we must use ordinary dot for field selection.
* I think it's fine to insist on no spaces; we are already doing this for qualified names, as someone pointed out
* I think we should not introduce new syntax unless we are absolutely forced into it. Haskell's record update syntax isn't great, but we have it.
Honestly, I would be glad to find that syntax deprecated/removed. It is awkward and in my opinion not very haskellish. The let-syntax is clean, but quite verbose, and likely suboptimal for other reasons, such as Greg Weber posted to the wiki. Nevertheless, the dot-let-syntax seems (to me at least) the rational convergence of plain let-syntax and the new dot-selection syntax. Perhaps we might: * Define old record update syntax as syntactic sugar for let-syntax. * Allow language extensions, such as Frege-syntax, as sugar for the same.
Conclusion ~~~~~~~~~~ I am driven to the conclusion that SORF is the way to go. - Every other proposal on the table requires SORF (either visibly or invisibly) - When you have SORF, you don't really need anything else
I fully agree. Mind, I'm not a GHC God, but I fully agree nonetheless. A powerful record system (not solely label namespaces) would be a great win for Haskell. This is that record system.
The blocking issues are described on http://hackage.haskell.org/trac/ghc/wiki/Records/OverloadedRecordFields
a) "Representation hiding" (look for that heading)
b) "Record update" (ditto), most especially for records whose fields have polymorphic types
If we fix these we can move forward.
_______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users