
2012/1/8 Gábor Lehel
2012/1/8 Greg Weber
: 2012/1/8 Gábor Lehel
Thank you. I have a few questions/comments.
"The module/record ambiguity is dealt with in Frege by preferring modules and requiring a module prefix for the record if there is ambiguity."
I think I see why they do it this way (otherwise you can't refer to a module if a record by the same name is in scope), but on the other hand it would seem intuitive to me to choose the more specific thing, and a record feels more specific than a module. Maybe you could go that way and just not give your qualified imports the same name as a record? (Unqualified imports are in practice going to be hierarchical, and no one's in the habit of typing those out to disambiguate things, so I don't think it really matters if qualified records shadow them.)
In the case where a Record has the same name as its containing module it would be more specific than a module, and preferring it makes sense. I think doing this inside the module makes sense, as one shouldn't need to refer to the containing module's name. We should think more about the case where module & records are imported.
"Expressions of the form x.n: first infer the type of x. If this is just an unbound type variable (i.e. the type is unknown yet), then check if n is an overloaded name (i.e. a class operation). [...] Under no circumstances, however, will the notation x.n contribute in any way in inferring the type of x, except for the case when n is a class operation, where an appropriate class constraint is generated."
Is this just a simple translation from x.n to n x? What's the rationale for allowing the x.n syntax for, in addition to record fields, class methods specifically, but no other functions?
It is a simple translation from x.n to T.n x. The key point is that the function is only accessible through the record's namespace. The dot is only being used to tap into a namespace, and is not available for general function application.
I think my question and your answer are walking past each other here. Let me rephrase. The wiki page implies that in addition to using the dot to tap into a namespace, you can also use it for general function application in the specific case where the function is a class method ("appropriate class constraint is generated" etc etc). I don't understand why. Or am I misunderstanding?
Later on you write that the names of record fields are only accessible from the record's namespace and via record syntax, but not from the global scope. For Haskell I think it would make sense to reverse this decision. On the one hand, it would keep backwards compatibility; on the other hand, Haskell code is already written to avoid name clashes between record fields, so it wouldn't introduce new problems. Large gain, little pain. You could use the global-namespace function as you do now, at the risk of ambiguity, or you could use the new record syntax and avoid it. (If you were to also allow x.n syntax for arbitrary functions, this could lead to ambiguity again... you could solve it by preferring a record field belonging to the inferred type over a function if both are available, but (at least in my current state of ignorance) I would prefer to just not allow x.n for anything other than record fields.)
Perhaps you can give some example code for what you have in mind - we do need to figure out the preferred technique for interacting with old-style records. Keep in mind that for new records the entire point is that they must be name-spaced. A module could certainly export top-level functions equivalent to how records work now (we could have a helper that generates those functions).
Let's say you have a record.
data Record = Record { field :: String }
In existing Haskell, you refer to the accessor function as 'field' and to the contents of the field as 'field r', where 'r' is a value of type Record. With your proposal, you refer to the accessor function as 'Record.field' and to the contents of the field as either 'Record.field r' or 'r.field'. The point is that I see no conflict or drawback in allowing all of these at the same time. Writing 'field' or 'field r' would work exactly as it already does, and be ambiguous if there is more than one record field with the same name in scope. In practice, existing code is already written to avoid this ambiguity so it would continue to work. Or you could write 'Record.field r' or 'r.field', which would work as the proposal describes and remove the ambiguity, and work even in the presence of multiple record fields with the same name in scope.
The point is that I see what you gain by allowing record fields to be referred to in a namespaced way, but I don't see what you gain by not allowing them to be referred to in a non-namespaced way. In theory you wouldn't care because the non-namespaced way is inferior anyways, but in practice because all existing Haskell code does it that way, it's significant.
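A minimal sketch of the status quo described above, in plain Haskell 2010. The value `r` and the names `contents` and `allContents` are made up for illustration; only `Record` and `field` come from the example in the message.

```haskell
-- Plain Haskell 2010 record: the field name doubles as a top-level accessor.
data Record = Record { field :: String }

-- Hypothetical sample value, for illustration only.
r :: Record
r = Record { field = "hello" }

-- 'field' is an ordinary function of type Record -> String,
-- so it can be applied directly ...
contents :: String
contents = field r

-- ... or passed to higher-order functions like any other function.
allContents :: [String]
allContents = map field [r, Record "world"]
```

Declaring a second record type with a field also called `field` in the same module is rejected in Haskell 2010, which is the clash the namespaced forms 'Record.field r' and 'r.field' are meant to avoid.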
Later on:
"- the function that updates field x of data type T is T.{x=}
- the function that sets field x in a T to 42 is T.{x=42}
- If a::T then a.{x=} and a.{x=42} are valid"
I think this looks considerably ugly. Aren't there better alternatives? { T.x = }, { T.x = 42 }, { a.x = }, { a.x = 42 } maybe? (Does this conflict in some unfinesseable way with explicit layout contexts?)
I think this is one of those slightly different syntaxes that many people will have an initial bad reaction to, however once they use it they will like it just fine. The problem with what you are suggesting is that it would be verbose when updating multiple fields at once. But we should investigate if it is possible to have a syntax closer to the existing update syntax.
Good point.
"the function that changes field x of a T by applying some function to it is T.{x <-}"
Same comment on syntax applies. I believe this is a new feature? It would be welcome, albeit the overloading of <- is a bit worrisome (don't have better ideas at the moment, but I think there was a thread). I assume T.{x <- f}, a.{x <-}, and a.{x <- f} (whatever the syntax is) would also be valid, by analogy to the above?
Yes, new feature, so not necessary in the initial implementation. I personally think Haskell should drop the monadic curly brackets which nobody uses, but whatever syntax works is fine with me.
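For reference, here is roughly what the three proposed forms correspond to when written by hand in today's Haskell. This is only a sketch: the record `T` with fields `x` and `y`, the helper names, and the argument order are assumptions chosen for illustration, not something the wiki page specifies.

```haskell
data T = T { x :: Int, y :: Bool } deriving (Eq, Show)

-- Roughly what T.{x=} would denote: set field x to a given value.
-- (Hypothetical name; argument order is an assumption.)
setX :: Int -> T -> T
setX v t = t { x = v }

-- Roughly what T.{x=42} would denote: set field x to 42.
setXTo42 :: T -> T
setXTo42 = setX 42

-- Roughly what T.{x <-} would denote: apply a function to field x.
modifyX :: (Int -> Int) -> T -> T
modifyX f t = t { x = f (x t) }
```

With existing syntax each of these has to be written out per field, which is the boilerplate the proposed notation would remove.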
Re: Compatibility with existing records: based on (very) cursory inspection I don't see an obstacle to making it (near-)fully compatible - you would just be adding some new syntax, most significantly x.n. Backwards compatibility is a great advantage, so why not?
Generalizing the syntax to arbitrary TDNR: I think I'm opposed to this. The problem is that in existing Haskell the vast majority of expressions (with the notable (and imho unfortunate) exception of (>>=)) flow from right to left. Going the other way with record fields isn't a big problem because it's simple and doesn't even feel like function application so much as member-selection (like modules), but if you were to allow any function you would soon end up with lengthy chains of them which would clash nastily with the surrounding code. Having to jump back and forth and switch directions while reading is unpleasant. OO languages have this problem and I don't envy them for it. And in particular having "a . b" mean "first do b, then do a", but "a.b" mean "do b to a" would be confusing. (You'd already have this problem with global namespace record field selectors, but at least it's localized.)
I agree - I think a.b or A.b should always mean tapping into a namespace and not be generalized outside of that.
All of that said, maybe having TDNR with bad syntax is preferable to not having TDNR at all. Can't it be extended to the existing syntax (of function application)? Or at least to some better one, which is ideally right-to-left? I don't really know the technical details...
Generalized data-namespaces: Also think I'm opposed. This would import the problem from OO languages where functions written by the module (class) author get to have a distinguished syntax (be inside the namespace) over functions by anyone else (which don't).
Maybe you can show some example code? To me this is about controlling exports of namespaces, which is already possible - I think this is mostly a matter of convenience.
If I'm understanding correctly, you're suggesting we be able to write:
data Data = Data Int where
    twice (Data d) = 2 * d
    thrice (Data d) = 3 * d
    ...
and that if we write 'let x = Data 7 in x.thrice' it would evaluate to 21. I have two objections.
The first is the same as with the TDNR proposal: you would have both code that looks like 'data.firstFunction.secondFunction.thirdFunction', as well as the existing 'thirdFunction $ secondFunction $ firstFunction data' and 'thirdFunction . secondFunction . firstFunction $ data', and if you have both of them in the same expression (which you will) it becomes unpleasant to read because you have to read them in opposite directions.
The second is that only the author of the datatype could put functions into its namespace; the 'data.foo' notation would only be available for functions written by the datatype's author, while for every other function you would have to use 'foo data'. I dislike this special treatment in OO languages and I dislike it here.
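The reading-direction concern can be seen with existing tools. The sketch below uses `twice` and `thrice` as ordinary top-level functions and `(&)` from `Data.Function` as the closest existing analogue of a left-to-right `x.thrice` chain; the names `rightToLeft` and `leftToRight` are made up for illustration.

```haskell
import Data.Function ((&))

newtype Data = Data Int

-- Ordinary top-level functions; anyone can add more of these.
twice :: Data -> Int
twice (Data d) = 2 * d

thrice :: Data -> Int
thrice (Data d) = 3 * d

-- Right-to-left, as in most existing Haskell code.
rightToLeft :: String
rightToLeft = show . (+ 1) . thrice $ Data 7

-- Left-to-right with (&), standing in for a dotted method-style chain.
leftToRight :: String
leftToRight = Data 7 & thrice & (+ 1) & show
```

Both definitions compute the same value; the point is only that mixing the two styles in one expression forces the reader to switch direction.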
Another thing that would be nice is lenses to solve the nested-record-update problem - at least the room to add them later. Most of the proposed syntax would be unaffected, but you'd need some syntax for the lens itself... I'm not sure what it might be. Would it be terrible to have T.x refer to a lens rather than a getter? (I don't know how you'd refer to the getter then, so probably yeah.) Or maybe { T.x }, building backwards from { T.x = }?
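To make the nested-record-update problem concrete, here is a minimal hand-rolled lens sketch. It is not the API of any lens library, and the types `Person` and `Address` and all function names are hypothetical; it only shows the getter-plus-setter idea that a T.x lens would package up.

```haskell
-- A hand-rolled lens: a getter paired with a setter.
data Lens s a = Lens
  { view :: s -> a
  , set  :: a -> s -> s
  }

-- Compose a lens into an outer structure with a lens into one of its parts.
compose :: Lens s a -> Lens a b -> Lens s b
compose outer inner = Lens
  { view = view inner . view outer
  , set  = \b s -> set outer (set inner b (view outer s)) s
  }

-- Apply a function to the focused part.
over :: Lens s a -> (a -> a) -> s -> s
over l f s = set l (f (view l s)) s

data Address = Address { city :: String } deriving (Eq, Show)
data Person  = Person  { name :: String, address :: Address } deriving (Eq, Show)

addressL :: Lens Person Address
addressL = Lens address (\a p -> p { address = a })

cityL :: Lens Address String
cityL = Lens city (\c a -> a { city = c })

-- Nested update without rebuilding the intermediate record by hand.
moveTo :: String -> Person -> Person
moveTo = set (addressL `compose` cityL)
```

Without the lens, `moveTo` would have to be written as `p { address = (address p) { city = c } }`, and the nesting grows with every extra level.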
Another existing language very similar to Haskell whose record system might be worth evaluating is Disciple: http://disciple.ouroborus.net/. Unfortunately I couldn't find any specific page it seemed best to link to.
The syntax of DDC seems the same as this proposal. However, I could not find any specific information either.
The main things I remember being interesting about it are that it's based on lenses, and uses some kind of extensible projectors system to allow something similar to what you achieve with datatype-namespaces, namely 'virtual' record fields. But I haven't studied it in detail.
Ah, I remember now where I saw a more thorough discussion: in his thesis[1]. Section 2.7 (page 115) and in particular 2.7.4 (119). It seems to be a very similar proposal to datatype-namespacing except it would address my second objection above and allow third-party code to add functions to the namespace as well. My first objection (the 'flow' of the code being in the opposite direction to all other code) still applies though. I couldn't find any discussion of lenses, except as pertaining to destructive update (which is another feature of Disciple). [1] http://www.cse.unsw.edu.au/~benl/papers/thesis/lippmeier-impure-world.pdf
On Sun, Jan 8, 2012 at 2:40 AM, Greg Weber
wrote: I have updated the wiki - the entry level page [1] compares the different proposals and points to a more fleshed out explanation of the Frege proposal [2].
I think I now understand the differences between the existing proposals and am able to provide leadership to move this forward. Let me summarize the state of things:

There is a debate over extensible records that we are putting off into the future. Instead we have 2 proposals to make things better right now:
* an overloaded record fields proposal that still has implementation concerns
* a name-spacing & simple type resolution proposal that is awaiting your critique
The Frege language originally had overloaded record fields but then moved to the latter system. The existing experience of the Frege language is very fortunate for us as we now have some experience to help inform our own decision.
Greg Weber
[1] http://hackage.haskell.org/trac/ghc/wiki/Records [2] http://hackage.haskell.org/trac/ghc/wiki/Records/NameSpacing
On Wed, Jan 4, 2012 at 7:54 AM, Greg Weber
wrote: The Frege author does not have a ghc mail list account but gave a more detailed explanation of how he goes about TDNR for records and how often it type checks without annotation in practice.
A more general explanation is here:
http://www.reddit.com/r/haskell/comments/nph9l/records_stalled_again_leaders...
He sent a specific response to Simon's mail list message, quoted below:
Simon Peyton-Jones is absolutely correct when he notes:
Well the most obvious issue is this. 3.2 says
e.m = (T.m e) if the expression e has type t and the type constructor of t is T and there exists a function T.m
But that innocent-looking statement begs the *entire* question! How do we know if "e has type t"?
The way it is done in Frege is such that, if you have a function that uses or updates (nondestructively, of course) a "record" then at least the type constructor of that record has to be known. This is no different than doing it explicitly with case constructs, etc., just here you learn the types from the constructors you write in the patterns.
Hence, it is not so that one can write a function that updates field f to 42 for any record that contains a field f:
foo x = x.{f=42} -- type annotation required for foo or x
In practice this means you'll have to write a type annotation here and there. Often, the field access is not the only one that happens to some variable of record type, or the record is the result of another function application. In such cases, the type is known. I estimate that in 2/3 of all cases one does not need to write (T.e x) in sparsely type annotated code, despite the fact that the Frege type checker has a left-to-right bias and does not yet attempt to find the type of x in the code that "follows" the x.e construct (after let unrolling etc.). I think one could do better and guarantee that, if the type of x is inferrable at all, then so will be x.e. (Still, it must be more than just a type variable.)
On Sun, Jan 1, 2012 at 2:39 PM, Greg Weber
wrote: On Sat, Dec 31, 2011 at 3:28 PM, Simon Peyton-Jones
wrote:
> Frege has a detailed explanation of the semantics of its record implementation, and the language is *very* similar to Haskell. Let's just start by using Frege's document as the proposal. We can start a new wiki page as discussions are needed.
>
> If it’s a serious proposal, it needs a page to specify the design. Currently all we have is a paragraph on http://hackage.haskell.org/trac/ghc/wiki/Records, under “Better name spacing”.
>
> As previously stated on this thread, the Frege user manual is available here:
>
> http://code.google.com/p/frege/downloads/detail?name=Language-202.pdf
>
> see Sections 3.2 (primary expressions) and 4.2.1 (Algebraic Data type Declaration - Constructors with labeled fields)
>
> To all those concerned about Records: look at the Frege implementation and poke holes in it.
>
> Well the most obvious issue is this. 3.2 says
>
> e.m = (T.m e) if the expression e has type t and the type constructor of t is T and there exists a function T.m
>
> But that innocent-looking statement begs the *entire* question! How do we know if “e has type t”? This is the route ML takes for arithmetic operators: + means integer plus if the argument is of type Int, float plus if the argument is of type Float, and so on.
>
> Haskell type classes were specifically designed to address this situation. And if you apply type classes to the record situation, I think you end up with
>
> http://hackage.haskell.org/trac/ghc/wiki/Records/OverloadedRecordFields

More specifically I think of this as TDNR, rather than the wiki page's focus on maintaining backwards compatibility and de-sugaring to polymorphic constraints. I had hoped that there were different ideas or at least more flexibility possible for the TDNR implementation.
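To illustrate what "applying type classes to the record situation" can look like, here is a hand-written sketch. It is not the design on the OverloadedRecordFields wiki page; the class `HasName`, the records `Person` and `Company`, and the function `greet` are all hypothetical.

```haskell
data Person  = Person  { personName  :: String }
data Company = Company { companyName :: String }

-- Hypothetical class: one class per logical field name,
-- one instance per record type that has such a field.
class HasName r where
  getName :: r -> String

instance HasName Person where
  getName = personName

instance HasName Company where
  getName = companyName

-- Works for any record with a 'name' field. The record type need not
-- be known at this point; a class constraint stands in for it.
greet :: HasName r => r -> String
greet r = "Hello, " ++ getName r
```

The contrast with the Frege approach is that here `greet` stays polymorphic, whereas type-directed name resolution needs the type constructor of the record to be known at the use site.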
> Well, so maybe we can give up on that. Imagine Frege without the above abbreviation. The basic idea is that field names are rendered unique by pre-pending the module name. As I understand it, for record selection one would then be forced to write (T.m e) to select the ‘m’ field. That is, the qualification with T is compulsory. The trouble with this is that it’s *already* possible; simply define suitably named fields
>
>   data T = MkE { t_m :: Int, t_n :: Bool }
>
> Here I have prefixed with a (lower case version of) the type name. So we don’t seem to be much further ahead.
>
> Maybe one could make it optional if there is no ambiguity, much like Haskell’s existing qualified names. But there is considerable ambiguity about whether T.m means
>
>   m imported from module T
>
> or
>
>   the m record selector of data type T
If there is ambiguity, we expect the T to be a module. So you would need to refer to Record T's module: OtherModule.T.n or T.T.n. Alternatively these conflicts could be compilation errors. Either way programmers are expected to structure their programs to avoid conflicting names, no different than they do now.
> Perhaps one could make it work out. But before we can talk about it we need to see a design. Which takes us back to the question of leadership.
I am trying to provide as much leadership on this issue as I am capable of. Your critique is very useful in that effort.
At this point the Frege proposal without TDNR seems to be a small step forward. We can now define records with clashing fields in the same module. However, without TDNR we don't have convenient access to those fields. I am contacting the Frege author to see if we can get any more insights on implementation details.
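For comparison, this is what the prefixing workaround Simon mentions looks like when two records in one module both want a field that is logically called 'm'. The second record `U`, its constructor, and the function `sumM` are made up for illustration.

```haskell
-- Two records in the same module that both want a field called 'm'.
-- In Haskell 2010 the names must differ, so each is prefixed by hand
-- with a lower-case version of its type name.
data T = MkT { t_m :: Int, t_n :: Bool }
data U = MkU { u_m :: String }

-- Every use site has to spell out the prefix, much as (T.m e) would.
sumM :: T -> U -> Int
sumM t u = t_m t + length (u_m u)
```

Namespacing replaces the hand-written prefix with a compiler-supported qualifier; the convenience of writing just `e.m` is what TDNR would add on top.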
> Simon
>
> We only want critiques about
>
> * achieving name-spacing right now
> * implementing it in such a way that extensible records could be implemented in its place in the future, although we will not allow that discussion to hold up a records implementation now, just possibly modify things slightly.
>
> Greg Weber
>
> On Thu, Dec 29, 2011 at 2:00 PM, Simon Peyton-Jones
wrote:
> | The lack of response, I believe, is just a lack of anyone who
> | can cut through all the noise and come up with some
> | practical way to move forward in one of the many possible
> | directions.
>
> You're right. But it is very telling that the vast majority of responses on
>
> http://www.reddit.com/r/haskell/comments/nph9l/records_stalled_again_leaders...
>
> were not about the subject (leadership) but rather on suggesting yet more, incompletely-specified solutions to the original problem. My modest attempt to build a consensus by articulating the simplest solution I could think of, manifestly failed.
>
> The trouble is that I just don't have the bandwidth (or, if I'm honest, the motivation) to drive this through to a conclusion. And if no one else does either, perhaps it isn't *that* important to anyone. That said, it clearly is *somewhat* important to a lot of people, so doing nothing isn't very satisfactory either.
>
> Usually I feel I know how to move forward, but here I don't.
>
> Simon
>
> _______________________________________________
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users@haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
-- Work is punishment for failing to procrastinate effectively.