Keep the present Haskell record system!

Dear all,

I'm increasingly convinced that the records should be left alone for Haskell', possibly modulo some minor tweaks to polish the system.

While there are a number of reasonable proposals out there for a replacement record system, and while I do agree that a more flexible system would be desirable, there just is no consensus whatsoever as to which one is the way forward.

Moreover, my impression is that maintaining backwards compatibility while truly addressing the shortcomings of the present system almost inevitably would lead to a language with TWO kinds of records. I must say I really don't like that prospect.

If the above is true, then that would imply that backwards compatibility would have to be sacrificed in a serious way if we were to move forward with a proposal for a new record system. I don't think there is a case for this in the context of Haskell'.

A while ago, I suggested some rules-of-thumb for when serious backwards-compatibility breaking changes might be considered for Haskell'. To recap:

If a proposed change breaks backwards compatibility, then it is acceptable only if either

  1. very little existing code is likely going to be broken in practice, or

  2.1. it is widely agreed that not addressing the issue really would harm the long-term relevance of Haskell', and

  2.2. it is widely agreed that attempting to maintain backwards compatibility would lead to an unwieldy language design, and

  2.3. the proposed design and its implications are well understood, i.e. it has been implemented in at least one system and it has been used extensively, or a strong argument can be made on the grounds of, say, an underlying well-understood theory.

From the people who commented on that, there seemed to be some support for this position, and no one, as far as I know, said they thought that the above was too restrictive.

My position is that two parallel record systems would be so unwieldy as to be unacceptable.
That would imply that fixing the records would break lots of code. That means rule 1 above does not apply.

One can probably find some proposals out there that meet criterion 2.3. However, I don't think 2.1 can be argued convincingly. Yes, some people claim that the lack of proper records is the key thing that gets in the way of serious software development in Haskell. I don't believe that, as I see lots of serious Haskell software out there that seems to have been developed just fine with the existing records (or not using records at all, or using some kind of "home brewed" records). (In contrast, it seems to me that lots of serious Haskell software out there does use MPTCs, FDs, and rank-2 or higher types.)

Moreover, I think the current record system is being unfairly discredited. Yes, it is certainly not perfect. But by adopting some simple conventions for naming fields to avoid name clashes, it is really not that hard to use them in the way records are used in a language like, say, C. (No, I'm not arguing that C's record system is great, just that that basic kind of records evidently is enough for supporting some really serious software development efforts.) Moreover, there is scope for improving the present system a little bit without any major changes.

Anyway, the present records certainly beat having no records at all hands down, and they are moreover very lightweight and fit very nicely into Haskell. For this, they deserve credit.

My proposal is thus that we move forward on records for Haskell' by deciding to essentially keep the current record system.

All the best,

/Henrik

--
Henrik Nilsson
School of Computer Science and Information Technology
The University of Nottingham
nhn@cs.nott.ac.uk
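
A minimal sketch of the field-naming convention mentioned in the message above, assuming a per-type prefix is what is meant. The types Employee and Department and the prefixes emp and dept are hypothetical, chosen only for illustration.

```haskell
-- Two record types that would both want a "name" field.
-- A per-type prefix keeps the selectors from clashing in one module.
data Employee = Employee
  { empName :: String
  , empAge  :: Int
  } deriving (Show, Eq)

data Department = Department
  { deptName :: String
  , deptSize :: Int
  } deriving (Show, Eq)

-- Record update syntax builds a modified copy of the value.
birthday :: Employee -> Employee
birthday e = e { empAge = empAge e + 1 }

main :: IO ()
main = do
  let alice = Employee { empName = "Alice", empAge = 41 }
      cs    = Department { deptName = "Computer Science", deptSize = 60 }
  print (birthday alice)
  putStrLn (empName alice ++ " works in " ++ deptName cs)
```

The prefix plays roughly the role that the struct name plays in C, where field names are local to each struct.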

On Wed, Mar 01, 2006 at 08:26:14AM +0000, Henrik Nilsson wrote:
I'm increasingly convinced that the records should be left alone for Haskell', possibly modulo some minor tweaks to polish the system.
Yes, no alternative candidate is available (specified, implemented, used). But I wonder whether there was some problem with the has-predicate approach described in http://research.microsoft.com/~simonpj/Haskell/records.html Or perhaps it never seemed a high enough priority.

Ross Paterson
On Wed, Mar 01, 2006 at 08:26:14AM +0000, Henrik Nilsson wrote:
I'm increasingly convinced that the records should be left alone for Haskell', possibly modulo some minor tweaks to polish the system.
Yes, no alternative candidate is available (specified, implemented, used).
Well, there _are_ some alternatives that have been specified and implemented e.g. TREX in Hugs, and experimental languages like Daan Leijen's Morrow. But the main reason I can see for there being little use of these candidates, is that they are not compatible with current Haskell. Thus, although I agree that none is ready for inclusion in Haskell-prime, I think we do need some mechanism for experimental records to be tried out in real Haskell implementations before the Haskell-double-prime committee starts its work. Perhaps, taking the extensions-layering idea, we could say that the current named-fields are encapsulated as an "extension that is part of the standard". Implementations could then introduce a flag to switch off this particular extension (current records) in conjunction with flags to switch on experimental replacements. This would give a certain flexibility for users to play with different systems, and the breaking of compatibility would be explicitly notated, either by the build options, or using a proposal like ticket #94. My suggestion is that we separate out everything from the Report to do with named-field records into something like a self-contained addendum. Whilst still an official part of the language standard, it might also be marked as a possibility for future removal. This would make it clear what parts of the language could be changed (or re-used without conflict) in an alternative records system. Regards, Malcolm

On Wed, Mar 01, 2006 at 11:00:41AM +0000, Malcolm Wallace wrote:
Thus, although I agree that none is ready for inclusion in Haskell-prime, I think we do need some mechanism for experimental records to be tried out in real Haskell implementations before the Haskell-double-prime committee starts its work.
Perhaps, taking the extensions-layering idea, we could say that the current named-fields are encapsulated as an "extension that is part of the standard". Implementations could then introduce a flag to switch off this particular extension (current records) in conjunction with flags to switch on experimental replacements. This would give a certain flexibility for users to play with different systems, and the breaking of compatibility would be explicitly notated, either by the build options, or using a proposal like ticket #94.
My suggestion is that we separate out everything from the Report to do with named-field records into something like a self-contained addendum. Whilst still an official part of the language standard, it might also be marked as a possibility for future removal. This would make it clear what parts of the language could be changed (or re-used without conflict) in an alternative records system.
Sounds like a good idea to me, if it can be done. We might want to do the same thing with FDs (assuming we come up with a form good enough for Haskell'). There will be a question of how contagious these extensions are, e.g. am I using extension X if I import a module that uses it?

On Wed, Mar 01, 2006 at 08:26:14AM +0000, Henrik Nilsson wrote:
I'm increasingly convinced that the records should be left alone for Haskell', possibly modulo some minor tweaks to polish the system.
for the record:-) I'm not in favour of this part.
But the main reason I can see for there being little use of these candidates, is that they are not compatible with current Haskell.
Thus, although I agree that none is ready for inclusion in Haskell-prime, I think we do need some mechanism for experimental records to be tried out in real Haskell implementations before the Haskell-double-prime committee starts its work.
however, if Malcolm's compromise could be made to work, that would at least be some step forward: instead of waiting for someone else to take the necessary steps, Haskell' would at least prepare for them, and address the road-block of backwards-compatibility that has held up better record systems for so long. but that would have to go beyond tweaking the existing system, and even separating out the definition into an extension isn't sufficient: we can't just have a flag to switch support for old-style records on or off. we need a transition path, preferably with an interface that may either be implemented by old-style labelled fields, or by some alternative record system. if that could be made to work, the switch would be between old-style labelled fields and record system X.
My suggestion is that we separate out everything from the Report to do with named-field records into something like a self-contained addendum. Whilst still an official part of the language standard, it might also be marked as a possibility for future removal. This would make it clear what parts of the language could be changed (or re-used without conflict) in an alternative records system.
that would be useful, especially since the interface involves syntax and types (so those Haskellers giving explicit declarations to their record-using functions might have problems if a future alternative required contexts, eg, for has and lacks predicates, etc..).

one might think that a simple approach would be to separate labelled fields into user-level features (record syntax, patterns, updates,..) and language-level desugaring (algebraic data types), and then to provide a pre-processor for that to make the currently internal desugared version explicit. the output of such a preprocessing could then be used even if Haskell'' removed labelled fields altogether. the problem with this is that it would introduce structure-dependent code for applications that wanted to avoid just that. so this is nothing but a worst-case fallback alternative.

an alternative approach would change the desugared version from data types to type classes, in the hope of preserving some advantages of using labelled fields even without the syntactic sugar. so, ideally, old-style programs could survive into Haskell'' in desugared, but still extensible form. unfortunately, it seems more and more likely that Haskell' type classes will not be expressive enough for such an approach, even though current Haskell type classes clearly are.

a more promising approach would be to specify the user-level features of the current system, then to show at least two translations: one for the current desugaring, and a second one to demonstrate at least one implementation of those features in an alternative record system. the point of that exercise would be to figure out which features of the current user-level view of labelled fields would make a later transition difficult, and to mark them as deprecated or to remove them now.
in other words, the "reference" translation should not be to the most powerful record system imagined so far, but to a fairly simple one, which all "better" record systems ought to be able to mimic.

my current favourite for such a simple alternative record system would be Daan Leijen's "Extensible records with scoped labels" (TFP2005): http://www.cs.uu.nl/~daan/pubs.html

in my view, his scoped labels are not so much a feature but a consequence of simplifying the type system: no need for lacks or has predicates, but still extensible records without need for declarations. if one accepts the potential for scoped labels, lacks can be dropped, and has can be encoded in the type structure.

so, as long as the translation doesn't make use of scoped labels, limiting the user-level features of labelled fields to what can be translated safely into this record system should ensure some level of future-proofing (using his system as a reference would not expose changes in type contexts, but I hope that some form of partial type declarations will make it into Haskell'; then switching from labelled fields to, say, TREX, would need to turn explicit signatures involving records into partial ones).

cheers,
claus

ps. the Curry folks are looking into adding labelled fields, and seem to have decided to go for a trial implementation of Daan's system before making any decisions: http://www.informatik.uni-kiel.de/~curry/listarchive/0406.html
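
The split between user-level features and the "current desugaring" into algebraic data types can be sketched as follows. This is only an illustration, not the Report's formal translation: the type Point and every primed name (Point', px', py', setPx') are hypothetical.

```haskell
-- User-level view: labelled fields on an ordinary data type.
data Point = Point { px :: Int, py :: Int } deriving (Show, Eq)

-- Hand-written desugared view: a positional constructor plus ordinary
-- selector and update functions that pattern match on the structure.
data Point' = Point' Int Int deriving (Show, Eq)

px' :: Point' -> Int
px' (Point' x _) = x

py' :: Point' -> Int
py' (Point' _ y) = y

-- Plays the role of the update expression  p { px = v }.
setPx' :: Int -> Point' -> Point'
setPx' v (Point' _ y) = Point' v y

main :: IO ()
main = do
  let p  = Point { px = 1, py = 2 }
      p' = Point' 1 2
  print (p { px = 10 })
  print (setPx' 10 p')
  print (px p == px' p' && py p == py' p')
```

Every function on the desugared side mentions the full constructor shape, which is the structure-dependence the message above describes as a worst-case fallback.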

On Saturday 04 March 2006 19:35, Claus Reinke wrote:
a more promising approach would be to specify the user-level features of the current system, then to show at least two translations: one for the current desugaring, and a second one to demonstrate at least one implementation of those features in an alternative record system.
the point of that exercise would be to figure out which features of the current user-level view of labelled fields would make a later transition difficult, and to mark them as deprecated or to remove them now. in other words, the "reference" translation should not be to the most powerful record system imagined so far, but to a fairly simple one, which all "better" record systems ought to be able to mimic.
my current favourite for such a simple alternative record system would be Daan Leijen's "Extensible records with scoped labels" (TFP2005): http://www.cs.uu.nl/~daan/pubs.html
Yes. Daan Leijen's record system is the best of all the ones I have read about, not least because of its simplicity.
ps. the Curry folks are looking into adding labelled fields, and seem to have decided to go for a trial implementation of Daan's system before making any decisions: http://www.informatik.uni-kiel.de/~curry/listarchive/0406.html
I would very much like to have this. I wouldn't mind if it were qualified as an experimental extension, etc.. If not in Haskell' then maybe at least in some future ghc version? Ben

On 04/03/06, Benjamin Franksen
On Saturday 04 March 2006 19:35, Claus Reinke wrote:
my current favourite for such a simple alternative record system would be Daan Leijen's "Extensible records with scoped labels" (TFP2005): http://www.cs.uu.nl/~daan/pubs.html
Yes. Daan Leijen's record system is the best of all the ones I have read about, not least because of its simplicity.
I just looked at this, and it made me wonder why we all haven't jumped on it. It looks pretty much ideal -- are there any known problems with it? It has pretty much all the operations that you'd want in an extensible record system, there are no lacks predicates to worry about, type inference is complete and sound (and there are constructive proofs available of this), and it doesn't look like it should be all that difficult to implement (the paper even gives suggestions as to possible efficient implementations). What are we searching for? - Cale

Can it be implemented efficiently?

Cale Gibbard wrote:
On 04/03/06, Benjamin Franksen
wrote: On Saturday 04 March 2006 19:35, Claus Reinke wrote:
my current favourite for such a simple alternative record system would be Daan Leijen's "Extensible records with scoped labels" (TFP2005): http://www.cs.uu.nl/~daan/pubs.html Yes. Daan Leijen's record system is the best of all the ones I have read about, not least because of its simplicity.
I just looked at this, and it made me wonder why we all haven't jumped on it. It looks pretty much ideal -- are there any known problems with it? It has pretty much all the operations that you'd want in an extensible record system, there are no lacks predicates to worry about, type inference is complete and sound (and there are constructive proofs available of this), and it doesn't look like it should be all that difficult to implement (the paper even gives suggestions as to possible efficient implementations). What are we searching for?
- Cale _______________________________________________ Haskell-prime mailing list Haskell-prime@haskell.org http://haskell.org/mailman/listinfo/haskell-prime

On 3/6/06, Lennart Augustsson
Can it be implemented efficiently?
Section 8, p. 8: "This leads to a simple compilation scheme that gives constant access to labels, but avoids the many runtime parameters for extension. The extension operation is done dynamically but since it is O(n) anyway, we expect that the runtime penalty is negligible." Jim

Yes, I've read the article too. And I really like the record system. But an off-hand remark like that doesn't convince me. Daan, what's your opinion?

-- Lennart

Jim Apple wrote:
On 3/6/06, Lennart Augustsson
wrote: Can it be implemented efficiently?
Section 8, p. 8: "This leads to a simple compilation scheme that gives constant access to labels, but avoids the many runtime parameters for extension. The extension operation is done dynamically but since it is O(n) anyway, we expect that the runtime penalty is negligible."
Jim

Hello Lennart,

Monday, March 6, 2006, 9:50:24 AM, you wrote:

LA> Yes, I've read the article too. And I really like the record system.
LA> But an off-hand remark like that doesn't convince me.

my own opinion is that this scheme is like classes - they can be resolved at compile time in most real cases but no one does it because the code would be too large. if some function can accept any record which has a field 'a', then to use this function on records of different types we need either to do specialization or to use a scheme with non-constant access time

also, while i like dynamic records for some types of tasks, i think that the "spirit" of Haskell as a whole is to give explicit definitions of all types used, and in this respect this type extension is not on the "main way". i will be glad to write something like this:

    data A = A { f1 :: Integer  -- filesize
               , f2 :: String   -- filename
               }
    data B : A = B { f3 :: Int  -- filedate
                   , f4 :: Int  -- filetime
                   }

i.e. explicitly define concrete types as a set of fields, explicitly define the types of the fields, and put the comments right there. in this respect, O'Haskell is what i really like

--
Best regards,
Bulat mailto:Bulat.Ziganshin@gmail.com
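
The `data B : A` declaration above is proposed syntax, not current Haskell. A rough approximation with today's labelled fields, assuming that embedding the base record is an acceptable stand-in for inheritance, might look like this. The field name bBase and the helper fileSize are hypothetical.

```haskell
data A = A
  { f1 :: Integer  -- filesize
  , f2 :: String   -- filename
  } deriving (Show, Eq)

-- B "extends" A by containing a value of type A.
data B = B
  { bBase :: A     -- the embedded base record
  , f3    :: Int   -- filedate
  , f4    :: Int   -- filetime
  } deriving (Show, Eq)

-- Inherited fields are reached by composing selectors.
fileSize :: B -> Integer
fileSize = f1 . bBase

main :: IO ()
main = do
  let b = B { bBase = A { f1 = 1024, f2 = "notes.txt" }, f3 = 20060306, f4 = 950 }
  print (fileSize b)
  putStrLn (f2 (bBase b))
```

Unlike the proposed syntax, a function written for A does not accept a B directly here; the caller has to project with bBase first.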

interesting. it always seems that I get more responses if I argue against my own interests?-) anyway:

- if you want explicitly declared, non-extensible records, you can use data type variants to tag anonymous records with fixed sets of labels (so anonymous records accommodate both user groups)

- if you want type class interfaces that hide representation, will they be any easier to implement on top of labelled fields? it was investigation of this issue that led me away from TREX (the type of extension there forces you to name labels in the type, and type classes force you to make types more explicit than needed), and to approaches based on first-class labels

- these questions, and more like them (eg., how to tag polymorphic components?), should be addressed when preparing labelled fields for replacement, as per Malcolm's suggestion; if you define a user-level view of records that can be implemented by either labelled fields or extensible records, then you have to account for typical usage patterns as well

- if the committee follows Malcolm's suggestion, with my comments, implementations could, at some point, provide an implementation of records using Daan's system as an alternative to the current implementation; it is possible that, in the process of defining the alternative translation into Daan's system, the committee finds that there is no point in keeping the old system around

- in case you hadn't noticed: the Data.Record.hs attached to the first-class labels ticket does implement scoped labels in current Haskell+ (only ghc at the moment, I'm afraid, which is one reason why I'm after more coherence between implementations). it does not have Daan's simple types, because Haskell doesn't give me a commutative-associative type constructor, and I have to implement what would otherwise be a type-system extension via type classes (hence predicates in types). but for some of you, that's what you ask for anyway, and you do get record concatenation as a bonus;-)

- once upon a time, there was paper after paper on optimizing pattern matching. but, in these days of heavy type-class programming, I could only name a single paper on optimizing type-classes. I remember one of the first talks on HLists, where the speaker said something like "advanced Haskell compilers will eliminate this type-class overhead statically" and the Haskell compiler implementers in the audience replied "what Haskell compilers? ours won't!". what about using the static properties of HLists, Data.Record, nested tuples, and so on, for generating constant time access, etc? any takers?-)

cheers,
claus

also, while i like dynamic records for some types of tasks, i think that the "spirit" of Haskell as a whole is to give explicit definitions of all types used, and in this respect this type extension is not on the "main way".
record extension is the basis for record concatenation, which is the basis for composing programs that use records. for instance, if you have two attribute grammars that compute two sets of attributes and you want to compose them into a single grammar, you run into troubles. (dual arguments for extensible variants, be it for exception types, or for extensible grammars that cover haskell+extensions without having to specify and maintain two separate grammars). and the concept of partial type specifications is not uncommon in Haskell (polymorphism, type classes). claus

Hello Claus, Monday, March 6, 2006, 2:35:04 PM, you wrote:
also, while i like dynamic records for some types of tasks, i think that the "spirit" of Haskell as a whole is to give explicit definitions of all types used, and in this respect this type extension is not on the "main way".
CR> record extension is the basis for record concatenation, which is
CR> the basis for composing programs that use records. for instance,
CR> if you have two attribute grammars that compute two sets of
CR> attributes and you want to compose them into a single grammar,
CR> you run into troubles. (dual arguments for extensible variants,
CR> be it for exception types, or for extensible grammars that cover
CR> haskell+extensions without having to specify and maintain two
CR> separate grammars). and the concept of partial type specifications
CR> is not uncommon in Haskell (polymorphism, type classes).

(sorry for the late answer)

this again should be maintained in the "Haskell way", i.e. with static type declarations:

    data Pizza = ...
    data Cola = ...
    type PizzaWithCola = Pizza+Cola

    weight :: PizzaWithCola -> Double
    weight pc = pizzaWeight pc + colaWeight pc

    pizzaWeight :: Pizza -> Double
    colaWeight :: Cola -> Double

it is one more reminder that we need OOP-like features such as inheritance of data fields. O'Haskell has something in this area, although as far as i remember it doesn't support multiple inheritance

--
Best regards,
Bulat mailto:Bulat.Ziganshin@gmail.com
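
`Pizza+Cola` above is hypothetical syntax for record concatenation. A minimal sketch in current Haskell, assuming a plain pair is used as a stand-in for the concatenated type; the single Double field in each record is made up for illustration.

```haskell
data Pizza = Pizza { pizzaWeight :: Double } deriving Show
data Cola  = Cola  { colaWeight  :: Double } deriving Show

-- Stand-in for the hypothetical  Pizza+Cola : a pair holding both records.
type PizzaWithCola = (Pizza, Cola)

-- With a pair, the components have to be projected out explicitly.
weight :: PizzaWithCola -> Double
weight (p, c) = pizzaWeight p + colaWeight c

main :: IO ()
main = print (weight (Pizza 0.5, Cola 0.25))
```

The difference from the message's version is visible in weight: there, pizzaWeight is applied to the combined value directly, which needs some form of subtyping or field inheritance that the pair encoding does not provide.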

Bulat Ziganshin wrote:
Hello Lennart,
Monday, March 6, 2006, 9:50:24 AM, you wrote:
LA> Yes, I've read the article too. And I really like the record system. LA> But an off-hand remark like that doesn't convince me.
my own opinion is that this scheme is like classes - they can be resolved at compile time in most real cases but no one does it because the code would be too large. if some function can accept any record which has a field 'a', then to use this function on records of different types we need either to do specialization or to use a scheme with non-constant access time
Yes, your opinion is similar to mine. :) -- Lennart

my own opinion is that this scheme is like classes - they can be resolved at compile time in most real cases but no one does it because the code would be too large. if some function can accept any record which has a field 'a', then to use this function on records of different types we need either to do specialization or to use a scheme with non-constant access time
for those who haven't seen it, the following paper explored the former possibility with good success (at a time when type classes were still somewhat simpler:):

    Dictionary-free Overloading by Partial Evaluation
    Mark P. Jones, ACM SIGPLAN Workshop on Partial Evaluation and Semantics-Based Program Manipulation, Orlando, Florida, June 1994.
    http://www.cse.ogi.edu/~mpj/pubs/pepm94.html

cheers,
claus
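
The choice between specialization and run-time dictionaries discussed in this exchange can be sketched with a one-method class. The class HasA, the record types R1 and R2, and the function doubleA are hypothetical. SPECIALIZE is an existing GHC pragma, and it is only a request to the optimiser, not a guarantee about the generated code.

```haskell
-- Hypothetical interface: "any record that has a field a".
class HasA r where
  getA :: r -> Int

data R1 = R1 { r1A :: Int, r1Name :: String }
data R2 = R2 { r2Flag :: Bool, r2A :: Int }

instance HasA R1 where getA = r1A
instance HasA R2 where getA = r2A

-- Overloaded: without specialisation, the caller passes a HasA
-- dictionary at run time and getA is looked up through it.
doubleA :: HasA r => r -> Int
doubleA r = 2 * getA r

-- Ask GHC to also compile copies at the two concrete record types,
-- trading code size for direct field access.
{-# SPECIALIZE doubleA :: R1 -> Int #-}
{-# SPECIALIZE doubleA :: R2 -> Int #-}

main :: IO ()
main = do
  print (doubleA (R1 { r1A = 3, r1Name = "x" }))
  print (doubleA (R2 { r2Flag = True, r2A = 5 }))
```

One specialised copy per record type is where the code-size concern raised above comes from.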

Hello Claus, Monday, March 6, 2006, 4:30:04 PM, you wrote:
my own opinion is that this scheme is like classes - they can be resolved at compile time in most real cases but no one does it because the code would be too large. if some function can accept any record which has a field 'a', then to use this function on records of different types we need either to do specialization or to use a scheme with non-constant access time
CR> for those who haven't seen it, the following paper explored the former
CR> possibility with good success (at a time when type classes were
CR> still somewhat simpler:):
CR> Dictionary-free Overloading by Partial Evaluation
CR> Mark P. Jones, ACM SIGPLAN Workshop on Partial
CR> Evaluation and Semantics-Based Program Manipulation,
CR> Orlando, Florida, June 1994.
CR> http://www.cse.ogi.edu/~mpj/pubs/pepm94.html

2-3 weeks ago i posted to the ghc-users list a list of suggestions to improve ghc efficiency and bring it close to C++. in particular, i proposed more aggressive compile-time specialization (at the cost of less aggressive inlining of non-polymorphic functions), like the common implementation of C++ templates. maybe i am missing something, but i think that in most cases we can end up with fully specialized code

--
Best regards,
Bulat mailto:Bulat.Ziganshin@gmail.com

Efficient implementations are discussed in the paper. It looks simple
enough complexity-wise at compile time. I'm not 100% sure, but at a
glance, I don't see anything here which looks like it should be very
complex.
As for the actual runtime semantics, apart from using association
lists (linear selection, constant extension) or labelled vectors (log
selection, linear extension), records may be implemented as plain
unlabelled vectors (constant selection, linear extension), provided
that an additional extension predicate is used in the type system,
which can always be inferred and so may be hidden from the user
completely. Given that one would expect record selection to be very
common, this last option looks quite attractive.
Another nice aspect is that we'd get automatic labelled variant types
almost for free from the same machinery.
- Cale
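
The association-list representation mentioned above (linear selection, constant extension) can be modelled in a few lines. This is an untyped toy model with values fixed to Int, not the typed system from the paper; the names emptyRec, extend, select and restrict are hypothetical. It also shows the scoped-label behaviour, where a duplicate label shadows an older field instead of being rejected.

```haskell
-- A record as an association list; values are fixed to Int to keep the sketch small.
type Label  = String
type Record = [(Label, Int)]

emptyRec :: Record
emptyRec = []

-- Extension: constant time; an existing label is shadowed, not rejected.
extend :: Label -> Int -> Record -> Record
extend l v r = (l, v) : r

-- Selection: linear time; finds the most recently added field with that label.
select :: Label -> Record -> Maybe Int
select = lookup

-- Restriction: removes only the first field with that label,
-- which uncovers an older field of the same name if there is one.
restrict :: Label -> Record -> Record
restrict _ [] = []
restrict l ((l', v) : r)
  | l == l'   = r
  | otherwise = (l', v) : restrict l r

main :: IO ()
main = do
  let r = extend "x" 2 (extend "y" 7 (extend "x" 1 emptyRec))
  print (select "x" r)
  print (select "x" (restrict "x" r))
  print (select "z" r)
```

The plain-vector representation favoured in the message swaps these costs: selection becomes an index, while extension has to copy the whole record.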
On 06/03/06, Lennart Augustsson
Can it be implemented efficiently?
Cale Gibbard wrote:
On 04/03/06, Benjamin Franksen
wrote: On Saturday 04 March 2006 19:35, Claus Reinke wrote:
my current favourite for such a simple alternative record system would be Daan Leijen's "Extensible records with scoped labels" (TFP2005): http://www.cs.uu.nl/~daan/pubs.html Yes. Daan Leijen's record system is the best of all the ones I have read about, not least because of its simplicity.
I just looked at this, and it made me wonder why we all haven't jumped on it. It looks pretty much ideal -- are there any known problems with it? It has pretty much all the operations that you'd want in an extensible record system, there are no lacks predicates to worry about, type inference is complete and sound (and there are constructive proofs available of this), and it doesn't look like it should be all that difficult to implement (the paper even gives suggestions as to possible efficient implementations). What are we searching for?
- Cale

With respect to the discussion on records, let me throw in my usual warning: all of this seems overly obsessed with concrete representations of data types.

The representation should not be exposed in the first place:

  you don't want to access it (=> make all fields private)
  you don't want to extend it (=> implementation inheritance is bad, interface inheritance is good.)

(read e. g. Introduction to Design Patterns by Gamma et al.)

You think it is a win to be able to write a function that takes "everything that has a foo :: Foo component"? I think it is not, since it is not robust design. It will only take records, and components have to be components. What if you later change the type's representation from a record to something else? If you change the component to a function?

If you want a reliable notion of "everything that has a foo :: Foo", then you need to declare an interface (erm, one parameter type class).

My point is that the OO community has learned all this stuff the hard way (from software problems arising from naive use of objects and inheritance), and it has taken them years, if not decades, and now it looks as if we are going to joyfully repeat this whole process.

An important selling point of the records proposal seems to be that you don't have to declare a type name for a record type. While I don't buy this whole idea (we have a declarative programming language but we want to avoid (type) declarations?) I see a concrete problem: what if you want to make such a nameless type an instance of some type class? Then we get all sorts of overlappings.

So with respect to the original post (see the subject of this email) I tend to agree: leave records as they are.

Of course they are problematic, but the main reason is not missing extensibility. As I see it, the problem is that the named component notation was added late and still allows access through the earlier positional notation, and the component names are in the (module-global) namespace. This would be more tolerable if we had ad-hoc overloading. Since we haven't, I'm now basically putting each data declaration in a separate module and importing these qualified. (This simulates the "per-type" namespace for components.)

Respectfully submitted,
--
-- Johannes Waldmann -- Tel/Fax (0341) 3076 6479/80 --
---- http://www.imn.htwk-leipzig.de/~waldmann/ -------
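
A minimal sketch of the declared interface described above, a one-parameter type class for "everything that has a foo :: Foo". The class name HasFoo and the two implementing types are hypothetical; one stores the component and one computes it, to show that the representation stays free to change.

```haskell
newtype Foo = Foo Int deriving (Show, Eq)

-- The declared interface: "everything that has a foo :: Foo".
class HasFoo a where
  foo :: a -> Foo

-- One implementor stores the component as a labelled field ...
data Stored = Stored { storedFoo :: Foo, storedTag :: String }

instance HasFoo Stored where
  foo = storedFoo

-- ... another computes it, so the component has become a function.
newtype Computed = Computed [Int]

instance HasFoo Computed where
  foo (Computed xs) = Foo (sum xs)

-- Client code depends only on the interface, not on the representation.
describe :: HasFoo a => a -> String
describe x = "foo is " ++ show (foo x)

main :: IO ()
main = do
  putStrLn (describe (Stored (Foo 3) "s"))
  putStrLn (describe (Computed [1, 2, 3]))
```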

I have similar concerns. All of the record proposals I have seen don't really seem to subsume the current haskell record system; the task is different. anonymous records feel like they fill a similar niche to anonymous tuples, and just like we don't declare everything as

    newtype Foo = Foo (Int,Char,Either Bar Baz)

it would seem odd that the only way to get named fields is to do the same thing but with records.

not to say that the record proposals won't be useful in their own right, but I wouldn't want to see them necessarily replace labeled fields.

it would also be unfortunate if the choice to use the (often cleaner) labeled fields implied a run-time cost and a change in representation; one shouldn't have to make decisions like that.

It feels to me that anonymous records should look more like the other anonymous(ish) type haskell provides, tuples. so the following syntax feels more natural

    (x = 3,y = 4)
    (x=3 | r)

where (perhaps) (x,y,z) is shorthand for (t1 = x,t2 = y,t3 = z)

this would also have the nice advantage that it is syntaxwise more backwards compatible, since labeled fields require a constructor argument to appear before the {}'s but tuples already don't. (and presumably a new record system wouldn't)

I think an issue is that we call the current system records, so we expect them to work like other languages' records, when in actuality, they are just labeled fields on standard haskell datatypes, but since that is true, we get all the power of standard haskell datatypes such as strict fields, summation types, the ability to declare instances and lots of typesafety. Haskell currently just doesn't have a record system; adding one shouldn't get rid of the unrelated and very useful labeled fields feature :)

John

--
John Meacham - ⑆repetae.net⑆john⑈
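
The features listed above as coming for free with labelled fields (strict fields, sum types, instance declarations) can be seen together in one small example. The type Shape and the class HasArea are hypothetical.

```haskell
-- Labelled fields on an ordinary sum type, with strictness annotations.
data Shape
  = Circle { radius :: !Double }
  | Rect   { width :: !Double, height :: !Double }
  deriving (Show, Eq)

-- A hypothetical class, to show that instances attach to the named type.
class HasArea a where
  area :: a -> Double

instance HasArea Shape where
  area (Circle r) = pi * r * r
  area (Rect w h) = w * h

main :: IO ()
main = do
  let r = Rect { width = 2, height = 3 }
  print (area r)
  print (r { height = 10 })  -- record update on one constructor of the sum
  print (width r)            -- field selector used as an ordinary function
```

Positional construction and matching (Rect w h) remain available next to the labels, since the labels are only an additional view of the same constructor.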

On 06/03/06, John Meacham wrote:
I have similar concerns. All of the record proposals I have seen don't really seem to subsume the current haskell record system, the task is different, anonymous records feel like they fill a similar niche to anonymous tuples, and just like we don't declare everything as
newtype Foo = Foo (Int,Char,Either Bar Baz)
it would seem odd that the only way to get named fields is to do the same thing but with records.
not to say that the record proposals won't be useful in their own right, but I wouldn't want to see them necessarily replace labeled fields.
it would also be unfortunate if the choice to use the (often cleaner) labeled fields implied a run-time and change in representation, one shouldn't have to make decisions like that.
It feels to me that anonymous records should look more like the other anonymous(ish) type haskell provides, tuples. so the following syntax feels more natural
(x = 3,y = 4) (x=3 | r)
where (perhaps) (x,y,z) is shorthand for (t1 = x,t2 = y,t3 = z)
this would also have the nice advantage that it is syntaxwise more backwards compatible, since labeled fields require a constructor argument to appear before the {}'s but tuples already don't. (and presumably a new record system wouldn't)
I think an issue is that we call the current system records, so we expect them to work like other languages' records, when in actuality, they are just labeled fields on standard haskell datatypes, but since that is true, we get all the power of standard haskell datatypes such as strict fields, summation types, the ability to declare instances and lots of type safety. Haskell currently just doesn't have a record system, adding one shouldn't get rid of the unrelated and very useful labeled fields feature :)
John
I agree, and I'd also say that the concrete syntax of Daan's proposal ought to still be up in the air, as the version in the paper uses TeX symbols which we don't have :)
From my viewpoint, there's nothing particularly wrong with the labelled fields in datatypes feature as it stands. Leaving them in without change would be fine. Haskell just seems to be missing a flexible way to deal with large product types, and a record system would fix that problem.
Normally, I'd say that one should more often design one's program so that large products aren't needed at any point, but there are cases where one is dealing with interfaces from applications written in languages where large products are common. (A lot of network applications spring to mind. Joel Reymont had this issue, definitely.)
- Cale

On 06/03/06, Johannes Waldmann wrote:
With respect to the discussion on records, let me throw in my usual warning: all of this seems overly obsessed with concrete representations of data types.
We already have mechanisms for abstraction. There's a gap in our ability to form certain concrete representations we might want. This paper simply describes how to add those representations to the language in a nice way.
The representation should not be exposed in the first place:
you don't want to access it (=> make all fields private)
you don't want to extend it (=> implementation inheritance is bad, interface inheritance is good.)
(read e.g. Introduction to Design Patterns by Gamma et al.)
You think it is a win to be able to write a function that takes "everything that has a foo :: Foo component"? I think it is not, since it is not robust design. It will only take records, and components have to be components. What if you later change the type's representation from a record to something else? If you change the component to a function? If you want a reliable notion of "everything that has a foo :: Foo", then you need to declare an interface (erm, one parameter type class).
Well, changing data representations is always inflexible. This isn't a new problem, and as you mentioned, you can still fix it with the use of typeclasses.
My point is that the OO community has learned all this stuff the hard way (from software problems arising from naive use of objects and inheritance), and it has taken them years, if not decades, and now it looks as if we are going to joyfully repeat this whole process.
Large product types normally indicate an awkward design, yes, but they're still implicit in many real-world interfaces, and it can be quite difficult to deal with them. This gives nice ways to break them up and work with them where they naturally occur.
An important selling point of the records proposal seems to be that you don't have to declare a type name for a record type. While I don't buy this whole idea (we have a declarative programming language but we want to avoid (type) declarations?) I see a concrete problem: what if you want to make such a nameless type an instance of some type class? Then we get all sorts of overlappings.
Well, I don't know about that. You don't have to declare a type name simply because all the types here already exist. You can still newtype them. However, not all record types are polymorphic. Declaring instances for completely specified rows would not be an issue. It's not clear to me that having instances for polymorphic records would be too much of an issue either. Yes, it would be easy to get overlaps, but not much more so than with existing polymorphic types. If there's more than one polymorphic instance, then of course you get overlap, because you can construct a record type with the union of the labels from the two instances. However are multiple row-polymorphic instances even needed? Due to the problem that records could very well satisfy both predicates in any situation like that, if you needed multiple instances, it would be better to newtype as usual.
So with respect to the original post (see the subject of this email) I tend to agree: leave records as they are. Of course they are problematic, but the main reason is not missing extensibility.
Well, the issue is just that Haskell does not actually have a record system. It has algebraic types, and while those can emulate certain aspects of records, they are not the same thing. The current "record syntax" is just syntax sugar for labelling the fields of a product in an algebraic type. It's nice syntax sugar, and I wouldn't want to get rid of it. (Though it could perhaps do with a renaming :)
As I see it, the problem is that the named component notation was added late and still allows to access the earlier positional notation, and the component names are in the (module-global) namespace.
The problem is that people see "record syntax" and think that somehow what they're declaring is any different from an ordinary product. The syntax gives you a little more capacity for dealing with more fields, and a little bit of future proofing, but not much more, and really it's the same thing, as the ability to use the positional notation indicates. Even with syntax sugar, using large product types in current Haskell is poor design. I'll illustrate one of the main reasons for this, and how extensible records can help fix that problem:
Suppose that A, B, and C are types and that we have:
data T = T {x :: A, y :: B, z :: C}
which we're trying to use to simulate a record type. Then any function
f :: T -> T
has the ability to read and depend on all the components of the T which it is working with. There are many cases where this is completely inappropriate, but restricting access to one or more of the components is difficult. We'd need to define a new typeclass with get/set functions, and use that instead. Doing this sort of thing for every one of the fields of every product one uses is obviously not a good solution.
On the other hand, a function:
f :: {x :: A, y :: B | r} -> {x :: A, y :: B | r}
obviously can only depend and act on the x and y components, and is not allowed to touch z at all. Sure, you might perhaps say that there's too much polymorphism there, but this usually isn't an issue, and there are still newtypes to tag things and ensure that they don't get into the wrong parts of the program. Record types would also be permitted as members of algebraic data types.
More flexible systems than just using products as records are possible using typeclasses with label types like HList, but these generally involve quite a lot of typeclass hackery which, while it's nice to see that it can be done, at some point begins to feel like an abuse of the system, when one could do a better treatment at the compiler level. Such systems still wouldn't have properties as nice as the record system in the paper. (There is no provision for associative or commutative data/type constructors.) A related issue is that these tend to be closer in performance to association lists, which means that while extension is fast, record selection is linear time.
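For concreteness, the typeclass workaround mentioned above (get/set functions per field) might look roughly like the following in current Haskell. This is only a sketch: the class names HasX and HasY, the function syncXWithY, and the concrete definitions of A, B and C are made up for illustration and are not from any proposal.

```haskell
-- Stand-in component types; the message leaves A, B and C abstract.
newtype A = A Int    deriving (Show, Eq)
newtype B = B String deriving (Show, Eq)
newtype C = C Bool   deriving (Show, Eq)

data T = T { x :: A, y :: B, z :: C } deriving (Show, Eq)

-- Hypothetical per-field classes: one get/set pair for each field
-- we want to expose selectively.
class HasX r where
  getX :: r -> A
  setX :: A -> r -> r

class HasY r where
  getY :: r -> B
  setY :: B -> r -> r

instance HasX T where
  getX = x
  setX a t = t { x = a }

instance HasY T where
  getY = y
  setY b t = t { y = b }

-- The argument type r is abstract here, so this function can only
-- reach the x and y components through the class methods; it has
-- no way to inspect or change z.
syncXWithY :: (HasX r, HasY r) => r -> r
syncXWithY r =
  let B s = getY r
  in  setX (A (length s)) r

main :: IO ()
main = print (syncXWithY (T (A 0) (B "abc") (C True)))
```

The boilerplate is the point: every field of every such product needs its own class and instance, which is what a row-polymorphic type like {x :: A, y :: B | r} would give for free.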
This would be more tolerable if we had ad-hoc overloading. Since we haven't, I'm now basically putting each data declaration in a separate module and import these qualified. (This simulates the "per-type" namespace for components.)
I think that ad-hoc overloading would be much more intolerable. In some cases the design you describe (separating a data type into a module) is appropriate, but I wouldn't hold myself to it. Usually I'd only use that if I planned to hide the constructors. A lot of the time, field labels can be renamed such that they don't overlap. Inventing new names is not hard work. (You can just put part or all of the type name in the labels, and you get basically the same effect as the module system gives you.)
- Cale
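As a small illustration of the naming convention suggested above (putting part of the type name in each label), here is a sketch with two made-up types, Person and Company, which would otherwise both want a field called "name". In Haskell 98 two such identically named fields in one module would be rejected as duplicate declarations.

```haskell
-- Hypothetical types; the prefix on each label plays the role that
-- a per-type namespace (or a module qualifier) would otherwise play.
data Person = Person
  { personName :: String
  , personAge  :: Int
  } deriving (Show, Eq)

data Company = Company
  { companyName  :: String
  , companyStaff :: [Person]
  } deriving (Show, Eq)

-- Both selectors live in the same module-global namespace
-- without clashing.
describe :: Company -> String
describe c =
  companyName c ++ ": " ++ unwords (map personName (companyStaff c))

main :: IO ()
main = putStrLn (describe (Company "Acme" [Person "Ann" 30, Person "Bob" 41]))
```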

Cale Gibbard wrote: (a thoughtful response, thank you) and ...
... field labels can be renamed such that they don't overlap. Inventing new names is not hard work.
Oh yes it is. I want meaningful names, and if the meaning of two things is identical, then inventing separate names is hard and unnecessary and misleading.
(You can just put part or all of the type name in the labels,
Ugly ugly ugly. By writing fooBar (for "the foo of Bar") I'm putting type or module information in a name. That's a bad idea because it bypasses the type or module system. I think I really want a separate component namespace per type, and I can only get this by putting each type in its own module, but then another problem comes up: how to name the module/the type? You see this in e. g. Data.Map: it contains the type Map, and says http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Map.html
This module is intended to be imported qualified, to avoid name clashes with Prelude functions. eg. import Data.Map as Map
but then what is the type of a map? It's Data.Map.Map (or, as the documentation suggests, Map.Map, which does not look any better).
Sometimes my conclusion is "module Foo where data Type = Make { ... }" because then "import qualified Foo ; x :: Foo.Type = Foo.Make ..." (in case I'll publish the constructor).
You see I want to avoid inventing a name for the type and its constructor (if there is only one) because I already have done it (it's the module name).
So .. what if we just allowed writing a data (or class?) declaration directly (instead of a module declaration)? E.g. the file "Foo.hs" contains "data Foo where (... constructors as in GADT ...) ; some_function :: ..." with the effect that after "import qualified Foo" from elsewhere we can write "x :: Foo; .. Foo.some_function ..."
Just an idea.
PS: GADTs are way cool! Any chance of having them in Haskell-Prime?
--
Johannes Waldmann -- Tel/Fax (0341) 3076 6479/80
http://www.imn.htwk-leipzig.de/~waldmann/
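To make the Data.Map case above concrete, this is roughly what the qualified-import style looks like in use. It is a sketch only; the names inventory and lookupCount are invented for the example. Note how the type has to be written Map.Map, which is the doubling complained about.

```haskell
import qualified Data.Map as Map

-- The module qualifier and the type name coincide, hence "Map.Map".
inventory :: Map.Map String Int
inventory = Map.fromList [("apples", 3), ("pears", 5)]

-- Functions read well when qualified (Map.findWithDefault),
-- which is what the qualified-import convention is aiming for.
lookupCount :: String -> Int
lookupCount k = Map.findWithDefault 0 k inventory

main :: IO ()
main = print (map lookupCount ["apples", "plums"])
```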
participants (11): Benjamin Franksen, Bulat Ziganshin, Cale Gibbard, Claus Reinke, Henrik Nilsson, Jim Apple, Johannes Waldmann, John Meacham, Lennart Augustsson, Malcolm Wallace, Ross Paterson