Return of the revenge of the revisit of the extensible records, reiterated

newer
hash tables, judy arrays, and more

Atze van der Ploeg

26 Nov 2013 26 Nov '13

5:09 p.m.

Dear all, Extensible records have been a long outstanding feature request for GHC. Using the new closed type families and type literals, it is actually possible to implement Daan Leijen's "`Extensible Records with Scoped Labels" system as a library. I have implemented this library at https://github.com/atzeus/openrec (the documentation is at http://homepages.cwi.nl/~ploeg/openrecvardocs/). The only thing it requires is the availability of a closed type family that compares two type level symbols (i.e. the ordering on strings): type family (m :: Symbol) <=.? (n :: Symbol) :: Bool patches to GHC that implement this built-in closed type family are also at the github site. I would like to generate some discussion about: * Is this the interface we would like for open records and variants? * Would it be worthwhile to invest in syntactic sugar for open record operations? * Any comments on the interface and or its implementation? Cheers! Atze van der Ploeg

Attachments:

attachment.html (text/html — 1.6 KB)

Show replies by date

adam vogt

26 Nov 26 Nov

9 p.m.

New subject: Return of the revenge of the revisit of the extensible records, reiterated

Hi Atze, The following code I come up with using your variants is unsatisfying for a couple reasons I explain below: case inj (Label :: Label "x") 1 of v | Left one <- decomp (Label :: Label "x") v -> one :: Double _ -> 5 -- aside: any guarantees about this being unreachable? First, decomp is like lookup to me, so I expected the Right constructor for when that lookup succeeds. Second, (Label :: Label "x") is a pretty unacceptable way to write what is just x in other languages. One idea would be to follow http://hackage.haskell.org/package/HList-0.3.0.1/docs/Data-HList-Labelable.h... and create values/labels which will do something like: x .=. 1 -- call to inj perhaps? v ^? x -- call to decomp Another idea is to make `x stand for (Label :: Label "x"), much like 'x and ''x in template haskell. Trying out a good syntax by using a quasiquoter or preprocessor before getting something into ghc is probably worth doing. One example that has not been as useful as originally thought it would be is: http://hackage.haskell.org/package/HList-0.3.0.1/docs/Data-HList-RecordPuns.... Third, I think the literal 1 needs a type annotation, but it should be possible for it to be constrained to Double. On a somewhat related note, would your strategy of having sorted labels give better compile times for for code which uses records that are a bit larger than a toy example: http://code.haskell.org/~aavogt/xmonad-hlist/ Regards, Adam Vogt On Tue, Nov 26, 2013 at 12:09 PM, Atze van der Ploeg wrote:

...

Dear all,

Extensible records have been a long outstanding feature request for GHC. Using the new closed type families and type literals, it is actually possible to implement Daan Leijen's "`Extensible Records with Scoped Labels" system as a library.

I have implemented this library at https://github.com/atzeus/openrec (the documentation is at http://homepages.cwi.nl/~ploeg/openrecvardocs/). The only thing it requires is the availability of a closed type family that compares two type level symbols (i.e. the ordering on strings): type family (m :: Symbol) <=.? (n :: Symbol) :: Bool patches to GHC that implement this built-in closed type family are also at the github site.

I would like to generate some discussion about: * Is this the interface we would like for open records and variants? * Would it be worthwhile to invest in syntactic sugar for open record operations? * Any comments on the interface and or its implementation?

Cheers!

Atze van der Ploeg

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Atze van der Ploeg

27 Nov 27 Nov

8:08 a.m.

New subject: Return of the revenge of the revisit of the extensible records, reiterated

Actually, after your email I played around with variants and it turns out they don't work at all. I'll rework the variants. The records work beatifully though.

...

First, decomp is like lookup to me, so I expected the Right constructor for when that lookup succeeds.

Agreed!

...

Second, (Label :: Label "x") is a pretty unacceptable way to write what is just x in other languages. One idea would be to follow < http://hackage.haskell.org/package/HList-0.3.0.1/docs/Data-HList-Labelable.h...

and create values/labels which will do something like:

x .=. 1 -- call to inj perhaps?

v ^? x -- call to decomp

Another idea is to make `x stand for (Label :: Label "x"), much like 'x and ''x in template haskell. Trying out a good syntax by using a quasiquoter or preprocessor before getting something into ghc is probably worth doing. One example that has not been as useful as originally thought it would be is: < http://hackage.haskell.org/package/HList-0.3.0.1/docs/Data-HList-RecordPuns....

Yes, that is currently the most painful bit of the syntax. It should be possible to adopt HList labelable. I would like a small syntactic extension that allows 'x for (Label :: Label "x") indeed. I'll probably hack this up later.

...

On a somewhat related note, would your strategy of having sorted labels give better compile times for for code which uses records that are a bit larger than a toy example: http://code.haskell.org/~aavogt/xmonad-hlist/

Depends, as far as I understand HList record sometimes require searching for a permutation of l such that l~l' which seems expensive to me. This is not necessary if we keep the row sorted. For projections and decompositions the performance is (theoretically) the same: linear searching in a list (sorted or unsorted list) is O(n). The real benefit of keeping the row sorted is that { x = 0 , y = 0 } and { y = 0, x = 0 } have the same type. When a row is not sorted, as in HList, then if we for example have an instance Eq for a row (because all elements support Eq) then for using (==) both arguments would have to the same order in the row or we need a manual call to a permutation function. When keeping the row ordered, this is not necessary. The same kind of problem occurs when we fix the type of a function to a specific row: ( using whishful syntax ) f :: Rec [ x = Int , y = Int ] -> Int If the row is not ordered, then f { y = 0 , x = 0 } will not typecheck and will require a manual call to permute the row. Cheers! Atze

...

Regards, Adam Vogt

On Tue, Nov 26, 2013 at 12:09 PM, Atze van der Ploeg

wrote:

...

...
Dear all,

Extensible records have been a long outstanding feature request for GHC. Using the new closed type families and type literals, it is actually possible to implement Daan Leijen's "`Extensible Records with Scoped Labels" system as a library.

I have implemented this library at https://github.com/atzeus/openrec(the documentation is at http://homepages.cwi.nl/~ploeg/openrecvardocs/). The only thing it requires is the availability of a closed type family that compares two type level symbols (i.e. the ordering on strings): type family (m :: Symbol) <=.? (n :: Symbol) :: Bool patches to GHC that implement this built-in closed type family are also at the github site.

I would like to generate some discussion about: * Is this the interface we would like for open records and variants? * Would it be worthwhile to invest in syntactic sugar for open record operations? * Any comments on the interface and or its implementation?

Cheers!

Atze van der Ploeg

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Carter Schonwald

9:24 a.m.

New subject: Return of the revenge of the revisit of the extensible records, reiterated

very cool work! Thanks for sharing! The ordering on symbols to get rid of the permutation issue is slick. you should perhaps also share on the ghc-devs list, or maybe throw your design ideas on a page on the ghc wiki, to document them? -Carter On Wed, Nov 27, 2013 at 3:08 AM, Atze van der Ploeg wrote:

...

Actually, after your email I played around with variants and it turns out they don't work at all. I'll rework the variants. The records work beatifully though.

...
First, decomp is like lookup to me, so I expected the Right constructor for when that lookup succeeds.

Agreed!

...
Second, (Label :: Label "x") is a pretty unacceptable way to write what is just x in other languages. One idea would be to follow < http://hackage.haskell.org/package/HList-0.3.0.1/docs/Data-HList-Labelable.h...

and create values/labels which will do something like:

x .=. 1 -- call to inj perhaps?

v ^? x -- call to decomp

Another idea is to make `x stand for (Label :: Label "x"), much like 'x and ''x in template haskell. Trying out a good syntax by using a quasiquoter or preprocessor before getting something into ghc is probably worth doing. One example that has not been as useful as originally thought it would be is: < http://hackage.haskell.org/package/HList-0.3.0.1/docs/Data-HList-RecordPuns....

Yes, that is currently the most painful bit of the syntax. It should be possible to adopt HList labelable. I would like a small syntactic extension that allows 'x for (Label :: Label "x") indeed. I'll probably hack this up later.

...
On a somewhat related note, would your strategy of having sorted labels give better compile times for for code which uses records that are a bit larger than a toy example: http://code.haskell.org/~aavogt/xmonad-hlist/

Depends, as far as I understand HList record sometimes require searching for a permutation of l such that l~l' which seems expensive to me. This is not necessary if we keep the row sorted. For projections and decompositions the performance is (theoretically) the same: linear searching in a list (sorted or unsorted list) is O(n).

The real benefit of keeping the row sorted is that { x = 0 , y = 0 } and { y = 0, x = 0 } have the same type. When a row is not sorted, as in HList, then if we for example have an instance Eq for a row (because all elements support Eq) then for using (==) both arguments would have to the same order in the row or we need a manual call to a permutation function. When keeping the row ordered, this is not necessary. The same kind of problem occurs when we fix the type of a function to a specific row: ( using whishful syntax ) f :: Rec [ x = Int , y = Int ] -> Int

If the row is not ordered, then f { y = 0 , x = 0 } will not typecheck and will require a manual call to permute the row.

Cheers!

Atze

...
Regards, Adam Vogt

On Tue, Nov 26, 2013 at 12:09 PM, Atze van der Ploeg

wrote:

...
...
Dear all,

Extensible records have been a long outstanding feature request for GHC. Using the new closed type families and type literals, it is actually possible to implement Daan Leijen's "`Extensible Records with Scoped Labels" system as a library.

I have implemented this library at https://github.com/atzeus/openrec(the documentation is at http://homepages.cwi.nl/~ploeg/openrecvardocs/). The only thing it requires is the availability of a closed type family that compares two type level symbols (i.e. the ordering on strings): type family (m :: Symbol) <=.? (n :: Symbol) :: Bool patches to GHC that implement this built-in closed type family are also at the github site.

I would like to generate some discussion about: * Is this the interface we would like for open records and variants? * Would it be worthwhile to invest in syntactic sugar for open record operations? * Any comments on the interface and or its implementation?

Cheers!

Atze van der Ploeg

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Atze van der Ploeg

9:43 a.m.

New subject: Return of the revenge of the revisit of the extensible records, reiterated

...

very cool work! Thanks for sharing! The ordering on symbols to get rid of the permutation issue is slick.

Thanks!

...

you should perhaps also share on the ghc-devs list, or maybe throw your

design ideas on a page on the ghc wiki, to document them? Good idea! I'll do that friday, i'll keep you posted.

...

On Wed, Nov 27, 2013 at 3:08 AM, Atze van der Ploeg

...

...
Actually, after your email I played around with variants and it turns

out they don't work at all. I'll rework the variants. The records work beatifully though.

...
...
First, decomp is like lookup to me, so I expected the Right constructor for when that lookup succeeds.

Agreed!

...
Second, (Label :: Label "x") is a pretty unacceptable way to write what is just x in other languages. One idea would be to follow <

http://hackage.haskell.org/package/HList-0.3.0.1/docs/Data-HList-Labelable.h...

...
...
and create values/labels which will do something like:

x .=. 1 -- call to inj perhaps?

v ^? x -- call to decomp

Another idea is to make `x stand for (Label :: Label "x"), much like 'x and ''x in template haskell. Trying out a good syntax by using a quasiquoter or preprocessor before getting something into ghc is probably worth doing. One example that has not been as useful as originally thought it would be is: < http://hackage.haskell.org/package/HList-0.3.0.1/docs/Data-HList-RecordPuns....

...
Yes, that is currently the most painful bit of the syntax. It should be

...

...
...
On a somewhat related note, would your strategy of having sorted labels give better compile times for for code which uses records that are a bit larger than a toy example: http://code.haskell.org/~aavogt/xmonad-hlist/

Depends, as far as I understand HList record sometimes require searching

for a permutation of l such that l~l' which seems expensive to me. This is not necessary if we keep the row sorted. For projections and decompositions

...

...
The real benefit of keeping the row sorted is that { x = 0 , y = 0 } and

{ y = 0, x = 0 } have the same type. When a row is not sorted, as in HList,

...

...
( using whishful syntax ) f :: Rec [ x = Int , y = Int ] -> Int

If the row is not ordered, then f { y = 0 , x = 0 } will not typecheck and will require a manual call to permute the row.

Cheers!

Atze

...
Regards, Adam Vogt

On Tue, Nov 26, 2013 at 12:09 PM, Atze van der Ploeg

wrote:

...
...
Dear all,

Extensible records have been a long outstanding feature request for GHC. Using the new closed type families and type literals, it is actually possible to implement Daan Leijen's "`Extensible Records with Scoped Labels" system as a library.

I have implemented this library at https://github.com/atzeus/openrec(the documentation is at http://homepages.cwi.nl/~ploeg/openrecvardocs/). The only thing it requires is the availability of a closed type family

wrote: possible to adopt HList labelable. I would like a small syntactic extension that allows 'x for (Label :: Label "x") indeed. I'll probably hack this up later. the performance is (theoretically) the same: linear searching in a list (sorted or unsorted list) is O(n). then if we for example have an instance Eq for a row (because all elements support Eq) then for using (==) both arguments would have to the same order in the row or we need a manual call to a permutation function. When keeping the row ordered, this is not necessary. The same kind of problem occurs when we fix the type of a function to a specific row: that

...

...
...
...
compares two type level symbols (i.e. the ordering on strings): type family (m :: Symbol) <=.? (n :: Symbol) :: Bool patches to GHC that implement this built-in closed type family are also at the github site.

I would like to generate some discussion about: * Is this the interface we would like for open records and variants? * Would it be worthwhile to invest in syntactic sugar for open record operations? * Any comments on the interface and or its implementation?

Cheers!

Atze van der Ploeg

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

adam vogt

6:37 p.m.

New subject: Return of the revenge of the revisit of the extensible records, reiterated

On Wed, Nov 27, 2013 at 3:08 AM, Atze van der Ploeg wrote:

...

Yes, that is currently the most painful bit of the syntax. It should be possible to adopt HList labelable. I would like a small syntactic extension that allows 'x for (Label :: Label "x") indeed. I'll probably hack this up later.

Did you mean to type 'x as opposed to `x? Using the former is going to make -XLabelQuotes (or whatever you like to call it) conflict with -XTemplateHaskell. I suppose you could make a class to disambiguate: class LeadingPrime s a where fromLeadingPrime :: Either String Language.Haskell.TH.Name -> Label s -> a -- standard 'name instance (name ~ Name) => LeadingPrime s name where fromLeadingPrime x _ = either error id x instance (s ~ s', label ~ Label) => LeadingPrime s (label s') where fromLeadingPrime _ x = x -- | this instance might go in HList... still -- you could get problems if another library, -- say Vinyl, also wants to do the same thing -- that doesn't fit in with the current Labelable instance (Labelable l p f s t a b, x ~ (a -> f b), y ~ (Record s -> f (Record t))) => LeadingPrime l (p x y) where fromLeadingPrime _ x = hLens' x The compiler would then replace 'x with fromLeadingPrime (Left "x not in scope") (Label :: Label "x"), or the Right contains the usual Name. This might have gone overboard with extensions. But I'm not sure you would be able to mix the following: $(varE 'x) -- normal template haskell \ record -> record ! 'x ! 'y -- 2nd instance \ record -> record^.'x.'y -- Labelable Another option would be to steal the leading backquote ` for Label only, which adds quite a bit of noise when you can't accept just a Label: \ record -> record^.hLens `x.hLens `y

...

...
On a somewhat related note, would your strategy of having sorted labels give better compile times for for code which uses records that are a bit larger than a toy example: http://code.haskell.org/~aavogt/xmonad-hlist/

Depends, as far as I understand HList record sometimes require searching for a permutation of l such that l~l' which seems expensive to me. This is not necessary if we keep the row sorted. For projections and decompositions the performance is (theoretically) the same: linear searching in a list (sorted or unsorted list) is O(n).

I see. I did a bit of a benchmark on compiling a module that just creates one record of size N, http://i.imgur.com/iiZwUgX.png. It's not exactly O(n^2) as residuals http://i.imgur.com/TGeq9Qx.png show. My guess is the check for duplicate labels is to blame for this bad performance. A record of size 100 might be absurd and probably most people have better CPUs than the Core(TM)2 Duo CPU T7100 @ 1.80GHz I used, but it's still an issue. I imagine your ordered labels will fix this slow compile issue, but I guess somebody actually has to try it out to see. The full code is something like http://code.haskell.org/~aavogt/HList-benchmark/ Regards, Adam

AntC

9:24 p.m.

New subject: [Haskell-cafe] Return of the revenge of the revisit of the extensible records, reiterated

...

Atze van der Ploeg writes:

Extensible records have been a long outstanding feature request for GHC.

Yes, that's because getting it right is difficult -- even if there were strong agreement on the objectives.

...

Using the new closed type families and type literals, it is actually possible to implement Daan Leijen's "`Extensible Records with Scoped Labels" system as a library.

Atze, I apologise for pouring cold water on your efforts, but the only merit of Leijen's proposal is that it's (relatively) easy to implement. It's been "actually possible" at least since HList showed the way. This from Leijen's abstract: "A novel aspect of this work is that records can contain duplicate labels, ..." tells why it's easy to implement. It also tells why it's completely pointless. So if [**] what you've implemented allows duplicate labels in a record, I do not want it. (In fact, I wouldn't call it a record system.) Please refer to text books on the Relational Model, as to why it's a BAD THING. It's a big weakness in SQL, for example, but SQL doesn't get it as badly wrong as Leijen's proposal. The difficult part of extensible records is exactly avoiding duplicate labels. TRex achieved it, but needed costly language extensions. HList achieves it, using a fragile combination of extensions (and giving impenetrable type errors if your program gets it wrong). I think our best hope of building something workable is when Richard's overlapping/closed type functions gets into the language (and we probably have to allow time for the wrinkles to get ironed out). [**] I can't see from the docos whether you do allow duplicate labels. Since you refer to Leijen, I assume so. If you don't, I suggest you avoid the reference. There's plenty of other (better) proposals out there. AntC

adam vogt

28 Nov 28 Nov

12:10 a.m.

New subject: Return of the revenge of the revisit of the extensible records, reiterated

On Wed, Nov 27, 2013 at 4:24 PM, AntC wrote:

...

The difficult part of extensible records is exactly avoiding duplicate labels. TRex achieved it, but needed costly language extensions. HList achieves it, using a fragile combination of extensions (and giving impenetrable type errors if your program gets it wrong).

Hi AntC, Why do you say that errors regarding duplicate instances are impenetrable? They are verbose, but the information you need is there if you ignore that "possible fix" stuff. That is the same issue with type classes you may be familiar with in haskell98. An example of that is what you get from: mean xs = sum xs / length xs The HList type error looks like: No instance for (Fail * (DuplicatedLabel Symbol "x")) arising from a use of `.*.' Possible fix: add an instance declaration for (Fail * (DuplicatedLabel Symbol "x")) In the expression: x .=. 1 .*. x .=. 2 .*. emptyRecord In an equation for `r': r = x .=. 1 .*. x .=. 2 .*. emptyRecord -- the file responsible: {-# LANGUAGE DataKinds #-} import Data.HList.CommonMain; import GHC.TypeLits x = Label :: Label "x" r = x .=. 1 .*. x .=. 2 .*. emptyRecord

...

I think our best hope of building something workable is when Richard's overlapping/closed type functions gets into the language (and we probably have to allow time for the wrinkles to get ironed out).

Atze's code uses them. Failing when you insert an element whose label already exists can be done by adding another case in the `type family Inject' (line 113 of OpenRecVar.hs) like: Inject l t (l := t ::: x) = Failure '("duplicated label", l) This Failure is the same trick as done with Fail in HList except translated to type families. Unfortunately the type family Failure only shows up when you use the `x' below, not when you define it (as happens in the previous example). {-# LANGUAGE DataKinds, TypeFamilies, PolyKinds, UndecidableInstances #-} import GHC.TypeLits -- accepted, but not usable x :: F Int Double x = undefined type family F a b where F a a = Bool F a b = Failure '("type",a, "should be equal to",b) type family Failure (a :: k) {- the "error" you get out of trying to use x is now something like: No instance for (Show (Failure ((,,,) Symbol * Symbol *) '("type", Int, "should be equal to", Double))) Or perhaps something like: Expected type: (blah blah) Actual type: Failure ( ... ) -} The same type error is also delayed if you make a closed type family for Failure, which might be a ghc bug. One promising idea is to make an infinite loop to bring the error to attention earlier: Context reduction stack overflow; size = 201 Use -fcontext-stack=N to increase stack size to N Fail ((,,,) Symbol * Symbol *) '("type", Int, "should be equal to", Double) (Fail Symbol "infinite loop to bring this to your attention: don't raise the context stack please" t0) ~ a0 In the expression: undefined In an equation for ‛x’: x = undefined -- the above message comes from: type family Fail (x::k) (a :: *) where Fail x a = Fail x (Fail "infinite loop to bring this to your attention: don't raise the context stack please" a) which gets used like: F a b = Fail '("type",a, "should be equal to",b) () Regards, Adam

AntC

2:22 a.m.

...

adam vogt writes:

...
On Wed, Nov 27, 2013 at 4:24 PM, AntC wrote: The difficult part of extensible records is exactly avoiding duplicate labels. TRex achieved it, but needed costly language extensions. HList achieves it, using a fragile combination of extensions (and giving impenetrable type errors if your program gets it wrong).

Why do you say that errors regarding duplicate instances are impenetrable?

Hi Adam, So do you think you're disproving me by showing a message "No instance for ..."? _Obviously_ "No instances" means duplicate instances -- NOT! Errm anyway, I didn't say "duplicate instances": I said errors re duplicate **labels**. And I'm talking about HList, not Atze's work. Yes, one of the (sources of) errors is alleged missing instances. Another is iresolvably undecidable instances. Another is type mis-matches. Often a single mistake in the program triggers a whole slew of errors. (There might be multiple different record types in the pgm, variously mashed together, as happens in real-life databases. Finding the one mistake can be hard work.) Don't get me wrong, HList is a hugely impressive piece of work. Especially given how 'primitive' was Haskell at the time. It relies on overlapping instances. And Oleg (at least) has repeatedly expressed doubts about the whole overlap business. (There's hints of that in the HList paper.) Whether or not you agree with him, as one of the authors, I think we should pay attention.

...

...
... Richard's overlapping/closed type functions ...

...

Atze's code uses them.

OK, good. (I didn't look at the code so much as at the stated design.) I did notice that Atze uses FunDeps as well as type functions. But that he doesn't use OverlappingInstances. (Or perhaps they're smuggled in via some of the imports?) I'm puzzled by Atze also needing extensions to the compiler to support closed DataKinds.

...

Failing when you insert an element whose label already exists can be done ...

Then PLEASE implement that. ...

...

... This Failure is the same trick as done with Fail in HList ...

... AND show that the resulting error handling is tractable. (If I remember, the authors of HList regarded that way of failing as a 'hack', and were not very happy. Using Richard's closed type functions it should be possible to avoid that hack -- which is why I was wanting to use them. So I'm more puzzled. OTOH, I did feed raise that unhappy hack with Richard at the time, as a counter-example that I couldn't understand how he'd handle. So perhaps he didn't.) And PLEASE say that records _can't_ contain duplicate labels. And PLEASE don't mention Leijen's paper. Instead: compare/contrast the semantics to TRex, and to MPJ/SPJ's 'Lightweight Extensible' proposal (which was found to be a bit too heavyweight to implement ;-) AntC

adam vogt

4:16 a.m.

New subject: Return of the revenge of the revisit of the extensible records, reiterated

On Wed, Nov 27, 2013 at 9:22 PM, AntC wrote:

...

...
adam vogt writes:

...
On Wed, Nov 27, 2013 at 4:24 PM, AntC wrote: So do you think you're disproving me by showing a message "No instance for ..."? _Obviously_ "No instances" means duplicate instances -- NOT!

Errm anyway, I didn't say "duplicate instances": I said errors re duplicate **labels**.

Hi AntC, Sorry. I typed "duplicate instances" while I intended to write "duplicate labels". The example that follows is about error messages when you try to make a record with duplicate labels.

...

And I'm talking about HList, not Atze's work.

Me too (at that point). I have not built a ghc with the necessary modifications to try out the openrec.

...

...
Atze's code uses them.

OK, good. (I didn't look at the code so much as at the stated design.) I did notice that Atze uses FunDeps as well as type functions. But that he doesn't use OverlappingInstances. (Or perhaps they're smuggled in via some of the imports?)

Closed type families (type family Foo a where Foo () = ..; ...) have equations that behave like overlapping instances. To use them you just need ghc-7.7 (or wait a few weeks for ghc-7.8) and just enable -XTypeFamilies.

...

...
... This Failure is the same trick as done with Fail in HList ...

... AND show that the resulting error handling is tractable. (If I remember, the authors of HList regarded that way of failing as a 'hack', and were not very happy. Using Richard's closed type functions it should be possible to avoid that hack -- which is why I was wanting to use them. So I'm more puzzled. OTOH, I did feed raise that unhappy hack with Richard at the time, as a counter-example that I couldn't understand how he'd handle. So perhaps he didn't.)

Do you recall where that discussion was? The two versions of type family Failure I give are less satisfying than that the (class Fail a). I don't know of a type-level `error' besides the three approaches that have been named in this thread. Sometimes you might be able to re-work some things to get the type-level equivalent of a pattern match failure. Though it seems ghc isn't eager enough to point out "you're using a closed type family in a way that I will never be able to satisfy". For example: f x y | x == y = () -- becomes type family F x y where F x x = () But I'm not sure how to translate g x y | x /= y = () Besides re-writing it as g' x y | x == y = error "mistake" | otherwise = () Which translates to type family G' x y where G' x x = Failure "mistake" () G' x y = ()

...

And PLEASE say that records _can't_ contain duplicate labels.

I suppose there's one hole left. You're allowed to compile the following using HList: boom = boom :: Record [LVPair "x" Int, LVPair "x" Double] But I'm not sure that error is common enough to justify sticking a HRLabelSet constraint on every operation in the Record.hs. And even then you'd still be allowed boom in your program. Another idea is to copy the concept of a "smart constructor" for the type level. I don't think that is practical until HRLabelSet can be written as a type family. Doing the conversion means dropping ghc-7.6 support, since you can't write TypeEq as a type family without closed type families. I will wait until 7.8 is out (and a better Failure type family) before trying to re-write that bit. Regards, Adam

AntC

5:32 a.m.

...

adam vogt writes:

...
On Wed, Nov 27, 2013 at 9:22 PM, AntC wrote:

Closed type families ... wait a few weeks for ghc-7.8)

Yes, I'm in eager anticipation!

...

...
... OTOH, I did feed raise that unhappy hack with Richard at the time, as a counter-example that I couldn't understand how he'd handle. So perhaps he didn't.)

Do you recall where that discussion was?

http://typesandkinds.wordpress.com/2013/04/29/coincident-overlap-in-type- families/ My message on June 24. (It also mentions discussion on GHC-Users.) This is exactly an example of a records mechanism with duplicate 'labels'. Actually being what the HList paper calls Type-Indexed Products. Note that the code I give there works beautifully with old-fashioned overlapping instances; and produces a helpful error message without any additional ancillary Fail classes. It works because my records are tuples, _not_ HLists. So instance resolution is working with a 'flat' structure where it can see all the types. Contrast that HCons effectively hides types in its tail. I suspect that Richard's implementation also effectively hides potential duplicates by putting them in a later 'case' of the type family.

...

... Though it seems ghc isn't eager enough ...

This is what I find most frustrating with ghc (as contrasted with dear old Hugs): instance failures are 'lazy'. You have to backtrack through a lot of code/other modules to figure out what's going wrong. I was hoping that with Richard's work, instance validation could be eager: reject the instance definition as overlapping _at_the_point_of_definition_. I think that a long time ago ghc took a wrong turn and allowed partially overlapping instances. It therefore has to wait until it finds a usage to see if it is actually ambiguous. I think that partially overlapping instances should be banned. Instances should be either disjunctive or wholly overlapping. (Note that you can always reorganise a partial overlap to fit this rule.)

...

... Which translates to

type family G' x y where G' x x = Failure "mistake" () G' x y = ()

That's exactly the unhappy hack I was wanting to avoid. If you still have to do that with closed type families, then I'm disappointed. I wanted disequality restraints. (See the Sulzmann and Stuckey paper I mention earlier on Richard's discussion page.) Which would be one single stand-alone instance: type instance G'' x y | x /~ y = () (I also think this has more perspicuous type inference rules, and fits better with a mental model of all the instances 'competing' to be chosen at the use site; with type improvement progressing until exactly one matches. This does not involve instance search, which as SPJ points out would be death to coherence.) Cheers AntC

Richard Eisenberg

2 Dec 2 Dec

4:48 p.m.

A few reactions to this thread, in no particular order: - Atze, thanks for doing this work! Powering extensible records was one of the use cases I had for closed type families in the back of my head, and I'm glad someone is going ahead with this! - Closed type families, as implemented, do nothing to help with error messages that are produced by missing instances (class or type family) or missing equations of a closed type family. For a type family, if no instances/equations are applicable, the use of the type family is "stuck" -- it doesn't reduce. Detecting stuckness using an in-language mechanism risks type safety, because stuckness might change either with type improvement or (in the case of open families) by the inclusion of another module. - It is conceivable that further work can make closed type families better in this regard. For example, I could imagine a check on the use site of a closed type family that could tell that the use would *never* reduce, regardless of any type improvement that might take place. I can also imagine implementing a customizable error message to use in that case. However, these additions aren't on my current docket of work. I'd be happy to point the way forward to someone else who is interested! - I don't see how disequality guards would help with error messages. - The notion of "competing instances" for type families sounds somewhat like delayed overlap checks for class instances. For example, consider these definitions:

...

module A where type instance F a b c | b /~ c = Int

...

module B where import A type instance F a b c | a /~ c = Bool

Questions: 1. Are these instances overlapping? 2. How does `F Int Bool Char` reduce when written in module A? 3. How does `F Int Bool Char` reduce when written in module B? Pondering the answers to these questions makes me doubt this approach. On the other hand, perhaps we could require that all instances of a type family that have disequality guards are given in the same module. Then, the reduction of something like `F Int Bool Int` is hard to figure out. In short, I'm not sure what is gained by this approach. (Disclaimer: it is easy enough to find very counter-intuitive examples using closed type families as implemented. I ask these questions more to ponder what an alternative might look like than to say that closed type families are perfect.) - While I agree that closed type families don't fit as well with the mental model of competing instances, I believe they work better with programmers' sensibilities at the term level. One could argue which is better to aim for. - Banning instance overlap that is currently permitted would break existing code. Is this a big enough wart to incite a breaking change? Also, what would happen if two different packages define instances (mainly for internal use) that conflict? These instances would necessarily be orphans. Could these packages then be used together? Perhaps this case never comes up in practice, but it would worry me. Richard On Nov 28, 2013, at 12:32 AM, AntC wrote:

...

...
adam vogt writes:

...
On Wed, Nov 27, 2013 at 9:22 PM, AntC wrote:

Closed type families ... wait a few weeks for ghc-7.8)

Yes, I'm in eager anticipation!

...
...
... OTOH, I did feed raise that unhappy hack with Richard at the time, as a counter-example that I couldn't understand how he'd handle. So perhaps he didn't.)

Do you recall where that discussion was?

http://typesandkinds.wordpress.com/2013/04/29/coincident-overlap-in-type- families/ My message on June 24. (It also mentions discussion on GHC-Users.)

This is exactly an example of a records mechanism with duplicate 'labels'. Actually being what the HList paper calls Type-Indexed Products.

Note that the code I give there works beautifully with old-fashioned overlapping instances; and produces a helpful error message without any additional ancillary Fail classes.

It works because my records are tuples, _not_ HLists. So instance resolution is working with a 'flat' structure where it can see all the types. Contrast that HCons effectively hides types in its tail. I suspect that Richard's implementation also effectively hides potential duplicates by putting them in a later 'case' of the type family.

...
... Though it seems ghc isn't eager enough ...

This is what I find most frustrating with ghc (as contrasted with dear old Hugs): instance failures are 'lazy'. You have to backtrack through a lot of code/other modules to figure out what's going wrong.

I was hoping that with Richard's work, instance validation could be eager: reject the instance definition as overlapping _at_the_point_of_definition_.

I think that a long time ago ghc took a wrong turn and allowed partially overlapping instances. It therefore has to wait until it finds a usage to see if it is actually ambiguous.

I think that partially overlapping instances should be banned. Instances should be either disjunctive or wholly overlapping. (Note that you can always reorganise a partial overlap to fit this rule.)

...
... Which translates to

type family G' x y where G' x x = Failure "mistake" () G' x y = ()

That's exactly the unhappy hack I was wanting to avoid. If you still have to do that with closed type families, then I'm disappointed.

I wanted disequality restraints. (See the Sulzmann and Stuckey paper I mention earlier on Richard's discussion page.)

Which would be one single stand-alone instance:

type instance G'' x y | x /~ y = ()

(I also think this has more perspicuous type inference rules, and fits better with a mental model of all the instances 'competing' to be chosen at the use site; with type improvement progressing until exactly one matches. This does not involve instance search, which as SPJ points out would be death to coherence.)

Cheers AntC

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

AntC

3 Dec 3 Dec

5:17 a.m.

New subject: [Haskell-cafe] Return of the revenge of the revisit of the extensible records, reiterated

...

Richard Eisenberg writes:

A few reactions to this thread, in no particular order:

Thank you Richard for your comprehensive reply. There's been so much e-ink spilt on the subject over the years, I don't really want to repeat it, so a few key points (in reverse of your order ;-) ...

...

- Banning instance overlap that is currently permitted would break existing code. Is this a big enough wart to incite a breaking change?

Errm, no: overlap is not "currently permitted". Overlap is not part of H98. Overlap is not sanctioned as part of H2010. Overlap is not even on the slate for Haskell Prime. Its status is clearly marked as experimental/unsafe. MTL was reorganised to remove Overlap. There's various trickery in the base classes to avoid overlap. (Such as showlist.) That said, I know there's plenty of code out there. (I've written it myself.) I think there is very little code with genuinely partial overlaps, as opposed to total overlaps. (And total overlaps are easily turned into disequality guards.) The difficulties that Oleg & Ralf ran into with overlaps, it turns out can be made to work -- even in Hugs. That is, if you use overlaps in a very disciplined way. That way amounts to disequality guards. I wish I had the time and competence to implement the guards idea. Then we could experiment with the issues under debate ...

...

- While I agree that closed type families don't fit as well with the mental model of competing instances, I believe they work better with programmers' sensibilities at the term level. One could argue which is better to aim for.

So I'm going to argue ;-): - Guards are already familiar at the term level. - I'm not convinced that instance selection is ever-so like the term level. (For example, you've introduced some look-ahead coincidence-checking, to improve the chances for closed families.) - We're already used to widely-scattered definitions of class methods at the term level. - I'd love Haskell to support widely-scattered definitions of functions (It's sometimes a pain to have to put all of the patterns together in sequence.) - (IOW, let's make terms more like instances, rather than v.v.)

...

- The notion of "competing instances" for type families sounds somewhat like delayed overlap checks for class instances. ...

No. No delay. The point is to validate eagerly at the point of declaring the instance (and whether or not it's in some other module). The guards guarantee that no instances overlap. Importing an overlapping instance is trapped immediately; no risk of incoherence. Then we know that any usage cannot possibly match more than one instance. We can let all instances compete. Whichever one 'wins', we can be sure it's the only one. (This of course doesn't guarantee there will be a winner at all; instance matching can still get 'stuck', as you say.)

...

For example, consider these definitions:

...
module A where type instance F a b c | b /~ c = Int

...
module B where import A type instance F a b c | a /~ c = Bool

Questions: 1. Are these instances overlapping?

Yes, so compile fails at the instances. No need to look at usages. (I gave the rules by which it would fail in some message way back; it is algorithmic, and based on what I think is existing ghc behaviour for overlaps.) So your follow-on questions don't apply

...

2. How does `F Int Bool Char` reduce when written in module A? 3. How does `F Int Bool Char` reduce when written in module B?

...

- I don't see how disequality guards would help with error messages.

OK, the example is to check that there's exactly one occurence of a type- level label in an HList. This is hard, so Leijen's approach doesn't try. Having found an instance with the label, HList uses an auxiliary class `Lacks` to validate that the label doesn't appear in the Hlist's tail, like this: instance (Lacks lab l') => Has lab (HCons (lab, val) l') How to code `Lacks`? instance Lacks lab HNil -- got to the end of the tail OK instance (Lacks lab l') => Lacks lab (HCons (lab2, val) l') -- not here, recurse on the tail But that doesn't work! the instance matches even when `lab` is the same as `lab2`. We need an extra instance, which we want to treat as failure: instance (Fail SomeMessage) => Lacks lab (HCons (lab, val) l') -- repeated typevar ^^^ Deliberately to trigger the failure, we make sure there's no instance for `Fail`. So the compiler error message is 'No instance for Fail ...'; but it _means_ there _is_ an instances for `Lacks`. With a disequality guards we'd have only one instance: instance (Lacks lab l') => Lacks lab (HCons (lab2, val) l') | lab /~ lab2 And the compiler message would be 'No instance for Lacks L1 (HCons (L1, Int) ...)`. We can see the repeated label. Now I concede that all this design involves a kinda double-negative. At least we get focus on the class causing the failure. AntC

John Lato

5:43 a.m.

New subject: Return of the revenge of the revisit of the extensible records, reiterated

(a bit late to the discussion, so please ignore if this is completely off-base) On Mon, Dec 2, 2013 at 9:17 PM, AntC wrote:

...

...
Richard Eisenberg writes:

- The notion of "competing instances" for type families sounds somewhat like delayed overlap checks for class instances. ...

No. No delay. The point is to validate eagerly at the point of declaring the instance (and whether or not it's in some other module). The guards guarantee that no instances overlap. Importing an overlapping instance is trapped immediately; no risk of incoherence.

How can this possibly work with open type families? What happens in this case?

...

module A where type instance F a b c | b /~ c = Int

...

module B where type instance F a b c | a /~ c = Bool

During compilation, neither A nor B is aware of the other. What happens in a module that imports both?

AntC

7:25 a.m.

...

John Lato writes:

...

On Mon, Dec 2, 2013 at 9:17 PM, AntC wrote:

...
... Importing an overlapping instance is trapped immediately; no risk of incoherence.

How can this possibly work with open type families? What happens in this case?

...
module A where type instance F a b c | b /~ c = Int module B where type instance F a b c | a /~ c = Bool

During compilation, neither A nor B is aware of the other. What happens in a module that imports both?

Thanks John, a good use case! The trapping is needed with imports for any approach to open instances (not just type families). Suppose I have NoOverlappingInstances everywhere:

...

module A where instance C a b c where ... module B where instance C a b c where ... module D where instance C Int Bool Char where ...

And a module that imports all three. Any importer has to validate all instances sometime or other. (Currently ghc sticks its head in the sand, and hopes there won't be a usage that trips over the ambiguity.) All we're talking about is _when_ we validate. I'd rather know at the point of declaring the instance, or of importing the instance. AntC

Atze van der Ploeg

5 Dec 5 Dec

2:38 p.m.

New subject: Return of the revenge of the revisit of the extensible records, reiterated

Dear all, I have written a document describing the design, usage and motivation for these extensible records (now called CTRex, a nod to Trex). It is available at: ====> http://www.haskell.org/haskellwiki/CTRex <====== Improvements, comments and questions welcome! Below are some long overdue responses to comments in this post. Cheers! Atze ================================================== Oleg: You said:

...

This is all very true. However, if we wish to pass the function f above the record {y=0, x=0} (with permuted fields), we most likely wish to pass that function a record {x=0, y=0, z='a'} with more fields. Most of the time when we deal with extensible records, we really wish to explore their extensibility. Keeping fields sorted does not help at all in the latter problem -- we must manually insert the call to the subtyping coercion function. Once we do that, the problem with the order of the fields disappears.

In my design such coercion calls are unnecessary, or am I missing something here?

...

I also would like to point out that there are two sorts of record types. One sort is Rec [x: Int, y:Bool] in the imagine syntax. Current HList types are uglier versions of the above. But there is another sort: (HasField r x Int, HasField r y Bool) => r It represents an extensible record type. Extensibility is build in, and the order of the fields is immaterial. Quite a few functions on records can be given the above type. Furthermore, the second sort can be used not only with structural subtyping but also with nominal subtyping.

I agree. There is a difference between systems with records and row polymorphic records (if I understand you correctly). My system supports both, the former is written as: Rec ("x" ::= Int .| "y" ::= Bool .| Empty) the latter is written as Rec ("x" ::= Int .| "y" ::= Bool .| r) where r is the rest of the row, not the whole row as you write it. Notice that .| is a type level *function* (not a constructor!) that inserts a label-type value into the row. Hence Rec ("x" ::= Int .| "y" ::= Bool .| Empty) ~ Rec ("y" ::= Bool .| "x" ::= Int .| Empty) I do not understand you point about nominal subtyping, how can we nominally subtype r? It has no name? I can only imagine structural subtyping on records. ===================================================== Adam Vogt: Could you check whether the comparison with HList records is accurate? Check http://www.haskell.org/haskellwiki/CTRex#Comparison_with_other_approaches . Thanks! ============================================= AntC: You seem to have strong opinions on why allowing duplicate labels is a bad thing. Currently, my design supports both extension with scoping and the lacks predicate (see http://www.haskell.org/haskellwiki/CTRex#Duplicate_labels.2C_and_lacks). I think duplicate labels are nice in some situations and bad in other situations. 2013/12/3 AntC

...

...
John Lato writes:

...
On Mon, Dec 2, 2013 at 9:17 PM, AntC wrote:

...
... Importing an overlapping instance is trapped immediately; no risk of incoherence.

How can this possibly work with open type families? What happens in this case?

...
module A where type instance F a b c | b /~ c = Int module B where type instance F a b c | a /~ c = Bool

During compilation, neither A nor B is aware of the other. What happens in a module that imports both?

Thanks John, a good use case!

The trapping is needed with imports for any approach to open instances (not just type families). Suppose I have NoOverlappingInstances everywhere:

...
module A where instance C a b c where ... module B where instance C a b c where ... module D where instance C Int Bool Char where ...

And a module that imports all three. Any importer has to validate all instances sometime or other.

(Currently ghc sticks its head in the sand, and hopes there won't be a usage that trips over the ambiguity.)

All we're talking about is _when_ we validate. I'd rather know at the point of declaring the instance, or of importing the instance.

AntC

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

adam vogt

5:01 p.m.

New subject: Return of the revenge of the revisit of the extensible records, reiterated

On Thu, Dec 5, 2013 at 9:38 AM, Atze van der Ploeg wrote:

...

Adam Vogt:

Could you check whether the comparison with HList records is accurate? Check http://www.haskell.org/haskellwiki/CTRex#Comparison_with_other_approaches . Thanks!

Hi Atze, I noted there is an equivalent of "constrained record operations" in HList, but it's less convenient than your Forall class. Regards, Adam

AntC

6 Dec 6 Dec

12:31 a.m.

...

Atze van der Ploeg writes:

(see http://www.haskell.org/haskellwiki/CTRex#Duplicate_labels.2C_and_lacks). I think duplicate labels are nice in some situations and bad in other situations.

Thank you Atze for a well-written description. I think there might be a couple of typos there? (c, r'') = decomp r x -- rhs s/b: decomp r' x ?? extendUnique :: (..., l :\r ) => ... -- s/b: r :\ l ?? Your "motivation" example is hard to follow without knowing what `decomp` does. (IOW, it is not showing me a motivation ;-) I'm puzzled by this in the implementation notes 4.2 Records: "Here we see that a record is actually just a map from string to the sequence of values. Notice that it is a sequence of values and not a single value, because the record may contain duplicate labels." It sounds like there's an overhead in being able to support duplicate labels (even if I don't want duplicates in my records)? Is there a performance penalty at run-time with extending/prepending and restricting/pretruncating, to keep the invariant re the i-th value? Leijen allowed duplicate labels to make a virtue of necessity IMO. There has not been an extensible records proposal before or since for duplicate labels. (TRex certainly didn't do it.) His 'necessity' was ease of implementation. This sequence of values stuff seems to make a more difficult implementation for the sake of providing a 'feature' that nobody's asked for(?) There's one 'advanced feature' of extensible records that I'd be interested in: merging records by label, as is done for 'Natural Join'. a row with labels {x, y, z} merge labels {y, z, w} returns a Maybe row with {x, y, z, w} providing the types paired with y and z are the same and the values are the same (otherwise return Nothing) It's absolutely essential _not_ to duplicate labels in this case. AntC

AntC

2:50 a.m.

New subject: [Haskell-cafe] Return of the revenge of the revisit of the extensible records, reiterated

...

AntC writes:

Actually, I think there's more wrong with that line than a typo:

...

extendUnique :: (..., l :\r ) => ... -- s/b: r :\ l ??

(It's supposed to do renaming with non-duplicate labels?) Talking of renaming, how does it go with duplicate labels? The comment on `rename` says it can be expressed using the "above operations" (presumably restrict followed by extend with the new label, as per Gastar&Jones and Leijen). If that's genuinely equivalent, then rename will 'unhide' any duplicate label. So presumably the implementation must split the HashMap into two keys, rather than changing the label on the existing Seq(?) AntC

Atze van der Ploeg

10:04 a.m.

New subject: Return of the revenge of the revisit of the extensible records, reiterated

...

Thank you Atze for a well-written description.

Cheers! :)

...

I think there might be a couple of typos there?

Sadly, due to the finiteness of life I cannot guarantee perfection in all my communication. However, if you see typos, I would greatly appreciate it if you fix them (it's a wiki). :)

...

(c, r'') = decomp r x -- rhs s/b: decomp r' x ?? Your "motivation" example is hard to follow without knowing what `decomp` does. (IOW, it is not showing me a motivation ;-)

Woops! Sorry! I messed up the example, i've changed it now to: g :: Rec r -> Rec ("p" ::= String .| r) g r = let r' = f (x := 10 .| r) (c,r'') = (r'.!x, r' .- x) v = if c then "Yes" else "Nope" in p := v .| r''

...

I'm puzzled by this in the implementation notes 4.2 Records:

...

"Here we see that a record is actually just a map from string to the sequence of values. Notice that it is a sequence of values and not a single value, because the record may contain duplicate labels."

...

It sounds like there's an overhead in being able to support duplicate labels (even if I don't want duplicates in my records)? Is there a performance penalty at run-time with extending/prepending and restricting/pretruncating, to keep the invariant re the i-th value?

Well, supposing you have no duplicate labels, then all sequences are of length 1. Hence the overhead is that we have a sequence of length 1 instead of just a value (i.e. one extra reference to follow). This is a very small overhead, and in my opinion is justified by the advantages of allowing duplicate labels. Notice also that (.!) always accesses the head of the sequence, since only the leftmost label is accessible (to access shadowed labels, restrict the record with that label).

...

Leijen allowed duplicate labels to make a virtue of necessity IMO. There has not been an extensible records proposal before or since for duplicate labels. (TRex certainly didn't do it.) His 'necessity' was ease of implementation. This sequence of values stuff seems to make a more difficult implementation for the sake of providing a 'feature' that nobody's asked for(?)

Well, I think Leijen makes two points: * Duplicate labels are nice and allow shadowing in records, which is good. * Duplicate labels allow us to construct a type system lacking a "lacks" predicate, which makes it simpler. I am mainly interested in the first point, see my example. As I said, whether you want duplicate labels depends on the situation. As another use case for duplicate labels: consider implementing an interpreter for some embedded DSL, and you want to carry the state of the variables in the an extensible record. Declaring a new variable in the embedded language then causes us to extend the record. Since the embedded language allows shadowing (as most languages do), we can simply extend the record, we do not have to jump through hoops to make sure there are no duplicate labels. Once the variable goes out of scope, we remove the label again to bring the old "variable" into scope.

...

Actually, I think there's more wrong with that line than a typo:

extendUnique :: (..., l :\r ) => ... -- s/b: r :\ l ??

(It's supposed to do renaming with non-duplicate labels?)

Sorry! Another mistake, I've fixed it. It is now as follows: renameUnique :: (KnownSymbol l, KnownSymbol l', r :\ l') => Label l -> Label l' -> Rec r -> Rec (Rename l l' r)

...

Talking of renaming, how does it go with duplicate labels? The comment on `rename` says it can be expressed using the "above operations" (presumably restrict followed by extend with the new label, as per Gastar&Jones and Leijen).

...

If that's genuinely equivalent, then rename will 'unhide' any duplicate label. So presumably the implementation must split the HashMap into two keys, rather than changing the label on the existing Seq(?)

Yes, exactly. Renaming is implemented as follows, which is equivalent to what you said: rename :: (KnownSymbol l, KnownSymbol l') => Label l -> Label l' -> Rec r -> Rec (Rename l l' r) rename l l' r = extend l' (r .! l) (r .- l) renameUnique :: (KnownSymbol l, KnownSymbol l', r :\ l') => Label l -> Label l' -> Rec r -> Rec (Rename l l' r) renameUnique = rename 2013/12/6 AntC

...

...
Atze van der Ploeg writes:

(see http://www.haskell.org/haskellwiki/CTRex#Duplicate_labels.2C_and_lacks). I think duplicate labels are nice in some situations and bad in other situations.

Thank you Atze for a well-written description.

I think there might be a couple of typos there?

(c, r'') = decomp r x -- rhs s/b: decomp r' x ??

extendUnique :: (..., l :\r ) => ... -- s/b: r :\ l ??

Your "motivation" example is hard to follow without knowing what `decomp` does. (IOW, it is not showing me a motivation ;-)

I'm puzzled by this in the implementation notes 4.2 Records: "Here we see that a record is actually just a map from string to the sequence of values. Notice that it is a sequence of values and not a single value, because the record may contain duplicate labels."

It sounds like there's an overhead in being able to support duplicate labels (even if I don't want duplicates in my records)? Is there a performance penalty at run-time with extending/prepending and restricting/pretruncating, to keep the invariant re the i-th value?

Leijen allowed duplicate labels to make a virtue of necessity IMO. There has not been an extensible records proposal before or since for duplicate labels. (TRex certainly didn't do it.) His 'necessity' was ease of implementation. This sequence of values stuff seems to make a more difficult implementation for the sake of providing a 'feature' that nobody's asked for(?)

There's one 'advanced feature' of extensible records that I'd be interested in: merging records by label, as is done for 'Natural Join'.

a row with labels {x, y, z} merge labels {y, z, w} returns a Maybe row with {x, y, z, w} providing the types paired with y and z are the same and the values are the same (otherwise return Nothing)

It's absolutely essential _not_ to duplicate labels in this case.

AntC

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Atze van der Ploeg

10:13 a.m.

New subject: Return of the revenge of the revisit of the extensible records, reiterated

You might also wonder why wonder why I use a Sequence instead of a List, since we only query the head and prepend. This is implement record merge (.+) more efficiently since we can then use (><) (O(1)) instead of (++) (O(n)) as follows: (.++) :: Rec l -> Rec r -> Rec (l :++ r) (OR l) .++ (OR r) = OR $ M.unionWith (><) l r 2013/12/6 Atze van der Ploeg

...

...
Thank you Atze for a well-written description.

Cheers! :)

...
I think there might be a couple of typos there?

Sadly, due to the finiteness of life I cannot guarantee perfection in all my communication. However, if you see typos, I would greatly appreciate it if you fix them (it's a wiki). :)

...
(c, r'') = decomp r x -- rhs s/b: decomp r' x ?? Your "motivation" example is hard to follow without knowing what `decomp` does. (IOW, it is not showing me a motivation ;-)

Woops! Sorry! I messed up the example, i've changed it now to:

g :: Rec r -> Rec ("p" ::= String .| r) g r = let r' = f (x := 10 .| r) (c,r'') = (r'.!x, r' .- x) v = if c then "Yes" else "Nope" in p := v .| r''

...
I'm puzzled by this in the implementation notes 4.2 Records:

...
"Here we see that a record is actually just a map from string to the sequence of values. Notice that it is a sequence of values and not a single value, because the record may contain duplicate labels."

...
It sounds like there's an overhead in being able to support duplicate labels (even if I don't want duplicates in my records)? Is there a performance penalty at run-time with extending/prepending and restricting/pretruncating, to keep the invariant re the i-th value?

Well, supposing you have no duplicate labels, then all sequences are of length 1. Hence the overhead is that we have a sequence of length 1 instead of just a value (i.e. one extra reference to follow). This is a very small overhead, and in my opinion is justified by the advantages of allowing duplicate labels. Notice also that (.!) always accesses the head of the sequence, since only the leftmost label is accessible (to access shadowed labels, restrict the record with that label).

...
Leijen allowed duplicate labels to make a virtue of necessity IMO. There has not been an extensible records proposal before or since for duplicate labels. (TRex certainly didn't do it.) His 'necessity' was ease of implementation. This sequence of values stuff seems to make a more difficult implementation for the sake of providing a 'feature' that nobody's asked for(?)

Well, I think Leijen makes two points: * Duplicate labels are nice and allow shadowing in records, which is good. * Duplicate labels allow us to construct a type system lacking a "lacks" predicate, which makes it simpler.

I am mainly interested in the first point, see my example. As I said, whether you want duplicate labels depends on the situation.

As another use case for duplicate labels: consider implementing an interpreter for some embedded DSL, and you want to carry the state of the variables in the an extensible record. Declaring a new variable in the embedded language then causes us to extend the record. Since the embedded language allows shadowing (as most languages do), we can simply extend the record, we do not have to jump through hoops to make sure there are no duplicate labels. Once the variable goes out of scope, we remove the label again to bring the old "variable" into scope.

...
Actually, I think there's more wrong with that line than a typo:

...
extendUnique :: (..., l :\r ) => ... -- s/b: r :\ l ??

(It's supposed to do renaming with non-duplicate labels?)

Sorry! Another mistake, I've fixed it. It is now as follows:

renameUnique :: (KnownSymbol l, KnownSymbol l', r :\ l') => Label l -> Label l' -> Rec r -> Rec (Rename l l' r)

...
Talking of renaming, how does it go with duplicate labels? The comment on `rename` says it can be expressed using the "above operations" (presumably restrict followed by extend with the new label, as per Gastar&Jones and Leijen).

...
If that's genuinely equivalent, then rename will 'unhide' any duplicate label. So presumably the implementation must split the HashMap into two keys, rather than changing the label on the existing Seq(?)

Yes, exactly. Renaming is implemented as follows, which is equivalent to what you said:

rename :: (KnownSymbol l, KnownSymbol l') => Label l -> Label l' -> Rec r -> Rec (Rename l l' r) rename l l' r = extend l' (r .! l) (r .- l) renameUnique :: (KnownSymbol l, KnownSymbol l', r :\ l') => Label l -> Label l' -> Rec r -> Rec (Rename l l' r) renameUnique = rename

2013/12/6 AntC

...
...
Atze van der Ploeg writes:

(see http://www.haskell.org/haskellwiki/CTRex#Duplicate_labels.2C_and_lacks). I think duplicate labels are nice in some situations and bad in other situations.

Thank you Atze for a well-written description.

I think there might be a couple of typos there?

(c, r'') = decomp r x -- rhs s/b: decomp r' x ??

extendUnique :: (..., l :\r ) => ... -- s/b: r :\ l ??

Your "motivation" example is hard to follow without knowing what `decomp` does. (IOW, it is not showing me a motivation ;-)

I'm puzzled by this in the implementation notes 4.2 Records: "Here we see that a record is actually just a map from string to the sequence of values. Notice that it is a sequence of values and not a single value, because the record may contain duplicate labels."

It sounds like there's an overhead in being able to support duplicate labels (even if I don't want duplicates in my records)? Is there a performance penalty at run-time with extending/prepending and restricting/pretruncating, to keep the invariant re the i-th value?

Leijen allowed duplicate labels to make a virtue of necessity IMO. There has not been an extensible records proposal before or since for duplicate labels. (TRex certainly didn't do it.) His 'necessity' was ease of implementation. This sequence of values stuff seems to make a more difficult implementation for the sake of providing a 'feature' that nobody's asked for(?)

There's one 'advanced feature' of extensible records that I'd be interested in: merging records by label, as is done for 'Natural Join'.

a row with labels {x, y, z} merge labels {y, z, w} returns a Maybe row with {x, y, z, w} providing the types paired with y and z are the same and the values are the same (otherwise return Nothing)

It's absolutely essential _not_ to duplicate labels in this case.

AntC

_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

4243

Age (days ago)

4253

Last active (days ago)

List overview

Download

20 comments

6 participants

participants (6)

adam vogt
AntC
Atze van der Ploeg
Carter Schonwald
John Lato
Richard Eisenberg