RFC: Unpacking sum types - ghc-devs - Haskell.org


RFC: Unpacking sum types


Johan Tibell

1 Sep 2015

5:23 p.m.

I have a draft design for unpacking sum types that I'd like some feedback on. In particular feedback both on:

* the writing and clarity of the proposal and
* the proposal itself.

https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes

-- Johan

Dan Doel

1 Sep

6:31 p.m.

I wonder: are there issues with strict/unpacked fields in the sum type, with regard to the 'fill in stuff' behavior?

For example:

    data C = C1 !Int | C2 ![Int]

    data D = D1 !Double {-# UNPACK #-} !C

Naively we might think:

    data D' = D1 !Double !Tag !Int ![Int]

But this is obviously not going to work at the Haskell-implemented-level. Since we're at a lower level, we could just not seq the things from the opposite constructor, but are there problems that arise from that? Also of course the !Int will probably also be unpacked, so such prim types need different handling (fill with 0, I guess).

--

Also, I guess this is orthogonal, but having primitive, unboxed sums (analogous to unboxed tuples) would be nice as well. Conceivably they could be used as part of the specification of unpacked sums, since we can apparently put unboxed tuples in data types now. I'm not certain if they would cover all cases, though (like the strictness concerns above).

-- Dan
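The 'fill in stuff' concern can be made concrete with a hand-written model. The following is only a sketch, not GHC's actual representation: D', D1', packD and unpackC are hypothetical names, the tag is modelled as a plain Int, and the payload fields are deliberately left lazy so that the filler stored for the absent constructor is never forced.

```haskell
-- Hand-flattened model of D1 !Double {-# UNPACK #-} !C (illustrative only).
data C = C1 !Int | C2 ![Int]
  deriving (Eq, Show)

-- Fields: Double, tag (0 = C1, 1 = C2), C1 payload, C2 payload.
-- The two payload fields are lazy on purpose: a bang on them would
-- force the filler stored for the constructor that is not present.
data D' = D1' !Double !Int Int [Int]

packD :: Double -> C -> D'
packD d (C1 n)  = D1' d 0 n (error "filler for absent C2 field")
packD d (C2 xs) = D1' d 1 0 xs   -- 0 fills the absent Int slot

-- Only the field selected by the tag is ever inspected.
unpackC :: D' -> C
unpackC (D1' _ 0 n _)  = C1 n
unpackC (D1' _ _ _ xs) = C2 xs

main :: IO ()
main = do
  print (unpackC (packD 1.5 (C1 3)))
  print (unpackC (packD 2.5 (C2 [1, 2])))
```

If the last field of D1' were declared ![Int], building the C1 case would evaluate the error filler, which is the problem raised above.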

Johan Tibell

2 Sep

1:09 a.m.

After some discussions with SPJ I've now rewritten the proposal in terms of unboxed sums (which should not suffer from the extra seq problem you mention above).

Ryan Newton

1:44 a.m.

Just a small comment about syntax.

Why is there an "_n" suffix on the type constructor? Isn't it syntactically evident how many things are in the |# .. | .. #| block?

More generally, are the parser changes and the wild new syntax strictly necessary?

Could we instead just have a new keyword, but have it look like a normal type constructor? For example, the type:

    (Sum# T1 T2 T3)

Where "Sum#" can't be partially applied, and is variable arity.

Likewise, "MkSum#" could be a keyword/syntactic-form:

    (MkSum# 1 3 expr)

    case x of MkSum# 1 3 v -> e

Here "1" and "3" are part of the syntactic form, not expressions. But it can probably be handled after parsing and doesn't require the "_n_m" business.

-Ryan

Joachim Breitner

2:12 a.m.

Hi,

On Wednesday, 02.09.2015, 01:44 +0000, Ryan Newton wrote:

Why is there an "_n" suffix on the type constructor? Isn't it syntactically evident how many things are in the |# .. | .. #| block?

Correct.

More generally, are the parser changes and the wild new syntax strictly necessary?

If we just add it to Core, to support UNPACK, then there is no parser involved anyways, and the pretty-printer may do fancy stuff. (Why not unicode subscript numbers like ₂ :-))

But we probably want to provide this also on the Haskell level, just like unboxed products, right? Then we should have a nice syntax.

Personally, I find (# a | b | c #) visually more pleasing. (The disadvantage is that this works only for two or more alternatives, but the one-alternative-unboxed-union is isomorphic to the one-element-unboxed-tuple anyways, isn’t it?)

Likewise, "MkSum#" could be a keyword/syntactic-form:

(MkSum# 1 3 expr) case x of MkSum# 1 3 v -> e

Here "1" and "3" are part of the syntactic form, not expressions. But it can probably be handled after parsing and doesn't require the "_n_m" business.

If we expose it on the Haskell level, I find MkSum_1_2# the right thing to do: It makes it clear that (conceptually) there really is a constructor of that name, and it is distinct from MkSum_2_2#, and the user cannot do computation with these indices.

Greetings,
Joachim

--
Joachim “nomeata” Breitner
mail@joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata@joachim-breitner.de • GPG-Key: 0xF0FBF51F
Debian Developer: nomeata@debian.org
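For readers who want to try the (# a | b | c #) notation: GHC later shipped an UnboxedSums extension (from GHC 8.2) using this bracket style, with the constructor position written by counting bars. The sketch below assumes that extension; classify and describe are hypothetical names chosen for illustration and are not part of the proposal.

```haskell
{-# LANGUAGE UnboxedSums #-}

-- A three-way unboxed sum; the position of the payload among the bars
-- selects the alternative.
classify :: Int -> (# Int | Bool | () #)
classify n
  | n > 0     = (# n | | #)
  | n == 0    = (# | True | #)
  | otherwise = (# | | () #)

describe :: (# Int | Bool | () #) -> String
describe s = case s of
  (# n | | #) -> "positive " ++ show n
  (# | b | #) -> "zero " ++ show b
  (# | | _ #) -> "negative"

main :: IO ()
main =
  -- A lambda is used instead of (.) because an unboxed sum cannot
  -- instantiate the lifted type variables of function composition.
  mapM_ (\n -> putStrLn (describe (classify n))) [5, 0, -2]
```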

Ryan Newton

2:22 a.m.

If we expose it on the Haskell level, I find MkSum_1_2# the right thing to do: It makes it clear that (conceptually) there really is a constructor of that name, and it is distinct from MkSum_2_2#, and the user cannot do computation with these indices.

I don't mind MkSum_1_2#, it avoids the awkwardness of attaching it to a closing delimiter. But... it does still introduce the idea of cutting up tokens to get numbers out of them, which is kind of hacky. (There seems to be a conserved particle of hackiness here that can't be eliminated, but it doesn't bother me too much.)

Joachim Breitner

5:58 a.m.

Hi,

just an idea that crossed my mind: Can we do without the worker/wrapper dance for data constructors if we instead phrase that in terms of pattern synonyms? Maybe that's a refactoring/code consolidation opportunity.

Good night,
Joachim

Joachim Breitner

7 Sep

11:02 a.m.

Hi,

On Tuesday, 01.09.2015, 10:23 -0700, Johan Tibell wrote:

I have a draft design for unpacking sum types that I'd like some feedback on. In particular feedback both on:

* the writing and clarity of the proposal and * the proposal itself.

https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes

The current proposed layout for a

    data D a = D a {-# UNPACK #-} !(Maybe a)

would be

    [D’s pointer] [a] [tag (0 or 1)] [Just’s a]

So the representation of

    D foo (Just bar)    is    [D_info] [&foo] [1] [&bar]

and of

    D foo Nothing       is    [D_info] [&foo] [0] [&dummy]

where dummy is something that makes the GC happy.

But assuming this dummy object is something that is never a valid heap object of its own, then this should be sufficient to distinguish the two cases, and we could actually have that the representation of

    D foo (Just bar)    is    [D_info] [&foo] [&bar]

and of

    D foo Nothing       is    [D_info] [&foo] [&dummy]

and a case analysis on D would compare the pointer in the third word with the well-known address of dummy to determine if we have Nothing or Just.

This saves one word.

If we generate a number of such static dummy objects, we can generalize this tag-field avoiding trick to other data types than Maybe. It seems that it is worth doing that if

* the number of constructors is no more than the number of static dummy objects, and
* there is one constructor which has more pointer fields than all other constructors.

Also, this trick cannot be applied repeatedly: If we have

    data D a = D {-# UNPACK #-} !(Maybe a) | D'Nothing

    data E a = E {-# UNPACK #-} !(D a)

then it cannot be applied when unpacking D into E. (Or maybe it can, but care has to be taken that D’s Nothing is represented by a different dummy object than Maybe’s Nothing.)

Anyways, this is an optimization that can be implemented once unboxed sum types are finished and working reliably.

Greetings,
Joachim
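The tag-avoiding trick can be modelled in plain Haskell to check the word count. This is only a toy model under stated assumptions: heap words are Ints, pointers are addresses of type Addr, and dummyAddr stands for the well-known static dummy object; none of these names come from GHC's runtime.

```haskell
-- Toy model: a payload slot is a machine word holding an address.
type Addr = Int

-- Assumed reserved address of the static dummy object; no real heap
-- object is ever allocated there in this model.
dummyAddr :: Addr
dummyAddr = 0

-- Layout with an explicit tag word: [tag] [payload-or-dummy]
encodeTagged :: Maybe Addr -> [Int]
encodeTagged Nothing  = [0, dummyAddr]
encodeTagged (Just p) = [1, p]

-- Layout without a tag word: the dummy address itself marks Nothing.
encodeTagless :: Maybe Addr -> [Int]
encodeTagless Nothing  = [dummyAddr]
encodeTagless (Just p) = [p]

-- Case analysis compares the word against the dummy address.
-- (Malformed inputs also decode to Nothing in this simplified model.)
decodeTagless :: [Int] -> Maybe Addr
decodeTagless [p] | p /= dummyAddr = Just p
decodeTagless _                    = Nothing

main :: IO ()
main = do
  print (length (encodeTagged (Just 4096)), length (encodeTagless (Just 4096)))
  print (decodeTagless (encodeTagless (Just 4096)))
  print (decodeTagless (encodeTagless Nothing))
```

The model only works because Just dummyAddr is never a legal value, which mirrors the requirement that the dummy is never a valid heap object of its own.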

Simon Peyton Jones

2:35 p.m.

New subject: Unpacking sum types

Good start.

I have updated the page to separate the source-language design (what the programmer sees) from the implementation.

And I have included boxed sums as well – it would be deeply strange not to do so.

Looks good to me!

Simon

Dan Doel

5:53 p.m.

New subject: Unpacking sum types

Are we okay with stealing some operator sections for this? E.G. (x ||). I think the boxed sums larger than 2 choices are all technically overlapping with sections.
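The overlap is with ordinary Haskell 2010 operator sections: (x ||) already parses as a left section of the Boolean operator ||. A minimal illustration, where orWith is a hypothetical name:

```haskell
-- (x ||) is a left section: it awaits the right operand of (||).
orWith :: Bool -> Bool -> Bool
orWith x = (x ||)

main :: IO ()
main = print (map (orWith False) [False, True])
```

Under the proposed syntax the same tokens would instead denote the first alternative of a three-way boxed sum, so the two readings cannot both be active at once.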

Simon Peyton Jones

7:25 p.m.

New subject: Unpacking sum types

| Are we okay with stealing some operator sections for this? E.G. (x ||).
| I think the boxed sums larger than 2 choices are all technically
| overlapping with sections.

I hadn't thought of that. I suppose that in distfix notation we could require spaces (x | |) since vertical bar by itself isn't an operator. But then (_||) x might feel more compact.

Also a section (x ||) isn't valid in a pattern, so we would not need to require spaces there.

But my gut feel is: yes, with AnonymousSums we should just steal the syntax. It won't hurt existing code (since it won't use AnonymousSums), and if you *are* using AnonymousSums then the distfix notation is probably more valuable than the sections for an operator you probably aren't using.

I've updated the wiki page

Simon

Joachim Breitner

8:21 p.m.

New subject: AnonymousSums data con syntax

Hi,

On Monday, 07.09.2015, 19:25 +0000, Simon Peyton Jones wrote:

Are we okay with stealing some operator sections for this? E.G. (x ||). I think the boxed sums larger than 2 choices are all technically overlapping with sections.

I hadn't thought of that. I suppose that in distfix notation we could require spaces (x | |) since vertical bar by itself isn't an operator. But then (_||) x might feel more compact.

Also a section (x ||) isn't valid in a pattern, so we would not need to require spaces there.

But my gut feel is: yes, with AnonymousSums we should just steal the syntax. It won't hurt existing code (since it won't use AnonymousSums), and if you *are* using AnonymousSums then the distfix notation is probably more valuable than the sections for an operator you probably aren't using.

I wonder if this syntax for constructors is really that great. Yes, there is similarity with the type constructor (which is nice), but for the data constructor, do we really want a unary encoding and have our users count bars?

I believe the user (and also us, having to read core) would be better served by some syntax that involves plain numbers.

Given that of is already a keyword, how about something involving "3 of 4"? For example

    (Put# True in 3 of 5) :: (# a | b | Bool | d | e #)

and

    case sum of
      (Put# x in 1 of 3) -> ...
      (Put# x in 2 of 3) -> ...
      (Put# x in 3 of 3) -> ...

(If "as" were a keyword, (Put# x as 2 of 3) would sound even better.)

I don’t find this particular choice very great, but something with numbers rather than ASCII art seems to make more sense here. Is there something even better?

Greetings,
Joachim

Lennart Kolmodin

8 Sep

6:11 a.m.

New subject: AnonymousSums data con syntax

2015-09-07 21:21 GMT+01:00 Joachim Breitner :


I wonder if this syntax for constructors is really that great. Yes, there is similarity with the type constructor (which is nice), but for the data constructor, do we really want a unary encoding and have our users count bars?

I believe the user (and also us, having to read core) would be better served by some syntax that involves plain numbers.

I reacted the same way to the proposed syntax.

Imagine already having an anonymous sum type and then deciding to add another constructor. Naturally you'd have to update your code to handle the new constructor, but you also need to update the code for all other constructors as well by adding another bar in the right place. That seems unnecessary and there's no need to do that for named sum types.

What about explicitly stating the index as a number?

    (1 | Int) :: ( String | Int | Bool )

    (#1 | Int #) :: (# String | Int | Bool #)

    case sum of
      (0 | myString ) -> ...
      (1 | myInt ) -> ...
      (2 | myBool ) -> ...

This allows you to at least add new constructors at the end without changing existing code.

Is it harder to resolve by type inference since we're not stating the number of constructors? If so we could do something similar to Joachim's proposal:

    case sum of
      (0 of 3 | myString ) -> ...
      (1 of 3 | myInt ) -> ...
      (2 of 3 | myBool ) -> ...

.. and at least you don't have to count bars.
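One way to get numbered names without new syntax is to wrap the bar-counting constructors in ordinary functions. The sketch below assumes GHC's later UnboxedSums extension; inj1of3, inj2of3, inj3of3 and position are hypothetical helpers, not anything proposed in this thread, and they only cover construction and a position query, not pattern matching by number.

```haskell
{-# LANGUAGE UnboxedSums #-}

-- Numbered injections: the bars are counted once, here, and nowhere else.
inj1of3 :: a -> (# a | b | c #)
inj1of3 x = (# x | | #)

inj2of3 :: b -> (# a | b | c #)
inj2of3 y = (# | y | #)

inj3of3 :: c -> (# a | b | c #)
inj3of3 z = (# | | z #)

-- 1-based index of the alternative that is present.
position :: (# a | b | c #) -> Int
position s = case s of
  (# _ | | #) -> 1
  (# | _ | #) -> 2
  (# | | _ #) -> 3

main :: IO ()
main = print (position (inj2of3 True :: (# Int | Bool | Char #)))
```

Note that this does not address the maintenance concern above: growing the sum to four alternatives still means rewriting every helper and every pattern.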

Simon Peyton Jones

8:28 a.m.

New subject: AnonymousSums data con syntax

I can see the force of this discussion about data type constructors for sums, but

· We already do this for tuples: (,,,,) is a type constructor and you have to count commas. We could use a number here but we don’t.

· Likewise tuple sections. (,,e,) means (\x y z -> (x,y,e,z))

I do not expect big sums in practice.

That said, (2/5| True) instead of (|True|||) would be ok I suppose. Or something like that.

Simon
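The tuple precedent can be checked directly in GHC today. This small sketch uses the standard TupleSections extension; five and withTrue are hypothetical names.

```haskell
{-# LANGUAGE TupleSections #-}

-- (,,,,) is the 5-tuple constructor: four commas, five fields.
five :: (Int, Char, Bool, String, Double)
five = (,,,,) 1 'a' True "s" 2.0

-- (,,True,) is a tuple section: \x y z -> (x, y, True, z)
withTrue :: Int -> Char -> Double -> (Int, Char, Bool, Double)
withTrue = (,,True,)

main :: IO ()
main = do
  print five
  print (withTrue 1 'a' 2.0)
```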

David Kraeutmann

10:12 a.m.

New subject: AnonymousSums data con syntax

For what it's worth, I feel like (|True|||) looks better than (2/5|True) or (2 of 5|True). Not sure if the confusion w/r/t (x||) as a section or a 3-ary anonymous sum is worth it though.

Simon Marlow

7:53 a.m.

New subject: Unpacking sum types

On 07/09/2015 15:35, Simon Peyton Jones wrote:

Good start.

I have updated the page to separate the source-language design (what the programmer sees) from the implementation.

And I have included boxed sums as well – it would be deeply strange not to do so.

How did you envisage implementing anonymous boxed sums? What is their heap representation?

One option is to use some kind of generic object with a dynamic number of pointers and non-pointers, and one field for the tag. The layout would need to be stored in the object. This isn't a particularly efficient representation, though. Perhaps there could be a family of smaller specialised versions for common sizes.

Do we have a use case for the boxed version, or is it just for consistency?

Cheers
Simon

Joachim Breitner

8:14 a.m.

New subject: Unpacking sum types

Hi,

On Tuesday, 08.09.2015, 08:53 +0100, Simon Marlow wrote:

On 07/09/2015 15:35, Simon Peyton Jones wrote:

Good start.

I have updated the page to separate the source-language design (what the programmer sees) from the implementation.

And I have included boxed sums as well – it would be deeply strange not to do so.

How did you envisage implementing anonymous boxed sums? What is their heap representation?

One option is to use some kind of generic object with a dynamic number of pointers and non-pointers, and one field for the tag.

Why a dynamic number of pointers? All constructors of an anonymous sum type contain precisely one pointer (just like Left and Right do), as they are normal boxed, polymorphic data types.

Also the constructors

    (0 of 1 | _ )
    (0 of 2 | _ )
    (0 of 3 | _ )

(using Lennart’s syntax here) can all use the same info-table: At runtime, we only care about the tag, not the arity of the sum type.

So just like for products, we could statically generate info tables for the constructors

    (0 of ? | _ )
    (1 of ? | _ )
    ⁞
    (63 of ? | _ )

and simply do not support more than these. (Or, if we really want to support these, start to nest them. 63² will already go a long way... :-))

Greetings,
Joachim

Simon Peyton Jones

8:31 a.m.

New subject: Unpacking sum types

| How did you envisage implementing anonymous boxed sums? What is their
| heap representation?

*Exactly* like tuples; that is, we have a family of data type declarations:

  data (a|b) = (_|) a
             | (|_) b

  data (a|b|c) = (_||) a
               | (|_|) b
               | (||_) c
  ..etc.

Simon

|
| One option is to use some kind of generic object with a dynamic number
| of pointers and non-pointers, and one field for the tag. The layout
| would need to be stored in the object. This isn't a particularly
| efficient representation, though. Perhaps there could be a family of
| smaller specialised versions for common sizes.
|
| Do we have a use case for the boxed version, or is it just for
| consistency?
|
| Cheers
| Simon
|
|
| > Looks good to me!
| >
| > Simon
| >
| > *From:* Johan Tibell [mailto:johan.tibell@gmail.com]
| > *Sent:* 01 September 2015 18:24
| > *To:* Simon Peyton Jones; Simon Marlow; Ryan Newton
| > *Cc:* ghc-devs@haskell.org
| > *Subject:* RFC: Unpacking sum types
| >
| > I have a draft design for unpacking sum types that I'd like some
| > feedback on. In particular feedback both on:
| >
| > * the writing and clarity of the proposal and
| >
| > * the proposal itself.
| >
| > https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes
| >
| > -- Johan
| >
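Editorial sketch: the "exactly like tuples" family described in the message above can be approximated today with ordinary named types, since the `(a|b)` notation is only proposed syntax. The names `Sum2`, `Sum3`, `Alt2_0`, `sum2`, `altIndex3` and so on are hypothetical, picked for this illustration; the only point is that every alternative is a normal constructor with a single boxed field.

```haskell
-- Hypothetical named stand-ins for the proposed anonymous boxed sums:
-- (a|b) would correspond to Sum2 a b, (a|b|c) to Sum3 a b c, and so on.
data Sum2 a b = Alt2_0 a | Alt2_1 b
  deriving (Show, Eq)

data Sum3 a b c = Alt3_0 a | Alt3_1 b | Alt3_2 c
  deriving (Show, Eq)

-- Eliminator for the binary case, analogous to 'either'.
sum2 :: (a -> r) -> (b -> r) -> Sum2 a b -> r
sum2 f _ (Alt2_0 x) = f x
sum2 _ g (Alt2_1 y) = g y

-- Which alternative of a ternary sum is present (0-based, as in "0 of 3").
altIndex3 :: Sum3 a b c -> Int
altIndex3 (Alt3_0 _) = 0
altIndex3 (Alt3_1 _) = 1
altIndex3 (Alt3_2 _) = 2
```

Because the type parameters are ordinary lifted type variables, a multi-field alternative has to be expressed with a boxed tuple, for example `Sum2 (Int, Bool) Int`.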


Simon Marlow

8:54 a.m.

New subject: Unpacking sum types

On 08/09/2015 09:31, Simon Peyton Jones wrote:

| How did you envisage implementing anonymous boxed sums? What is their
| heap representation?

*Exactly* like tuples; that is, we have a family of data type declarations:

data (a|b) = (_|) a
           | (|_) b

data (a|b|c) = (_||) a
             | (|_|) b
             | (||_) c
..etc.

I see, but then you can't have multiple fields, like

    ( (# Int,Bool #) |)

You'd have to box the inner tuple too. Ok, I suppose.

Cheers
Simon

Simon


Simon Peyton Jones

11:48 a.m.

New subject: Unpacking sum types

| I see, but then you can't have multiple fields, like
|
|     ( (# Int,Bool #) |)
|
| You'd have to box the inner tuple too. Ok, I suppose.

Well of course! It's just a parameterised data type, like a tuple. But, just like unboxed tuples, you could have an unboxed tuple (or sum) inside an unboxed tuple.

    (# (# Int,Bool #) | Int #)

Simon
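Editorial sketch: the nested type written in the message above can be tried out in GHC 8.2 and later, which provide an `UnboxedSums` extension. This assumes the term syntax that eventually shipped, `(# x | #)` and `(# | y #)`, which the thread was still debating at the time. The helper names `pairAlt`, `intAlt` and `collapse` are hypothetical.

```haskell
{-# LANGUAGE UnboxedSums #-}
{-# LANGUAGE UnboxedTuples #-}

-- Build the first alternative: an unboxed tuple nested inside an unboxed sum,
-- mirroring the type from the message, (# (# Int,Bool #) | Int #).
pairAlt :: Int -> Bool -> (# (# Int, Bool #) | Int #)
pairAlt n b = (# (# n, b #) | #)

-- Build the second alternative, a plain Int.
intAlt :: Int -> (# (# Int, Bool #) | Int #)
intAlt n = (# | n #)

-- Collapse either alternative to a single Int: the Bool in the first
-- alternative selects the sign, the second alternative is returned as-is.
collapse :: (# (# Int, Bool #) | Int #) -> Int
collapse (# (# n, b #) | #) = if b then n else negate n
collapse (# | n #)          = n
```

No boxing of the inner tuple is needed here, which is the contrast with the boxed sum that the quoted objection raises.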


Richard Eisenberg

12:05 p.m.

New subject: Unpacking sum types

I just added two design notes to the wiki page:

1. If we're stealing syntax, we're stealing quite a few operators. Things like (#|), and (|#) in terms, along with the otherwise-quite-reasonable (x ||). We're also stealing things like (||) and (#||#|) in types. The fact that we're stealing (||) at the type level is quite unfortunate, to me. I won't fight against a growing tide on this issue, but I favor not changing the lexer here and requiring lots of spaces.

2. A previous email in this thread mentioned a (0 of 2 | ...) syntax for data constructors. This might be better than forcing writers and readers to count vertical bars. (Of course, we already require counting commas.)

Glad to see this coming together!

Richard

On Sep 8, 2015, at 7:48 AM, Simon Peyton Jones wrote:

| I see, but then you can't have multiple fields, like
|
|     ( (# Int,Bool #) |)
|
| You'd have to box the inner tuple too. Ok, I suppose.

Well of course! It's just a parameterised data type, like a tuple. But, just like unboxed tuples, you could have an unboxed tuple (or sum) inside an unboxed tuple.

(# (# Int,Bool #) | Int #)

Simon

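Editorial sketch of the operator clash raised in the first design note: in plain Haskell, with none of the unboxed-syntax extensions enabled, the forms mentioned there are already meaningful. The operator `#|` and the functions `orWith`, `anyTrue` and `viaHash` are hypothetical names defined only for this illustration.

```haskell
-- Without UnboxedTuples or UnboxedSums enabled, '#|' is an ordinary
-- operator name that user code is free to define.
infixr 2 #|
(#|) :: Bool -> Bool -> Bool
a #| b = a || b

-- '(x ||)' is a left section of the standard Boolean 'or' operator.
orWith :: Bool -> Bool -> Bool
orWith x = (x ||)

-- '(||)' is the same operator in prefix form.
anyTrue :: [Bool] -> Bool
anyTrue = foldr (||) False

-- Prefix use of the user-defined operator.
viaHash :: Bool -> Bool -> Bool
viaHash = (#|)
```

If the lexer treated `(x ||)` or `(#|)` as sum syntax, definitions like these would need extra spaces or would stop parsing, which is the cost the note is weighing.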


Dan Doel

9 Sep

4:14 a.m.

New subject: Unpacking sum types

I don't think any #-based operators are stolen at the term level, because # is required at both ends. `(#| x #)` is not a legal operator section (nor is `(#| x |#)`), and (#|_#) is not an operator name. The boxed version only steals operators because you can shove the entire thing to one side.

I might have missed something, though.

The type level steals type operators involving #, though.

On Tue, Sep 8, 2015 at 8:05 AM, Richard Eisenberg wrote:

I just added two design notes to the wiki page: 1. If we're stealing syntax, we're stealing quite a few operators. Things like (#|), and (|#) in terms, along with the otherwise-quite-reasonable (x ||). We're also stealing things like (||) and (#||#|) in types. The fact that we're stealing (||) at the type level is quite unfortunate, to me. I won't fight against a growing tide on this issue, but I favor not changing the lexer here and requiring lots of spaces.

2. A previous email in this thread mentioned a (0 of 2 | ...) syntax for data constructors. This might be better than forcing writers and readers to count vertical bars. (Of course, we already require counting commas.)

Glad to see this coming together! Richard



22 comments

9 participants

participants (9)

Dan Doel
David Kraeutmann
Joachim Breitner
Johan Tibell
Lennart Kolmodin
Richard Eisenberg
Ryan Newton
Simon Marlow
Simon Peyton Jones