
I have a draft design for unpacking sum types that I'd like some feedback on. In particular feedback both on:
* the writing and clarity of the proposal and
* the proposal itself.
https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes
-- Johan

I wonder: are there issues with strict/unpacked fields in the sum
type, with regard to the 'fill in stuff' behavior?
For example:
data C = C1 !Int | C2 ![Int]
data D = D1 !Double {-# UNPACK #-} !C
Naively we might think:
data D' = D1 !Double !Tag !Int ![Int]
But this is obviously not going to work at the
Haskell-implemented-level. Since we're at a lower level, we could just
not seq the things from the opposite constructor, but are there
problems that arise from that? Also of course the !Int will probably
also be unpacked, so such prim types need different handling (fill
with 0, I guess).
--
Also, I guess this is orthogonal, but having primitive, unboxed sums
(analogous to unboxed tuples) would be nice as well. Conceivably they
could be used as part of the specification of unpacked sums, since we
can apparently put unboxed tuples in data types now. I'm not certain
if they would cover all cases, though (like the strictness concerns
above).
-- Dan
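As a rough illustration of the 'fill in stuff' concern, here is a minimal sketch in ordinary Haskell; it is a toy model, not GHC's actual unpacking. The flattened constructor is renamed D1' so that both types can live in one module, Tag is a hypothetical synonym, and the unused slot is filled with an already-evaluated filler (0 or []) so that the strict fields never have anything undefined to force.

```haskell
-- Toy model of the naive flattening discussed above; not GHC's real layout.
data C = C1 !Int | C2 ![Int]
  deriving (Eq, Show)

data D = D1 !Double !C
  deriving (Eq, Show)

-- Hypothetical tag type: 0 selects C1, 1 selects C2.
type Tag = Int

-- Flattened form: one slot per field of every constructor of C.
data D' = D1' !Double !Tag !Int ![Int]
  deriving (Eq, Show)

-- The slot belonging to the other constructor gets an evaluated filler,
-- so forcing the strict fields never hits an undefined value.
flatten :: D -> D'
flatten (D1 x (C1 n))  = D1' x 0 n []
flatten (D1 x (C2 ns)) = D1' x 1 0 ns

-- Only the slot selected by the tag is read back.
unflatten :: D' -> D
unflatten (D1' x 0 n _)  = D1 x (C1 n)
unflatten (D1' x _ _ ns) = D1 x (C2 ns)
```

In this model the filler values are arbitrary; any cheap, fully evaluated value of the right type would do, which matches the "fill with 0, I guess" remark for primitive fields.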

After some discussions with SPJ I've now rewritten the proposal in terms of
unboxed sums (which should not suffer from the extra seq problem you mention
above).

Just a small comment about syntax.
Why is there an "_n" suffix on the type constructor? Isn't it
syntactically evident how many things are in the |# .. | .. #| block?
More generally, are the parser changes and the wild new syntax strictly
necessary?
Could we instead just have a new keyword, but have it look like a normal
type constructor? For example, the type:
(Sum# T1 T2 T3)
Where "Sum#" can't be partially applied, and is variable arity.
Likewise, "MkSum#" could be a keyword/syntactic-form:
(MkSum# 1 3 expr)
case x of MkSum# 1 3 v -> e
Here "1" and "3" are part of the syntactic form, not expressions. But it
can probably be handled after parsing and doesn't require the "_n_m"
business.
-Ryan

Hi,
On Wednesday, 02.09.2015, at 01:44 +0000, Ryan Newton wrote:
> Why is there an "_n" suffix on the type constructor? Isn't it syntactically evident how many things are in the |# .. | .. #| block?
Correct.
> More generally, are the parser changes and the wild new syntax strictly necessary?
If we just add it to Core, to support UNPACK, then there is no parser involved anyways, and the pretty-printer may do fancy stuff. (Why not Unicode subscript numbers like ₂ :-))
But we probably want to provide this also on the Haskell level, just like unboxed products, right? Then we should have a nice syntax.
Personally, I find (# a | b | c #) visually more pleasing. (The disadvantage is that this works only for two or more alternatives, but the one-alternative unboxed union is isomorphic to the one-element unboxed tuple anyways, isn’t it?)
> Likewise, "MkSum#" could be a keyword/syntactic-form:
> (MkSum# 1 3 expr)
> case x of MkSum# 1 3 v -> e
> Here "1" and "3" are part of the syntactic form, not expressions. But it can probably be handled after parsing and doesn't require the "_n_m" business.
If we expose it on the Haskell level, I find MkSum_1_2# the right thing to do: It makes it clear that (conceptually) there really is a constructor of that name, and it is distinct from MkSum_2_2#, and the user cannot do computation with these indices.
Greetings, Joachim
-- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata@debian.org

> If we expose it on the Haskell level, I find MkSum_1_2# the right thing to do: It makes it clear that (conceptually) there really is a constructor of that name, and it is distinct from MkSum_2_2#, and the user cannot do computation with these indices.
I don't mind MkSum_1_2#, it avoids the awkwardness of attaching it to a closing delimiter. But... it does still introduce the idea of cutting up tokens to get numbers out of them, which is kind of hacky. (There seems to be a conserved particle of hackiness here that can't be eliminated, but it doesn't bother me too much.)

Hi,
just an idea that crossed my mind: Can we do without the worker/wrapper dance for data constructors if we instead phrase that in terms of pattern synonyms? Maybe that's a refactoring/code consolidation opportunity.
Good night, Joachim

Hi,
On Tuesday, 01.09.2015, at 10:23 -0700, Johan Tibell wrote:
> I have a draft design for unpacking sum types that I'd like some feedback on. In particular feedback both on:
> * the writing and clarity of the proposal and * the proposal itself.
The current proposed layout for a
data D a = D a {-# UNPACK #-} !(Maybe a)
would be
[D’s pointer] [a] [tag (0 or 1)] [Just’s a]
So the representation of
D foo (Just bar)
is
[D_info] [&foo] [1] [&bar]
and of
D foo Nothing
is
[D_info] [&foo] [0] [&dummy]
where dummy is something that makes the GC happy.
But assuming this dummy object is something that is never a valid heap object of its own, then this should be sufficient to distinguish the two cases, and we could actually have that the representation of
D foo (Just bar)
is
[D_info] [&foo] [&bar]
and of
D foo Nothing
is
[D_info] [&foo] [&dummy]
and a case analysis on D would compare the pointer in the third word with the well-known address of dummy to determine if we have Nothing or Just. This saves one word.
If we generate a number of such static dummy objects, we can generalize this tag-field avoiding trick to other data types than Maybe. It seems that it is worth doing that if
* the number of constructors is no more than the number of static dummy objects, and
* there is one constructor which has more pointer fields than all other constructors.
Also, this trick cannot be applied repeatedly: If we have
data D a = D {-# UNPACK #-} !(Maybe a) | D'Nothing
data E a = E {-# UNPACK #-} !(D a)
then it cannot be applied when unpacking D into E. (Or maybe it can, but care has to be taken that D’s Nothing is represented by a different dummy object than Maybe’s Nothing.)
Anyways, this is an optimization that can be implemented once unboxed sum types are finished and working reliably.
Greetings, Joachim
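The two layouts described in this message can be mimicked with a small toy model. This is only a sketch under stated assumptions: heap words are represented by a hypothetical Cell type, pointers are stood in for by string labels, and Dummy plays the role of the well-known static object. None of this is GHC's real closure representation.

```haskell
-- Toy model of heap words; real GHC closures are not Haskell lists.
data Cell
  = Info String   -- info pointer of the enclosing constructor
  | Ptr String    -- pointer to a heap object, named here by a label
  | Tag Int       -- explicit constructor tag
  | Dummy         -- well-known static object that is never a real value
  deriving (Eq, Show)

-- The labels stand in for the objects foo and bar.
data D = D String (Maybe String)
  deriving (Eq, Show)

-- Layout with an explicit tag word: four words.
withTag :: D -> [Cell]
withTag (D a (Just b)) = [Info "D_info", Ptr a, Tag 1, Ptr b]
withTag (D a Nothing)  = [Info "D_info", Ptr a, Tag 0, Dummy]

-- Layout without the tag: three words; Dummy itself marks Nothing.
withoutTag :: D -> [Cell]
withoutTag (D a (Just b)) = [Info "D_info", Ptr a, Ptr b]
withoutTag (D a Nothing)  = [Info "D_info", Ptr a, Dummy]

-- Case analysis compares the third word against Dummy.
readBack :: [Cell] -> Maybe D
readBack [Info "D_info", Ptr a, Dummy] = Just (D a Nothing)
readBack [Info "D_info", Ptr a, Ptr b] = Just (D a (Just b))
readBack _                             = Nothing
```

The model only works because Dummy can never be confused with a real pointer, which is the same assumption the message makes about the static dummy object.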

Good start.
I have updated the page to separate the source-language design (what the programmer sees) from the implementation.
And I have included boxed sums as well – it would be deeply strange not to do so.
Looks good to me!
Simon
From: Johan Tibell [mailto:johan.tibell@gmail.com]
Sent: 01 September 2015 18:24
To: Simon Peyton Jones; Simon Marlow; Ryan Newton
Cc: ghc-devs@haskell.org
Subject: RFC: Unpacking sum types

Are we okay with stealing some operator sections for this? E.G. (x
||). I think the boxed sums larger than 2 choices are all technically
overlapping with sections.

| Are we okay with stealing some operator sections for this? E.G. (x
| ||). I think the boxed sums larger than 2 choices are all technically
| overlapping with sections.
I hadn't thought of that. I suppose that in distfix notation we could require spaces
(x | |)
since vertical bar by itself isn't an operator. But then (_||) x might feel more compact.
Also a section (x ||) isn't valid in a pattern, so we would not need to require spaces there.
But my gut feel is: yes, with AnonymousSums we should just steal the syntax. It won't hurt existing code (since it won't use AnonymousSums), and if you *are* using AnonymousSums then the distfix notation is probably more valuable than the sections for an operator you probably aren't using.
I've updated the wiki page.
Simon
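For readers less familiar with the clash being discussed: in Haskell as it stands, (x ||) is a left section of the Boolean "or" operator. A minimal sketch follows; the names x and orWithX are made up for illustration.

```haskell
-- In Haskell today, (x ||) is a left section of the Boolean "or" operator.
x :: Bool
x = False

-- Equivalent to \y -> x || y.
orWithX :: Bool -> Bool
orWithX = (x ||)
```

An anonymous-sum notation that reuses these characters would have to take this form away from operator sections, which is the trade-off weighed in this message.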
| -----Original Message-----
| From: Dan Doel [mailto:dan.doel@gmail.com]
| Sent: 07 September 2015 18:53
| To: Simon Peyton Jones
| Cc: Johan Tibell; Simon Marlow; Ryan Newton; ghc-devs@haskell.org
| Subject: Re: Unpacking sum types

Hi,
On Monday, 07.09.2015, at 19:25 +0000, Simon Peyton Jones wrote:
> > Are we okay with stealing some operator sections for this? E.G. (x ||). I think the boxed sums larger than 2 choices are all technically overlapping with sections.
> I hadn't thought of that. I suppose that in distfix notation we could require spaces (x | |) since vertical bar by itself isn't an operator. But then (_||) x might feel more compact.
> Also a section (x ||) isn't valid in a pattern, so we would not need to require spaces there.
> But my gut feel is: yes, with AnonymousSums we should just steal the syntax. It won't hurt existing code (since it won't use AnonymousSums), and if you *are* using AnonymousSums then the distfix notation is probably more valuable than the sections for an operator you probably aren't using.
I wonder if this syntax for constructors is really that great. Yes, there is similarity with the type constructor (which is nice), but for the data constructor, do we really want a unary encoding and have our users count bars?
I believe the user (and also us, having to read Core) would be better served by some syntax that involves plain numbers.
Given that of is already a keyword, how about something involving "3 of 4"? For example
(Put# True in 3 of 5) :: (# a | b | Bool | d | e #)
and
case sum of
(Put# x in 1 of 3) -> ...
(Put# x in 2 of 3) -> ...
(Put# x in 3 of 3) -> ...
(If "as" were a keyword, (Put# x as 2 of 3) would sound even better.)
I don’t find this particular choice very great, but something with numbers rather than ASCII art seems to make more sense here. Is there something even better?
Greetings, Joachim

2015-09-07 21:21 GMT+01:00 Joachim Breitner:
> I wonder if this syntax for constructors is really that great. Yes, there is similarity with the type constructor (which is nice), but for the data constructor, do we really want a unary encoding and have our users count bars?
> I believe the user (and also us, having to read Core) would be better served by some syntax that involves plain numbers.
I reacted the same way to the proposed syntax.
Imagine already having an anonymous sum type and then deciding to add another constructor. Naturally you'd have to update your code to handle the new constructor, but you'd also need to update the code for all other constructors by adding another bar in the right place. That seems unnecessary, and there's no need to do that for named sum types.
What about explicitly stating the index as a number?
(1 | Int) :: ( String | Int | Bool )
(#1 | Int #) :: (# String | Int | Bool #)
case sum of
(0 | myString ) -> ...
(1 | myInt ) -> ...
(2 | myBool ) -> ...
This allows you to at least add new constructors at the end without changing existing code.
Is it harder to resolve by type inference since we're not stating the number of constructors? If so we could do something similar to Joachim's proposal:
case sum of
(0 of 3 | myString ) -> ...
(1 of 3 | myInt ) -> ...
(2 of 3 | myBool ) -> ...
... and at least you don't have to count bars.

I can see the force of this discussion about data type constructors for sums, but
· We already do this for tuples: (,,,,) is a type constructor and you have to count commas. We could use a number here but we don’t.
· Likewise tuple sections. (,,e,) means (\x y z -> (x,y,e,z))
I do not expect big sums in practice.
That said, (2/5| True) instead of (|True|||) would be ok I suppose. Or something like that.
Simon
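The comma-counting precedent mentioned above can be tried directly with GHC's TupleSections extension. A minimal sketch, where fillThird and quad are illustrative names only:

```haskell
{-# LANGUAGE TupleSections #-}

-- (,,e,) with e = True: a function of the three missing components.
fillThird :: a -> b -> c -> (a, b, Bool, c)
fillThird = (,,True,)

-- The prefix tuple constructor, where the arity is given by counting commas.
quad :: a -> b -> c -> d -> (a, b, c, d)
quad = (,,,)
```

Both forms put the position information in the number and placement of commas, which is the same kind of unary encoding that the bar-counting sum syntax would use.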
From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of Lennart Kolmodin
Sent: 08 September 2015 07:12
To: Joachim Breitner
Cc: ghc-devs@haskell.org
Subject: Re: AnonymousSums data con syntax

For what it's worth, I feel like (|True|||) looks better than
(2/5|True) or (2 of 5|True). Not sure if the confusion w/r/t (x||) as
a section or a 3-ary anonymous sum is worth it though.

On 07/09/2015 15:35, Simon Peyton Jones wrote:
> Good start.
> I have updated the page to separate the source-language design (what the programmer sees) from the implementation.
> And I have included boxed sums as well – it would be deeply strange not to do so.
How did you envisage implementing anonymous boxed sums? What is their heap representation?
One option is to use some kind of generic object with a dynamic number of pointers and non-pointers, and one field for the tag. The layout would need to be stored in the object. This isn't a particularly efficient representation, though. Perhaps there could be a family of smaller specialised versions for common sizes.
Do we have a use case for the boxed version, or is it just for consistency?
Cheers
Simon

Hi,
On Tuesday, 08.09.2015, at 08:53 +0100, Simon Marlow wrote:
> How did you envisage implementing anonymous boxed sums? What is their heap representation?
> One option is to use some kind of generic object with a dynamic number of pointers and non-pointers, and one field for the tag.
Why a dynamic number of pointers? All constructors of an anonymous sum type contain precisely one pointer (just like Left and Right do), as they are normal boxed, polymorphic data types.
Also the constructors
(0 of 1 | _ )
(0 of 2 | _ )
(0 of 3 | _ )
(using Lennart’s syntax here) can all use the same info-table: At runtime, we only care about the tag, not the arity of the sum type.
So just like for products, we could statically generate info tables for the constructors
(0 of ? | _ )
(1 of ? | _ )
⁞
(63 of ? | _ )
and simply do not support more than these. (Or, if we really want to support these, start to nest them. 63² will already go a long way... :-))
Greetings, Joachim

| How did you envisage implementing anonymous boxed sums? What is their
| heap representation?
*Exactly* like tuples; that is, we have a family of data type declarations:
data (a|b) = (_|) a | (|_) b
data (a|b|c) = (_||) a | (|_|) b | (||_) c
..etc.
Simon
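The family of declarations sketched in this reply can be approximated with ordinary data types today. In the sketch below the names Sum2, Sum3 and the InIofN constructors are hypothetical stand-ins for the proposed (a|b), (a|b|c) and bar-counting constructors; they are not part of any proposal or library.

```haskell
-- Hypothetical names: Sum2/Sum3 stand in for (a|b) and (a|b|c),
-- and InIofN for the bar-counting constructors such as (_||).
data Sum2 a b
  = In1of2 a   -- (_|)
  | In2of2 b   -- (|_)
  deriving (Eq, Show)

data Sum3 a b c
  = In1of3 a   -- (_||)
  | In2of3 b   -- (|_|)
  | In3of3 c   -- (||_)
  deriving (Eq, Show)

-- Each constructor holds exactly one field, just as Left and Right do.
fromEither :: Either a b -> Sum2 a b
fromEither (Left a)  = In1of2 a
fromEither (Right b) = In2of2 b

-- Case analysis over a three-way sum.
describe :: Sum3 String Int Bool -> String
describe (In1of3 s) = "string " ++ s
describe (In2of3 n) = "int " ++ show n
describe (In3of3 b) = "bool " ++ show b
```

Because these are plain parameterised data types, their heap representation is whatever GHC already uses for any single-field constructor, which is the point of the "exactly like tuples" answer.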

On 08/09/2015 09:31, Simon Peyton Jones wrote:
> | How did you envisage implementing anonymous boxed sums? What is their
> | heap representation?
> *Exactly* like tuples; that is, we have a family of data type declarations:
> data (a|b) = (_|) a | (|_) b
> data (a|b|c) = (_||) a | (|_|) b | (||_) c
> ..etc.
I see, but then you can't have multiple fields, like
( (# Int,Bool #) |)
You'd have to box the inner tuple too. Ok, I suppose.
Cheers
Simon

| I see, but then you can't have multiple fields, like
|
|    ( (# Int,Bool #) |)
|
| You'd have to box the inner tuple too. Ok, I suppose.

Well of course! It's just a parameterised data type, like a tuple. But, just like unboxed tuples, you could have an unboxed tuple (or sum) inside an unboxed tuple.

    (# (# Int,Bool #) | Int #)

Simon

| -----Original Message-----
| From: Simon Marlow [mailto:marlowsd@gmail.com]
| Sent: 08 September 2015 09:55
| To: Simon Peyton Jones; Johan Tibell; Ryan Newton
| Cc: ghc-devs@haskell.org
| Subject: Re: Unpacking sum types
|
| On 08/09/2015 09:31, Simon Peyton Jones wrote:
| > | How did you envisage implementing anonymous boxed sums? What is
| > | their heap representation?
| >
| > *Exactly* like tuples; that is, we have a family of data type
| > declarations:
| >
| >   data (a|b) = (_|) a
| >              | (|_) b
| >
| >   data (a|b|c) = (_||) a
| >                | (|_|) b
| >                | (||_) c
| >   ..etc.
|
| I see, but then you can't have multiple fields, like
|
|    ( (# Int,Bool #) |)
|
| You'd have to box the inner tuple too. Ok, I suppose.
|
| Cheers
| Simon
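For readers following the nesting point, here is a minimal sketch of an unboxed tuple inside an unboxed sum. It assumes the `UnboxedSums` extension as it later shipped in GHC (8.2 onwards), whose concrete syntax may not match every variant discussed in this thread; the names `describe` and `samples` are hypothetical.

```haskell
{-# LANGUAGE UnboxedSums, UnboxedTuples #-}
module Main where

-- The argument type mirrors the (# (# Int,Bool #) | Int #) example:
-- the first alternative is itself an unboxed tuple, the second a plain Int.
describe :: (# (# Int, Bool #) | Int #) -> String
describe (# (# n, b #) | #) = "pair: " ++ show n ++ ", " ++ show b
describe (# | n #)          = "int: " ++ show n

-- One value per alternative, rendered to ordinary boxed Strings.
samples :: [String]
samples =
  [ describe (# (# 1, True #) | #)
  , describe (# | 2 #)
  ]

main :: IO ()
main = mapM_ putStrLn samples
```

Note that the position of the single vertical bar is what selects the alternative, which is the "counting bars" concern raised elsewhere in the thread.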

I just added two design notes to the wiki page:
1. If we're stealing syntax, we're stealing quite a few operators. Things like (#|) and (|#) in terms, along with the otherwise-quite-reasonable (x ||). We're also stealing things like (||) and (#||#|) in types. The fact that we're stealing (||) at the type level is quite unfortunate, to me. I won't fight against a growing tide on this issue, but I favor not changing the lexer here and requiring lots of spaces.
2. A previous email in this thread mentioned a (0 of 2 | ...) syntax for data constructors. This might be better than forcing writers and readers to count vertical bars. (Of course, we already require counting commas.)
Glad to see this coming together!
Richard
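To make the "stolen syntax" concern concrete, here is a small sketch of code that is legal today and uses `(||)` both as a type operator and in a term-level section. The synonym `||`, and the functions `parseFlag` and `alwaysTrue`, are hypothetical illustrations, not taken from any real library.

```haskell
{-# LANGUAGE TypeOperators #-}
module Main where

-- Hypothetical user code: (||) as an infix type synonym for Either.
-- This is the kind of type-level name the boxed-sum syntax would claim.
type a || b = Either a b

parseFlag :: String -> String || Bool
parseFlag "yes" = Right True
parseFlag "no"  = Right False
parseFlag other = Left ("unrecognised flag: " ++ other)

-- An ordinary left section of the Prelude's (||), i.e. the (x ||) form.
alwaysTrue :: Bool -> Bool
alwaysTrue = (True ||)

main :: IO ()
main = do
  mapM_ (print . parseFlag) ["yes", "no", "maybe"]
  print (alwaysTrue False)
```

Whether such code would keep compiling depends on the lexer decision discussed in note 1; the sketch only shows what exists before any change.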
On Sep 8, 2015, at 7:48 AM, Simon Peyton Jones wrote:

> | I see, but then you can't have multiple fields, like
> |
> |    ( (# Int,Bool #) |)
> |
> | You'd have to box the inner tuple too. Ok, I suppose.
>
> Well of course! It's just a parameterised data type, like a tuple. But, just like unboxed tuples, you could have an unboxed tuple (or sum) inside an unboxed tuple.
>
>     (# (# Int,Bool #) | Int #)
>
> Simon

I don't think any #-based operators are stolen at the term level,
because # is required at both ends.
`(#| x #)` is not a legal operator section (nor is `(#| x |#)`), and
(#|_#) is not an operator name. The boxed version only steals
operators because you can shove the entire thing to one side.
I might have missed something, though.
The type level steals type operators involving #, though.
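As a sketch of the term-level situation described here: without any unboxed-syntax extension enabled, `#|` and `|#` are ordinary operator names, and the legal sections put the operand on one side only. The operators and the helper names below are hypothetical, chosen only because they resemble pieces of the proposed brackets.

```haskell
module Main where

-- Hypothetical user-defined operators that look like halves of the
-- proposed unboxed-sum brackets.
infixr 5 #|
(#|) :: [a] -> [a] -> [a]
xs #| ys = xs ++ ys

infixl 6 |#
(|#) :: Int -> Int -> Int
a |# b = a - b

-- Prefix use of the operator name.
prefixed :: [Int]
prefixed = (#|) [1] [2]

-- Right section: the operand sits to the right of the operator.
appendNine :: [Int] -> [Int]
appendNine = (#| [9])

-- Left section: the operand sits to the left of the operator.
tenMinus :: Int -> Int
tenMinus = (10 |#)

main :: IO ()
main = do
  print prefixed
  print (appendNine [1])
  print (tenMinus 3)
```

A form such as `(#| x #)`, with `#` at both ends, has no counterpart among these sections, which is the observation made in the message.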
On Tue, Sep 8, 2015 at 8:05 AM, Richard Eisenberg wrote:

> I just added two design notes to the wiki page:
>
> 1. If we're stealing syntax, we're stealing quite a few operators. Things like (#|) and (|#) in terms, along with the otherwise-quite-reasonable (x ||). We're also stealing things like (||) and (#||#|) in types. The fact that we're stealing (||) at the type level is quite unfortunate, to me. I won't fight against a growing tide on this issue, but I favor not changing the lexer here and requiring lots of spaces.
>
> 2. A previous email in this thread mentioned a (0 of 2 | ...) syntax for data constructors. This might be better than forcing writers and readers to count vertical bars. (Of course, we already require counting commas.)
>
> Glad to see this coming together!
>
> Richard

On 08/09/2015 09:31, Simon Peyton Jones wrote:
> | How did you envisage implementing anonymous boxed sums? What is their
> | heap representation?
>
> *Exactly* like tuples; that is, we have a family of data type declarations:
>
>   data (a|b) = (_|) a
>              | (|_) b
>
>   data (a|b|c) = (_||) a
>                | (|_|) b
>                | (||_) c
>   ..etc.

I see, but then you can't have multiple fields, like

   ( (# Int,Bool #) |)

You'd have to box the inner tuple too. Ok, I suppose.

Cheers
Simon

> Simon
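The "family of data type declarations" reading of boxed sums can be approximated in ordinary Haskell. The sketch below uses hypothetical named types (`Sum2`, `Sum3`) and constructors in place of the anonymous `(a|b)` and `(_|)` syntax, purely to illustrate why an alternative with several fields needs a boxed tuple.

```haskell
module Main where

-- Hypothetical named stand-ins for the anonymous types (a|b) and (a|b|c).
data Sum2 a b = Alt1of2 a | Alt2of2 b
  deriving (Show, Eq)

data Sum3 a b c = Alt1of3 a | Alt2of3 b | Alt3of3 c
  deriving (Show, Eq)

-- Each constructor carries exactly one field, and the type parameters are
-- ordinary lifted types, so a two-field alternative goes through a boxed
-- tuple (Int, Bool) rather than an unboxed (# Int, Bool #).
pairOrInt :: Int -> Sum2 (Int, Bool) Int
pairOrInt n
  | even n    = Alt1of2 (n, True)
  | otherwise = Alt2of2 n

-- A three-way sum is consumed like any other data type.
render :: Sum3 Int Bool String -> String
render (Alt1of3 n) = "int " ++ show n
render (Alt2of3 b) = "bool " ++ show b
render (Alt3of3 s) = "string " ++ s

main :: IO ()
main = do
  print (pairOrInt 2)
  print (pairOrInt 3)
  putStrLn (render (Alt3of3 "hello"))
```

The heap representation is then the same as for any user-defined data type, which is the sense in which the scheme is "exactly like tuples".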
participants (9)
- Dan Doel
- David Kraeutmann
- Joachim Breitner
- Johan Tibell
- Lennart Kolmodin
- Richard Eisenberg
- Ryan Newton
- Simon Marlow
- Simon Peyton Jones