
I have just postedhttps://github.com/ghc-proposals/ghc-proposals/pull/265#issuecomment-8966266... this, about unlifted data types. Yikes! Simon In accepted proposal #265 on Unlifted Datatypeshttps://github.com/ghc-proposals/ghc-proposals/pull/265 I have just realised that in this accepted, and implemented proposal we have done something entirely new. Consider this (from https://gitlab.haskell.org/ghc/ghc/-/issues/20204) type IntU :: Bool -> Wombat ...LOTS OF CODE... data IntU a b = IntU Int ...MORE CODE... type Wombat =Type -> TYPE UnliftedRep The kind signature for IntU completely changes the semantics of the data IntU declaration, and yet can be separate from it. That is new: generally, signatures can restrict the applicability of something, but don't change its semantics. (Yes, with overlapping instances, certainly incoherent instances, you could change semantics, but the general principal holds.) Even if it is adjacent, the fact that it's unlifted is quite subtle... you have to look to the right of the arrows, and then through the distant (and perhaps imported) type synonym Wombat. I'm not very happy with a distant kind signature having such a profound effect on the semantics of the data type. Indeed in my comment abovehttps://github.com/ghc-proposals/ghc-proposals/pull/265#issuecomment-5257055... I suggested a keyword data unlifted IntU a b = IntU Int to signal that it's an unlifted data type. But then I went AWOL and didn't pursue the matter. I don't know why I was so negligent. So this post is to ask: does anyone else think this is bizarre? I'm inclined to make a proposal to add the keyword, but I thought I'd test the waters first.

I think the distance between the `type IntU` and `data IntU` is the crux of the issue. If we were forced to keep the signature and declaration together, i.e. ``` type IntU :: Bool -> Wombat data IntU a b = IntU Int ``` I wouldn't be bothered by the distance from the definition of `Wombat`. If I'm reading this code, and I already know what Wombat is, there's no problem. If I *don't* know what Wombat is, then I need to look up its definition anyway to make sense of IntU. I've never been particularly fond of Haskell's tolerance for separating signatures and definitions. It has some practical uses -- notably in combination with -XCPP it makes it easy to ensure signatures are consistent across CPP branches -- but in general I think distance between signature and definition is a smell. On Wed, Aug 11, 2021, at 03:48, Simon Peyton Jones via ghc-steering-committee wrote:
I have just posted https://github.com/ghc-proposals/ghc-proposals/pull/265#issuecomment-8966266... this, about unlifted data types. Yikes! Simon
*In accepted proposal #265 on Unlifted Datatypes https://github.com/ghc-proposals/ghc-proposals/pull/265* I have just realised that in this accepted, and implemented proposal we have done something entirely new. Consider this (from https://gitlab.haskell.org/ghc/ghc/-/issues/20204)
`type IntU :: Bool -> Wombat` ` ` `...LOTS OF CODE...` ` ` `data IntU a b = IntU Int` ` ` `...MORE CODE...` ` ` `type Wombat =Type -> TYPE UnliftedRep` The kind signature for `IntU` *completely changes the semantics of the *`*data IntU*`* declaration*, and yet can be separate from it. That is new: generally, signatures can restrict the applicability of something, but *don't change its semantics*. (Yes, with overlapping instances, certainly incoherent instances, you could change semantics, but the general principal holds.)
Even if it is adjacent, the fact that it's unlifted is quite subtle... you have to look to the right of the arrows, and then through the distant (and perhaps imported) type synonym `Wombat`.
I'm not very happy with a distant kind signature having such a profound effect on the semantics of the data type. Indeed in my comment above https://github.com/ghc-proposals/ghc-proposals/pull/265#issuecomment-5257055... I suggested a keyword
`data unlifted IntU a b = IntU Int` to signal that it's an *unlifted* data type. But then I went AWOL and didn't pursue the matter. I don't know why I was so negligent.
So this post is to ask: does anyone else think this is bizarre? I'm inclined to make a proposal to add the keyword, but I thought I'd test the waters first.
_______________________________________________ ghc-steering-committee mailing list ghc-steering-committee@haskell.org https://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-steering-committee

I'm bothered both by distance, and the hidden-ness of Wombat, and (most fundamentally) by the fact that the data type declaration itself is not self-describing. Usually if I look at a type signature for a function and don't really understand it, I can rely on the type checker to ensure that it's only called in well-typed ways. I can also just look at the code for the function, and infer a type that is either the same or more general.
Not so here. I boggled that we can write
data IntU a b = IntU Int
and not know whether it is lifted or not. That's about *semantics* and runtime behaviour, not mere static acceptance/rejection!
I suppose my base principle is:
* If GHC can infer a type for a declaration
* Then it's ok to omit or ignore the type/kind signature
That is, signatures (a) are needed when inference is Too Hard (b) restrict polymorphism. But they should not (c) change semantics.
Now I know that because of dark corners of the language, defaulting etc, the signature of a function *can* affect its semantics, but it is rare and very much a dark corner -- it's a wart. But this is right at the heart of what every data declaration means. I don't want to use one small wart to justify a much larger wartier wart.
I wish I had realised this at the time. Apologies for that.
Simon
| -----Original Message-----
| From: Eric Seidel

Hello, I am retired but still on the list so I read it sometimes :). (feel free to unsubscribe me) But since I was not sure where else to comment, I thought I'd chime in here to say that I also find this pretty confusing. My assumption/intuition is that 'data' should always introduce a lifted type, and 'newtype' should have the same liftedness as its definition. I would much prefer that if we wanted to have some sort of unified 'data' we should have a separate construct for it (e.g., like the proposed 'data unlifted') Iavor On Wed, Aug 18, 2021, 11:33 Simon Peyton Jones via ghc-steering-committee < ghc-steering-committee@haskell.org> wrote:
I'm bothered both by distance, and the hidden-ness of Wombat, and (most fundamentally) by the fact that the data type declaration itself is not self-describing. Usually if I look at a type signature for a function and don't really understand it, I can rely on the type checker to ensure that it's only called in well-typed ways. I can also just look at the code for the function, and infer a type that is either the same or more general.
Not so here. I boggled that we can write data IntU a b = IntU Int and not know whether it is lifted or not. That's about *semantics* and runtime behaviour, not mere static acceptance/rejection!
I suppose my base principle is: * If GHC can infer a type for a declaration * Then it's ok to omit or ignore the type/kind signature
That is, signatures (a) are needed when inference is Too Hard (b) restrict polymorphism. But they should not (c) change semantics.
Now I know that because of dark corners of the language, defaulting etc, the signature of a function *can* affect its semantics, but it is rare and very much a dark corner -- it's a wart. But this is right at the heart of what every data declaration means. I don't want to use one small wart to justify a much larger wartier wart.
I wish I had realised this at the time. Apologies for that.
Simon
| -----Original Message----- | From: Eric Seidel
| Sent: 18 August 2021 03:57 | To: Simon Peyton Jones ; ghc-steering- | committee@haskell.org; Sebastian Graf | Subject: Re: [ghc-steering-committee] Unlifted data types | | I think the distance between the `type IntU` and `data IntU` is the crux of | the issue. If we were forced to keep the signature and declaration together, | i.e. | | ``` | type IntU :: Bool -> Wombat | data IntU a b = IntU Int | ``` | | I wouldn't be bothered by the distance from the definition of `Wombat`. | | If I'm reading this code, and I already know what Wombat is, there's no | problem. If I *don't* know what Wombat is, then I need to look up its | definition anyway to make sense of IntU. | | I've never been particularly fond of Haskell's tolerance for separating | signatures and definitions. It has some practical uses -- notably in | combination with -XCPP it makes it easy to ensure signatures are consistent | across CPP branches -- but in general I think distance between signature and | definition is a smell. | | On Wed, Aug 11, 2021, at 03:48, Simon Peyton Jones via ghc-steering- | committee wrote: | > | > I have just posted | > | < https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.co | m%2Fghc-proposals%2Fghc-proposals%2Fpull%2F265%23issuecomment- | 896626642&data=04%7C01%7Csimonpj%40microsoft.com %7C9bd05c205b5d4757d4cd0 | 8d961f3ea95%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637648524474023194% | 7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwi | LCJXVCI6Mn0%3D%7C3000&sdata=sr6XzIgF5CSnLyol8jSfjHToJpD86Q7bPHnKkaMjMgI% | 3D&reserved=0> this, about unlifted data types. Yikes! | > Simon | > | > *In accepted proposal #265 on Unlifted Datatypes | > < https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit | > hub.com %2Fghc-proposals%2Fghc-proposals%2Fpull%2F265&data=04%7C01% | > 7Csimonpj%40microsoft.com %7C9bd05c205b5d4757d4cd08d961f3ea95%7C72f988b | > f86f141af91ab2d7cd011db47%7C1%7C0%7C637648524474033189%7CUnknown%7CTWF | > pbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6M | > n0%3D%7C3000&sdata=OWlj9RxqQfTBZcjp1QE5AUSSvnG%2By3wbuXL2VbqEGpw%3 | > D&reserved=0>* I have just realised that in this accepted, and | > implemented proposal we have done something entirely new. Consider | > this (from | > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitl | > ab.haskell.org %2Fghc%2Fghc%2F-%2Fissues%2F20204&data=04%7C01%7Csim | > onpj%40microsoft.com %7C9bd05c205b5d4757d4cd08d961f3ea95%7C72f988bf86f1 | > 41af91ab2d7cd011db47%7C1%7C0%7C637648524474033189%7CUnknown%7CTWFpbGZs | > b3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D | > %7C3000&sdata=9SNn4GSIs%2Bd40SBQCEpYfYh7hO73XLoXY%2BWE2uuTg0k%3D&a | > mp;reserved=0) | > | > `type IntU :: Bool -> Wombat` ` ` `...LOTS OF CODE...` ` ` `data IntU | > a b = IntU Int` ` ` `...MORE CODE...` ` ` `type Wombat =Type -> TYPE | > UnliftedRep` The kind signature for `IntU` *completely changes the | > semantics of the *`*data IntU*`* declaration*, and yet can be separate | > from it. That is | > new: generally, signatures can restrict the applicability of | > something, but *don't change its semantics*. (Yes, with overlapping | > instances, certainly incoherent instances, you could change semantics, | > but the general principal holds.) | > | > Even if it is adjacent, the fact that it's unlifted is quite subtle... | > you have to look to the right of the arrows, and then through the | > distant (and perhaps imported) type synonym `Wombat`. | > | > I'm not very happy with a distant kind signature having such a | > profound effect on the semantics of the data type. Indeed in my | > comment above | > < https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit | > hub.com %2Fghc-proposals%2Fghc-proposals%2Fpull%2F265%23issuecomment-52 | > 5705557&data=04%7C01%7Csimonpj%40microsoft.com %7C9bd05c205b5d4757d | > 4cd08d961f3ea95%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637648524 | > 474033189%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIi | > LCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=fB257%2Flsx8ucSnW%2Fgc | > %2FPP50rVz4e2MJJCZgmc%2FB08jo%3D&reserved=0> I suggested a keyword | > | > `data unlifted IntU a b = IntU Int` | > to signal that it's an *unlifted* data type. But then I went AWOL and | > didn't pursue the matter. I don't know why I was so negligent. | > | > So this post is to ask: does anyone else think this is bizarre? I'm | > inclined to make a proposal to add the keyword, but I thought I'd test | > the waters first. | > | > | > _______________________________________________ | > ghc-steering-committee mailing list | > ghc-steering-committee@haskell.org | > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail | > .haskell.org %2Fcgi-bin%2Fmailman%2Flistinfo%2Fghc-steering-committee&a | > mp;data=04%7C01%7Csimonpj%40microsoft.com %7C9bd05c205b5d4757d4cd08d961 | > f3ea95%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637648524474033189 | > %7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6I | > k1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=rwunJpephHqcWKNE%2F0IpMPS21mWSL | > uyYr3V%2FVs6hfCo%3D&reserved=0 | > _______________________________________________ ghc-steering-committee mailing list ghc-steering-committee@haskell.org https://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-steering-committee

On Wed, Aug 18, 2021, at 03:32, Simon Peyton Jones wrote:
Usually if I look at a type signature for a function and don't really understand it, I can rely on the type checker to ensure that it's only called in well-typed ways. I can also just look at the code for the function, and infer a type that is either the same or more general.
Note that we can still rely on the type checker to ensure that IntU is only used in well-typed ways. This is strictly about communicating semantics to other programmers (not suggesting that's less important).
Not so here. I boggled that we can write data IntU a b = IntU Int and not know whether it is lifted or not. That's about *semantics* and runtime behaviour, not mere static acceptance/rejection!
I suppose my base principle is: * If GHC can infer a type for a declaration * Then it's ok to omit or ignore the type/kind signature
That is, signatures (a) are needed when inference is Too Hard (b) restrict polymorphism. But they should not (c) change semantics.
Now I know that because of dark corners of the language, defaulting etc, the signature of a function *can* affect its semantics, but it is rare and very much a dark corner -- it's a wart. But this is right at the heart of what every data declaration means. I don't want to use one small wart to justify a much larger wartier wart.
Type classes (and in particular the monomorphism restriction) are a very prominent example of signatures affecting runtime semantics in fundamental ways. The monomorphism restriction itself is a wart, but the underlying difference between foo :: Num a => a and foo :: Int in terms of runtime representation and sharing are core to the language. And in my opinion, the type signature is *exactly* the right place to capture these semantic differences. We use types to communicate the externally-visible behavior of our programs: the number of arguments to a function and their types, effects incurred in the form of monads, etc. Why should liftedness and the implications for evaluation strategies be different? (In fact, as I've suggested before, I think we should go further and expose the strictness of function arguments in the type system, but that's a separate discussion.) --- On a semi-related note, this prompted me to go back and reread the proposal, and I'm not entirely clear on one point. In ``` data Foo :: TYPE (BoxedRep Unlifted) where MkFoo :: Int -> Foo ``` Foo is unlifted and arguments of type Foo will be evaluated strictly, but is MkFoo itself strict in its Int argument? The proposal doesn't seem to say, and I could see arguments in either direction.

| We use types to communicate the externally-visible behavior of our
| programs: the number of arguments to a function and their types, effects
| incurred in the form of monads, etc. Why should liftedness and the
| implications for evaluation strategies be different?
We use them to *communicate* behaviour, but not to *specify* behaviour. That is done by the definition of the function itself.
I'm keen to hear from other members of the committee.
Maybe I should make a proposal, to seek broader input.
Simon
| -----Original Message-----
| From: Eric Seidel

On Thu, Aug 19, 2021, at 06:14, Simon Peyton Jones wrote:
| We use types to communicate the externally-visible behavior of our | programs: the number of arguments to a function and their types, effects | incurred in the form of monads, etc. Why should liftedness and the | implications for evaluation strategies be different?
We use them to *communicate* behaviour, but not to *specify* behaviour. That is done by the definition of the function itself.
I disagree, types are a great specification language. The whole point of advanced type system features is to enhance the specification language so we can specify more interesting properties. Often, we do this so that the type checker can *verify* our code, but there are plenty of cases where we use types to *drive code generation*. Consider GHC.Generics. We can write a highly-polymorphic JSON serialization function gToJSON :: Generic a => a -> JSON which we can then instantiate at different types, e.g. data Foo fooToJSON :: Foo -> JSON fooToJSON = gToJSON data Bar barToJSON :: Bar -> JSON barToJSON = gToJSON fooToJSON and barToJSON have the exact same definition, but very different semantics. Their complete semantics depends on *both* their definition *and* their type signatures. I think the deeper point here is that Haskell's dynamic semantics depends on its static semantics. We can't evaluate a Haskell program without knowing the types involved.

Hi,
FWIW, I'm neither a proponent of adding an unlifted keyword or modifier,
nor a strong opponent.
I just want to make aware of the point I raise in
https://github.com/ghc-proposals/ghc-proposals/pull/265#issuecomment-9019041...,
namely that
No keyword is sufficient to describe the full spectrum of
levity-polymorphic data typesOur language server is already smart enough
to give the necessary info on hover, without even needing to navigate to
the definition (where we would hope to see `unlifted` or a lack
thereof). It also works with levity-polymorphism.
Cheers,
Sebastian
------ Originalnachricht ------
Von: "Eric Seidel"
On Thu, Aug 19, 2021, at 06:14, Simon Peyton Jones wrote:
| We use types to communicate the externally-visible behavior of our | programs: the number of arguments to a function and their types, effects | incurred in the form of monads, etc. Why should liftedness and the | implications for evaluation strategies be different?
We use them to *communicate* behaviour, but not to *specify* behaviour. That is done by the definition of the function itself.
I disagree, types are a great specification language. The whole point of advanced type system features is to enhance the specification language so we can specify more interesting properties.
Often, we do this so that the type checker can *verify* our code, but there are plenty of cases where we use types to *drive code generation*. Consider GHC.Generics. We can write a highly-polymorphic JSON serialization function
gToJSON :: Generic a => a -> JSON
which we can then instantiate at different types, e.g.
data Foo fooToJSON :: Foo -> JSON fooToJSON = gToJSON
data Bar barToJSON :: Bar -> JSON barToJSON = gToJSON
fooToJSON and barToJSON have the exact same definition, but very different semantics. Their complete semantics depends on *both* their definition *and* their type signatures.
I think the deeper point here is that Haskell's dynamic semantics depends on its static semantics. We can't evaluate a Haskell program without knowing the types involved.
participants (4)
-
Eric Seidel
-
Iavor Diatchki
-
Sebastian Graf
-
Simon Peyton Jones