Unlifted data types

Hello friends, After many discussions and beers at ICFP, I've written up my current best understanding of the unlifted data types proposal: https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes Many thanks to Richard, Iavor, Ryan, Simon, Duncan, George, Paul, Edward Kmett, and any others who I may have forgotten for crystallizing this proposal. Cheers, Edward

You mention NFData in the motivation but then say that !Maybe !Int is not allowed. This leads me to wonder what the semantics of foo :: !Maybe Int -> !Maybe Int foo x = x bar = foo (Just undefined) are. Based on the FAQ it sounds like foo would *not* force the undefined, is that correct? Also, there's a clear connection between these UnliftedTypes and BangPatterns, but as I understand it the ! is essentially a new type constructor. So while foo1 :: !Int -> !Int foo1 x = x and foo2 :: Int -> Int foo2 !x = x have the same runtime behavior, they have different types, so you can't pass a regular Int to foo1. Is that desirable? Eric On Fri, Sep 4, 2015, at 01:03, Edward Z. Yang wrote:
Hello friends,
After many discussions and beers at ICFP, I've written up my current best understanding of the unlifted data types proposal:
https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes
Many thanks to Richard, Iavor, Ryan, Simon, Duncan, George, Paul, Edward Kmett, and any others who I may have forgotten for crystallizing this proposal.
Cheers, Edward _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Excerpts from Eric Seidel's message of 2015-09-04 08:29:59 -0700:
You mention NFData in the motivation but then say that !Maybe !Int is not allowed. This leads me to wonder what the semantics of
foo :: !Maybe Int -> !Maybe Int foo x = x
bar = foo (Just undefined)
are. Based on the FAQ it sounds like foo would *not* force the undefined, is that correct?
Yes. So maybe NFData is a *bad* example!
Also, there's a clear connection between these UnliftedTypes and BangPatterns, but as I understand it the ! is essentially a new type constructor. So while
foo1 :: !Int -> !Int foo1 x = x
and
foo2 :: Int -> Int foo2 !x = x
have the same runtime behavior, they have different types, so you can't pass a regular Int to foo1. Is that desirable?
Yes. Actually, you have a good point that we'd like to have functions 'force :: Int -> !Int' and 'suspend :: !Int -> Int'. Unfortunately, we can't generate 'Coercible' instances for these types unless Coercible becomes polykinded. Perhaps we can make a new type class, or just magic polymorphic functions. Edward

Excerpts from Edward Z. Yang's message of 2015-09-04 08:43:48 -0700:
Yes. Actually, you have a good point that we'd like to have functions 'force :: Int -> !Int' and 'suspend :: !Int -> Int'. Unfortunately, we can't generate 'Coercible' instances for these types unless Coercible becomes polykinded. Perhaps we can make a new type class, or just magic polymorphic functions.
Michael Greenberg points out on Twitter that suspend must be a special form, just like lambda abstraction. Edward

| Michael Greenberg points out on Twitter that suspend must be a special | form, just like lambda abstraction. This isn't reflected on the wiki. Simon | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of Edward | Z. Yang | Sent: 04 September 2015 16:46 | To: Eric Seidel; ghc-devs | Subject: Re: Unlifted data types | | Excerpts from Edward Z. Yang's message of 2015-09-04 08:43:48 -0700: | > Yes. Actually, you have a good point that we'd like to have functions | > 'force :: Int -> !Int' and 'suspend :: !Int -> Int'. Unfortunately, we | > can't generate 'Coercible' instances for these types unless Coercible | becomes | > polykinded. Perhaps we can make a new type class, or just magic | > polymorphic functions. | | Michael Greenberg points out on Twitter that suspend must be a special | form, just like lambda abstraction. | | Edward | _______________________________________________ | ghc-devs mailing list | ghc-devs@haskell.org | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Another good example would be foo :: ![Int] -> ![Int] Does this force just the first constructor or the whole spine? My guess would be the latter. On Fri, Sep 4, 2015, at 08:43, Edward Z. Yang wrote:
Excerpts from Eric Seidel's message of 2015-09-04 08:29:59 -0700:
You mention NFData in the motivation but then say that !Maybe !Int is not allowed. This leads me to wonder what the semantics of
foo :: !Maybe Int -> !Maybe Int foo x = x
bar = foo (Just undefined)
are. Based on the FAQ it sounds like foo would *not* force the undefined, is that correct?
Yes. So maybe NFData is a *bad* example!
Also, there's a clear connection between these UnliftedTypes and BangPatterns, but as I understand it the ! is essentially a new type constructor. So while
foo1 :: !Int -> !Int foo1 x = x
and
foo2 :: Int -> Int foo2 !x = x
have the same runtime behavior, they have different types, so you can't pass a regular Int to foo1. Is that desirable?
Yes. Actually, you have a good point that we'd like to have functions 'force :: Int -> !Int' and 'suspend :: !Int -> Int'. Unfortunately, we can't generate 'Coercible' instances for these types unless Coercible becomes polykinded. Perhaps we can make a new type class, or just magic polymorphic functions.
Edward

Hello Eric, You can't tell; the head not withstanding, `[a]` is still a lazy list, so you would need to look at the function body to see if any extra forcing goes on. `Force` does not induce `seq`ing: it is an obligation for the call-site. (Added it to the FAQ). Edward Excerpts from Eric Seidel's message of 2015-09-04 09:06:15 -0700:
Another good example would be
foo :: ![Int] -> ![Int]
Does this force just the first constructor or the whole spine? My guess would be the latter.
On Fri, Sep 4, 2015, at 08:43, Edward Z. Yang wrote:
Excerpts from Eric Seidel's message of 2015-09-04 08:29:59 -0700:
You mention NFData in the motivation but then say that !Maybe !Int is not allowed. This leads me to wonder what the semantics of
foo :: !Maybe Int -> !Maybe Int foo x = x
bar = foo (Just undefined)
are. Based on the FAQ it sounds like foo would *not* force the undefined, is that correct?
Yes. So maybe NFData is a *bad* example!
Also, there's a clear connection between these UnliftedTypes and BangPatterns, but as I understand it the ! is essentially a new type constructor. So while
foo1 :: !Int -> !Int foo1 x = x
and
foo2 :: Int -> Int foo2 !x = x
have the same runtime behavior, they have different types, so you can't pass a regular Int to foo1. Is that desirable?
Yes. Actually, you have a good point that we'd like to have functions 'force :: Int -> !Int' and 'suspend :: !Int -> Int'. Unfortunately, we can't generate 'Coercible' instances for these types unless Coercible becomes polykinded. Perhaps we can make a new type class, or just magic polymorphic functions.
Edward

All your examples are non-recursive types. So, if I have:
data Nat = Zero | Suc Nat
what is !Nat? Does it just have the outer-most part unlifted?
Is the intention to make the !a in data type declarations first-class,
so that when we say:
data Nat = Zero | Suc !Nat
the !Nat part is now an entity in itself, and it is, for this
declaration, the set of naturals, whereas Nat is the flat domain?
On Fri, Sep 4, 2015 at 4:03 AM, Edward Z. Yang
Hello friends,
After many discussions and beers at ICFP, I've written up my current best understanding of the unlifted data types proposal:
https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes
Many thanks to Richard, Iavor, Ryan, Simon, Duncan, George, Paul, Edward Kmett, and any others who I may have forgotten for crystallizing this proposal.
Cheers, Edward _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Excerpts from Dan Doel's message of 2015-09-04 09:57:42 -0700:
All your examples are non-recursive types. So, if I have:
data Nat = Zero | Suc Nat
what is !Nat? Does it just have the outer-most part unlifted?
Just the outermost part.
Is the intention to make the !a in data type declarations first-class, so that when we say:
data Nat = Zero | Suc !Nat
the !Nat part is now an entity in itself, and it is, for this declaration, the set of naturals, whereas Nat is the flat domain?
No, in fact, there is a semantic difference between this and strict fields (which Paul pointed out to me.) There's now an updated proposal on the Trac which partially solves this problem. Edward

Okay. That answers another question I had, which was whether MutVar#
and such would go in the new kind.
So now we have partial, extended natural numbers:
data PNat :: * where
PZero :: PNat
PSuc :: PNat -> PNat
A flat domain of natural numbers:
data FNat :: * where
FZero :: FNat
FSuc :: !FNat -> FNat
And two sets of natural numbers:
Force FNat :: Unlifted
data UNat :: Unlifted where
UZero :: UNat
USuc :: UNat -> UNat
And really perhaps two flat domains (and three sets), if you use Force
instead of !, which would differ on who ensures the evaluation. That's
kind of a lot of incompatible definitions of essentially the same
thing (PNat being the significantly different thing).
I was kind of more enthused about first class !a. For instance, if you
think about the opening quote by Bob Harper, he's basically wrong. The
flat domain FNat is the natural numbers (existing in an overall lazy
language), and has the reasoning properties he wants to teach students
about with very little complication. It'd be satisfying to recognize
that unlifting the outer-most part gets you exactly there, with
whatever performance characteristics that implies. Or to get rid of !
and use Unlifted definitions instead.
Maybe backwards compatibility mandates the duplication, but it'd be
nice if some synthesis could be reached.
----
It'd also be good to think about/specify how this is going to interact
with unpacked/unboxed sums.
On Fri, Sep 4, 2015 at 2:12 PM, Edward Z. Yang
Excerpts from Dan Doel's message of 2015-09-04 09:57:42 -0700:
All your examples are non-recursive types. So, if I have:
data Nat = Zero | Suc Nat
what is !Nat? Does it just have the outer-most part unlifted?
Just the outermost part.
Is the intention to make the !a in data type declarations first-class, so that when we say:
data Nat = Zero | Suc !Nat
the !Nat part is now an entity in itself, and it is, for this declaration, the set of naturals, whereas Nat is the flat domain?
No, in fact, there is a semantic difference between this and strict fields (which Paul pointed out to me.) There's now an updated proposal on the Trac which partially solves this problem.
Edward

Excerpts from Dan Doel's message of 2015-09-04 13:09:26 -0700:
Okay. That answers another question I had, which was whether MutVar# and such would go in the new kind.
So now we have partial, extended natural numbers:
data PNat :: * where PZero :: PNat PSuc :: PNat -> PNat
A flat domain of natural numbers:
data FNat :: * where FZero :: FNat FSuc :: !FNat -> FNat
And two sets of natural numbers:
Force FNat :: Unlifted
data UNat :: Unlifted where UZero :: UNat USuc :: UNat -> UNat
And really perhaps two flat domains (and three sets), if you use Force instead of !, which would differ on who ensures the evaluation. That's kind of a lot of incompatible definitions of essentially the same thing (PNat being the significantly different thing).
I was kind of more enthused about first class !a. For instance, if you think about the opening quote by Bob Harper, he's basically wrong. The flat domain FNat is the natural numbers (existing in an overall lazy language), and has the reasoning properties he wants to teach students about with very little complication. It'd be satisfying to recognize that unlifting the outer-most part gets you exactly there, with whatever performance characteristics that implies. Or to get rid of ! and use Unlifted definitions instead.
Maybe backwards compatibility mandates the duplication, but it'd be nice if some synthesis could be reached.
I would certainly agree that in terms of the data that is representable, there is not much difference; but there is a lot of difference for the client between Force and a strict field. If I write: let x = undefined y = Strict x in True No error occurs with: data Strict = Strict !a But an error occurs with: data Strict = Strict (Force a) One possibility for how to reconcile the difference for BC is to posit that there are just two different constructors: Strict :: a -> Strict a Strict! :: Force a -> Strict a But this kind of special handling is a bit bothersome. Consider: data SPair a b = SPair (!a, !b) The constructor has what type? Probably SPair :: (Force a, Force b) -> SPair a and not: SPair :: (a, b) -> SPair a
It'd also be good to think about/specify how this is going to interact with unpacked/unboxed sums.
I don't think it interacts any differently than with unpacked/unboxed products today. Edward

On 05/09/15 00:23, Edward Z. Yang wrote:
I would certainly agree that in terms of the data that is representable, there is not much difference; but there is a lot of difference for the client between Force and a strict field. If I write:
let x = undefined y = Strict x in True
No error occurs with:
data Strict = Strict !a
But an error occurs with:
data Strict = Strict (Force a)
At what point does the error occur here? When evaluating True? What about the following two expressions? const False (let x = undefined y = Strict x in True) let x = undefined y = const False (Strict x) in True Roman

On 05/09/15 00:41, Roman Cheplyaka wrote:
On 05/09/15 00:23, Edward Z. Yang wrote:
I would certainly agree that in terms of the data that is representable, there is not much difference; but there is a lot of difference for the client between Force and a strict field. If I write:
let x = undefined y = Strict x in True
No error occurs with:
data Strict = Strict !a
But an error occurs with:
data Strict = Strict (Force a)
At what point does the error occur here? When evaluating True?
What about the following two expressions?
const False (let x = undefined y = Strict x in True)
let x = undefined y = const False (Strict x) in True
On second though, the second one shouldn't even compile because of the kind error, right?

If x :: t, and t :: Unlifted, then
let x = e in e'
has a value that depends on evaluating e regardless of its use
in e' (or other things in the let, if they exist). It would be like writing
let !x = e in e'
today.
-- Dan
On Fri, Sep 4, 2015 at 5:41 PM, Roman Cheplyaka
On 05/09/15 00:23, Edward Z. Yang wrote:
I would certainly agree that in terms of the data that is representable, there is not much difference; but there is a lot of difference for the client between Force and a strict field. If I write:
let x = undefined y = Strict x in True
No error occurs with:
data Strict = Strict !a
But an error occurs with:
data Strict = Strict (Force a)
At what point does the error occur here? When evaluating True?
What about the following two expressions?
const False (let x = undefined y = Strict x in True)
let x = undefined y = const False (Strict x) in True
Roman
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

On Fri, Sep 4, 2015 at 5:23 PM, Edward Z. Yang
But this kind of special handling is a bit bothersome. Consider:
data SPair a b = SPair (!a, !b)
The constructor has what type? Probably
SPair :: (Force a, Force b) -> SPair a
and not:
SPair :: (a, b) -> SPair a
I don't really understand what this example is showing. I don't think SPair is a legal declaration in any scenario. - In current Haskell it's illegal; you can only put ! directly on fields - If !a :: Unlifted, then (,) (!a) is a kind error (same with Force a)
I don't think it interacts any differently than with unpacked/unboxed products today.
I meant like: If T :: Unlifted, then am I allowed to do: data U = MkU {-# UNPACK #-} T ... and what are its semantics? If T is a sum, presumably it's related to the unpacked sums proposal from a couple days ago. Does stuff from this proposal make that proposal simpler? Should they reference things in one another? Will there be optimizations that turn: data E a b :: Unlifted where L :: a -> E a b R :: b -> E a b into |# a , b #| (or whatever the agreed upon syntax is)? Presumably yes. -- Dan

Excerpts from Dan Doel's message of 2015-09-04 14:48:49 -0700:
I don't really understand what this example is showing. I don't think SPair is a legal declaration in any scenario.
- In current Haskell it's illegal; you can only put ! directly on fields - If !a :: Unlifted, then (,) (!a) is a kind error (same with Force a)
This is true. Perhaps it should be possible to define data types which are levity polymorphic, so SPair can kind as * -> * -> *, Unlifted -> Unlifted -> *, etc.
I don't think it interacts any differently than with unpacked/unboxed products today.
I meant like:
If T :: Unlifted, then am I allowed to do:
data U = MkU {-# UNPACK #-} T ...
and what are its semantics? If T is a sum, presumably it's related to the unpacked sums proposal from a couple days ago. Does stuff from this proposal make that proposal simpler? Should they reference things in one another?
Ah, this is a good question. I think you can just directly UNPACK unlifted types, without a strict bang pattern. I've added a note to the proposal.
Will there be optimizations that turn:
data E a b :: Unlifted where L :: a -> E a b R :: b -> E a b
into |# a , b #| (or whatever the agreed upon syntax is)? Presumably yes.
Yes, it should follow the same rules as https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes#Unpacking Edward

Here are some additional thoughts.
If we examine an analogue of some of your examples:
data MutVar a = MV (MutVar# RealWorld a)
main = do
let mv# = undefined
let mv = MV mv#
putStrLn "Okay."
The above is illegal. Instead we _must_ write:
let !mv# = undefined
which signals that evaluation is occurring. So it is impossible to
accidentally go from:
main = do
let mv = MV undefined
putStrLn "Okay."
which prints "Okay.", to something that throws an exception, without
having a pretty good indication that you're doing so. I would guess
this is desirable, so perhaps it should be mandated for Unlifted as
well.
----
However, the above point confuses me with respect to another example.
The proposal says that:
data Id :: * -> Unlifted where
Id :: a -> Id a
could/should be compiled with no overhead over `a`, like a newtype.
However, if Unlifted things have operational semantics like #, what
does the following do:
let x :: Id a
!x = Id undefined
The ! should evaluate to the Id constructor, but we're not
representing it, so it actually doesn't evaluate anything? But:
let x :: Id a
!x = undefined
throws an exception? Whereas for newtypes, both throw exceptions with
a !x definition, or don't with an x definition? Is it actually
possible to make Id behave this way without any representational
overhead?
I'm a little skeptical. I think that only Force (and Box) might be
able to have no representational overhead.
-- Dan
On Fri, Sep 4, 2015 at 5:23 PM, Edward Z. Yang
Excerpts from Dan Doel's message of 2015-09-04 13:09:26 -0700:
Okay. That answers another question I had, which was whether MutVar# and such would go in the new kind.
So now we have partial, extended natural numbers:
data PNat :: * where PZero :: PNat PSuc :: PNat -> PNat
A flat domain of natural numbers:
data FNat :: * where FZero :: FNat FSuc :: !FNat -> FNat
And two sets of natural numbers:
Force FNat :: Unlifted
data UNat :: Unlifted where UZero :: UNat USuc :: UNat -> UNat
And really perhaps two flat domains (and three sets), if you use Force instead of !, which would differ on who ensures the evaluation. That's kind of a lot of incompatible definitions of essentially the same thing (PNat being the significantly different thing).
I was kind of more enthused about first class !a. For instance, if you think about the opening quote by Bob Harper, he's basically wrong. The flat domain FNat is the natural numbers (existing in an overall lazy language), and has the reasoning properties he wants to teach students about with very little complication. It'd be satisfying to recognize that unlifting the outer-most part gets you exactly there, with whatever performance characteristics that implies. Or to get rid of ! and use Unlifted definitions instead.
Maybe backwards compatibility mandates the duplication, but it'd be nice if some synthesis could be reached.
I would certainly agree that in terms of the data that is representable, there is not much difference; but there is a lot of difference for the client between Force and a strict field. If I write:
let x = undefined y = Strict x in True
No error occurs with:
data Strict = Strict !a
But an error occurs with:
data Strict = Strict (Force a)
One possibility for how to reconcile the difference for BC is to posit that there are just two different constructors:
Strict :: a -> Strict a Strict! :: Force a -> Strict a
But this kind of special handling is a bit bothersome. Consider:
data SPair a b = SPair (!a, !b)
The constructor has what type? Probably
SPair :: (Force a, Force b) -> SPair a
and not:
SPair :: (a, b) -> SPair a
It'd also be good to think about/specify how this is going to interact with unpacked/unboxed sums.
I don't think it interacts any differently than with unpacked/unboxed products today.
Edward

Excerpts from Dan Doel's message of 2015-09-04 18:21:29 -0700:
Here are some additional thoughts.
If we examine an analogue of some of your examples:
data MutVar a = MV (MutVar# RealWorld a)
main = do let mv# = undefined let mv = MV mv# putStrLn "Okay."
The above is illegal. Instead we _must_ write:
This doesn't typecheck, but for a different reason: undefined :: a where a :: *, so you can't match up the kinds. error is actually extremely special in this case: it lives in OpenKind and matches both * and #. But let's suppose that we s/undefined/error "foo"/...
let !mv# = undefined
which signals that evaluation is occurring.
Also not true. Because error "foo" is inferred to have kind #, the bang pattern happens implicitly.
So it is impossible to accidentally go from:
main = do let mv = MV undefined putStrLn "Okay."
which prints "Okay.", to something that throws an exception, without having a pretty good indication that you're doing so. I would guess this is desirable, so perhaps it should be mandated for Unlifted as well.
Nope, if you just float the error call out of MV, you will go from "Okay." to an exception. Notice that *data constructors* are what are used to induce suspension. This is why we don't have a 'suspend' special form; instead, 'Box' is used directly.
However, the above point confuses me with respect to another example. The proposal says that:
data Id :: * -> Unlifted where Id :: a -> Id a
could/should be compiled with no overhead over `a`, like a newtype. However, if Unlifted things have operational semantics like #, what does the following do:
let x :: Id a !x = Id undefined
The ! should evaluate to the Id constructor, but we're not representing it, so it actually doesn't evaluate anything? But:
That's correct. Id is a box containing a lifted value. The box is unlifted, but the inner value can be lifted.
let x :: Id a !x = undefined
throws an exception?
Yes, exactly.
Whereas for newtypes, both throw exceptions with a !x definition, or don't with an x definition?
Also correct. They key thing is to distinguish error in kind * and error in kind #. You can make a table: | Id (error "foo") | error "foo" | ---------------------+-----------------------+-------------------+ newtype Id :: * -> * | error "foo" :: * | error "foo" :: * | data Id :: * -> # | Id (error "foo" :: *) | error "foo" :: # |
Is it actually possible to make Id behave this way without any representational overhead?
Yes. The reason is that an error "foo" :: # *immediately fails* (rather than attempt to allocate an Id). So the outer level of error doesn't ever need to be represented on the heap, so we can just represent the inner liftedness. Here's another way of looking at it: error in kind # is not a bottom at all. It's just a way of bailing immediately. HOWEVER...
I'm a little skeptical. I think that only Force (and Box) might be able to have no representational overhead.
It seems like it might be easier to explain if just Force and Box get optimized, and we don't bother with others; I only really care about those two operators being optimized. Edward

On Sat, Sep 5, 2015 at 3:06 AM, Edward Z. Yang
If we examine an analogue of some of your examples:
data MutVar a = MV (MutVar# RealWorld a)
main = do let mv# = undefined let mv = MV mv# putStrLn "Okay."
The above is illegal. Instead we _must_ write:
This doesn't typecheck, but for a different reason: undefined :: a where a :: *, so you can't match up the kinds.
error is actually extremely special in this case: it lives in OpenKind and matches both * and #. But let's suppose that we s/undefined/error "foo"/...
let !mv# = undefined
which signals that evaluation is occurring.
Also not true. Because error "foo" is inferred to have kind #, the bang pattern happens implicitly.
I tried with `error` first, and it worked exactly the way I described. But I guess it's a type inference weirdness. If I annotate mv# with MutVar# it will work, whereas otherwise it will be inferred that mv# :: a where a :: *, instead of #. Whereas !x is a pattern which requires monomorphism of x, and so it figures out mv# :: MutVar# .... Kind of an odd corner case where breaking cycles causes things _not_ to type check, due to open kinds not being first class. I thought I remembered that at some point it was decided that `let` bindings of unboxed things should be required to have bangs on the bindings, to indicate the evaluation order. Maybe I'm thinking of something else (was it that it was originally required and we got rid of it?).
Nope, if you just float the error call out of MV, you will go from "Okay." to an exception. Notice that *data constructors* are what are used to induce suspension. This is why we don't have a 'suspend' special form; instead, 'Box' is used directly.
I know that it's the floating that makes a difference, not the bang pattern. The point would be to make the syntax require the bang pattern to give a visual indication of when it happens, and make it illegal to look like you're doing a normal let that doesn't change the value (although having it actually be a bang pattern would be bad, because it'd restrict polymorphism of the definition). Also, the constructor isn't exactly relevant, so much as whether the unlifted error occurs inside the definition of a lifted thing. For instance, we can go from: let mv = MutVar undefined to: let mv = let mv# :: MutVar# RealWorld a ; mv# = undefined in MutVar mv# and the result is the same, because it is the definition of mv that is lazy. Constructors in complex expressions---and all subexpressions for that matter---just get compiled this way. E.G. let f :: MutVar# RealWorld a -> MutVar a f mv# = f mv# in flip const (f undefined) $ putStrLn "okay" No constructors involved, but no error.
Is it actually possible to make Id behave this way without any representational overhead?
Yes. The reason is that an error "foo" :: # *immediately fails* (rather than attempt to allocate an Id). So the outer level of error doesn't ever need to be represented on the heap, so we can just represent the inner liftedness.
Okay. So, there isn't representational overhead, but there is overhead, where you call a function or something (which will just return its argument), whereas newtype constructors end up not having any cost whatsoever? -- Dan

On Sat, Sep 5, 2015 at 1:35 PM, Dan Doel
Also, the constructor isn't exactly relevant, so much as whether the unlifted error occurs inside the definition of a lifted thing.
So, in light of this, `Box` is not necessary to define `suspend`. We can simply write: suspend :: Force a -> a suspend (Force x) = x and the fact that `a` has kind * means that `suspend undefined` only throws an exception if you inspect it. `Box` as currently defined (not the previous GADT definition) is novel in that it allows you to suspend unlifted types that weren't derived from `Force`. And it would probably be useful to have coercions between `Box (Force a)` and `a`, and `Force (Box u)` and `u`. But (I think) it is not necessary for mediating between `Force a` and `a`. -- Dan

Yes, I think you are right. I've restructured the spec so that 'Box' is an optional extension. Excerpts from Dan Doel's message of 2015-09-06 13:56:35 -0700:
On Sat, Sep 5, 2015 at 1:35 PM, Dan Doel
wrote: Also, the constructor isn't exactly relevant, so much as whether the unlifted error occurs inside the definition of a lifted thing.
So, in light of this, `Box` is not necessary to define `suspend`. We can simply write:
suspend :: Force a -> a suspend (Force x) = x
and the fact that `a` has kind * means that `suspend undefined` only throws an exception if you inspect it.
`Box` as currently defined (not the previous GADT definition) is novel in that it allows you to suspend unlifted types that weren't derived from `Force`. And it would probably be useful to have coercions between `Box (Force a)` and `a`, and `Force (Box u)` and `u`. But (I think) it is not necessary for mediating between `Force a` and `a`.
-- Dan

Excerpts from Dan Doel's message of 2015-09-05 10:35:44 -0700:
I tried with `error` first, and it worked exactly the way I described.
But I guess it's a type inference weirdness. If I annotate mv# with MutVar# it will work, whereas otherwise it will be inferred that mv# :: a where a :: *, instead of #. Whereas !x is a pattern which requires monomorphism of x, and so it figures out mv# :: MutVar# .... Kind of an odd corner case where breaking cycles causes things _not_ to type check, due to open kinds not being first class.
I thought I remembered that at some point it was decided that `let` bindings of unboxed things should be required to have bangs on the bindings, to indicate the evaluation order. Maybe I'm thinking of something else (was it that it was originally required and we got rid of it?).
Ah yes, I added an explicit type signature, which is why I didn't see
your problem.
As for requiring bang, I think probably you are thinking of:
commit 831a35dd00faff195cf938659c2dd736192b865f
Author: Ian Lynagh
Nope, if you just float the error call out of MV, you will go from "Okay." to an exception. Notice that *data constructors* are what are used to induce suspension. This is why we don't have a 'suspend' special form; instead, 'Box' is used directly.
I know that it's the floating that makes a difference, not the bang pattern. The point would be to make the syntax require the bang pattern to give a visual indication of when it happens, and make it illegal to look like you're doing a normal let that doesn't change the value (although having it actually be a bang pattern would be bad, because it'd restrict polymorphism of the definition).
I think this is a reasonable thing to ask for. I also think, with the commit set above, this very discussion happened in 2010, and was resolved in favor of not warning in this case for unboxed types. Maybe the situation is different with unlifted data types; it's hard for me to tell.
Also, the constructor isn't exactly relevant, so much as whether the unlifted error occurs inside the definition of a lifted thing. For instance, we can go from:
let mv = MutVar undefined
to:
let mv = let mv# :: MutVar# RealWorld a ; mv# = undefined in MutVar mv#
and the result is the same, because it is the definition of mv that is lazy. Constructors in complex expressions---and all subexpressions for that matter---just get compiled this way. E.G.
let f :: MutVar# RealWorld a -> MutVar a f mv# = f mv# in flip const (f undefined) $ putStrLn "okay"
No constructors involved, but no error.
Yes, you are right. I incorrectly surmised that a suspension function would have to be special form, but in fact, it does not need to be.
Okay. So, there isn't representational overhead, but there is overhead, where you call a function or something (which will just return its argument), whereas newtype constructors end up not having any cost whatsoever?
You might hope that it can get inlined away. But yes, a coercion would be best. Edward

| After many discussions and beers at ICFP, I've written up my current | best understanding of the unlifted data types proposal: | | https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes Too many beers! I like some, but not all, of this. There are several distinct things being mixed up. (1) First, a proposal to allow a data type to be declared to be unlifted. On its own, this is a pretty simple proposal: * Data types are always boxed, and so would unlifted data types * But an unlifted data type does not include bottom, and operationally is always represented by a pointer to the value. Just like Array#. * ALL the evaluation rules are IDENTICAL to those for other unlifted types such as Int# and Array#. Lets are strict, and cannot be recursive, function arguments are evaluated before the call. Literally nothing new here. * The code generator can generate more efficient case expressions, because the pointer always points to a value, never to a thunk or (I believe) an indirection. I think there are some special cases in GC and the RTS to ensure that this invariant holds. And that's it. Syntax: I'd suggest something more prominent than an Unlifted return kind, such as data unlifted T a = L a | R a but I would not die for this. I would really like to see this articulated as a stand-alone proposal. It makes sense by itself, and is really pretty simple. (2) Second, we cannot expect levity polymorphism. Consider map f (x:xs) = f x : map f xs Is the (f x) a thunk or is it evaluated strictly? Unless you are going to clone the code for map (which levity polymorphism is there to avoid), we can't answer "it depends on the type of (f x)". So, no, I think levity polymorphism is out. So I vote against splitting # into two: plain will do just fine. (3) Third, the stuff about Force and suspend. Provided you do no more than write library code that uses the above new features I'm fine. But there seems to be lots of stuff that dances around the hope that (Force a) is represented the same way as 'a'. I don't' know how to make this fly. Is there a coercion in FC? If so then (a ~R Force a). And that seems very doubtful since we must do some evaluation. I got lost in all the traffic about it. (4) Fourth, you don't mention a related suggestion, namely to allow newtype T = MkT Int# with T getting kind #. I see no difficulty here. We do have (T ~R Int#). It's just a useful way of wrapping a newtype around an unlifted type. My suggestion: let's nail down (1), including a boxed version of Force an suspend as plain library code, if you want, and perhaps (4); and only THEN tackle the trickiness of unboxing Force. Simon | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of Edward | Z. Yang | Sent: 04 September 2015 09:04 | To: ghc-devs | Subject: Unlifted data types | | Hello friends, | | After many discussions and beers at ICFP, I've written up my current | best understanding of the unlifted data types proposal: | | https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes | | Many thanks to Richard, Iavor, Ryan, Simon, Duncan, George, Paul, | Edward Kmett, and any others who I may have forgotten for crystallizing | this proposal. | | Cheers, | Edward | _______________________________________________ | ghc-devs mailing list | ghc-devs@haskell.org | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

On Mon, Sep 7, 2015 at 4:00 PM, Simon Peyton Jones
(2) Second, we cannot expect levity polymorphism. Consider map f (x:xs) = f x : map f xs Is the (f x) a thunk or is it evaluated strictly? Unless you are going to clone the code for map (which levity polymorphism is there to avoid), we can't answer "it depends on the type of (f x)". So, no, I think levity polymorphism is out.
So I vote against splitting # into two: plain will do just fine.
I don't understand how that last bit follows from the previous stuff (or, I don't understand the sentence). Splitting # into two kinds is useful even if functions can't be levity polymorphic. # contains a bunch of types that aren't represented uniformly. Int# might be 32 bits while Double# is 64, etc. But Unlifted would contain only types that are uniformly represented as pointers, so you could write functions that are polymorphic over types of kind Unlifted. This is not true for Unboxed/# (unless we implement C++ style polymorphism-as-code-generation). ---- Also, with regard to the previous mail, it's not true that `suspend` has to be a special form. All expressions with types of kind * are 'special forms' in the necessary sense. -- Dan

| Splitting # into two kinds is useful even if functions can't be levity | polymorphic. # contains a bunch of types that aren't represented | uniformly. Int# might be 32 bits while Double# is 64, etc. But | Unlifted would contain only types that are uniformly represented as | pointers, so you could write functions that are polymorphic over types | of kind Unlifted. Yes, I agree that's true, provided they are *not* also polymorphic over things of kind *. But it's an orthogonal proposal. What you say is already true of Array# and IORef#. Perhaps there are functions that are usefully polymorphic over boxed-but-unlifted things. But our users have not been crying out for this polymorphism despite the existence of a menagerie of existing such types, including Array# and IORef# Let's tackle things one at a time, with separate proposals and separate motivation. Simon | C++ style polymorphism-as-code-generation). | | ---- | | Also, with regard to the previous mail, it's not true that `suspend` | has to be a special form. All expressions with types of kind * are | 'special forms' in the necessary sense. | | -- Dan

On Mon, Sep 7, 2015 at 5:37 PM, Simon Peyton Jones
But it's an orthogonal proposal. What you say is already true of Array# and IORef#. Perhaps there are functions that are usefully polymorphic over boxed-but-unlifted things. But our users have not been crying out for this polymorphism despite the existence of a menagerie of existing such types, including Array# and IORef#
Well, evidently people over in the ArrayArray thread want it, for one. But also, if general unlifted types get accepted, there are many possible uses. For instance, people working with concurrency have to worry about work being done in the correct thread, and there are functions on MVars and whatnot that ensure that threads don't simply pass thunks between each other. But, if you have unlifted types, then you can have: data UMVar (a :: Unlifted) and then the type rules out the possibility of passing thunks through a reference (at least at the top level). But this requires polymorphism to avoid having to create a separate type for each unlifted type. This is also a use case of `Force`, since it is likely that we want to put ordinary data types in the MVars, just ensure that we aren't passing thunks with delayed work. ---- I'm kind of down on being polymorphic over choice of evaluation order, as well. At least without any further motivation. ---- Also, I'd still like to synthesize some of the redundancies introduced by the proposal. Perhaps it could be done by making `Force` a more primitive building block than !. I.E. data Nat = Zero | Suc !Nat can be, under this proposal, considered sugar for something like: data Nat = Zero | Suc_INTERNAL# (Force Nat) pattern Suc x = Suc_INTERNAL# (Force x) and all stipulations about what you can UNPACK are actually about Unlifted fields, rather than ! fields (which just inherit them from Force). That still leaves `Force LiftedSum` vs. `UnliftedSum`, though. And to be honest, I'm not sure we need arbitrary data types in Unlifted; Force (which would be primitive) might be enough. -- Dan

| And to | be honest, I'm not sure we need arbitrary data types in Unlifted; | Force (which would be primitive) might be enough. That's an interesting thought. But presumably you'd have to use 'suspend' (a terrible name) a lot: type StrictList a = Force (StrictList' a) data StrictList' a = Nil | Cons !a (StrictList a) mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f xs = mapStrict' f (suspend xs) mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b mapStrict' f Nil = Nil mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs) That doesn't look terribly convenient. | ensure that threads don't simply | pass thunks between each other. But, if you have unlifted types, then | you can have: | | data UMVar (a :: Unlifted) | | and then the type rules out the possibility of passing thunks through | a reference (at least at the top level). Really? Presumably UMVar is a new primitive? With a family of operations like MVar? If so can't we just define newtype UMVar a = UMV (MVar a) putUMVar :: UMVar a -> a -> IO () putUMVar (UMVar v) x = x `seq` putMVar v x I don't see Force helping here. Simon

I have put up an alternate set of proposals on
https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes
These sidestep around `Force` and `suspend` but probably have other problems. They make heavy use of levity polymorphism.
Back story: this all was developed in a late-evening Haskell Symposium session that took place in the hotel bar. It seems Edward and I walked away with quite different understandings of what had taken place. I've written up my understanding. Most likely, the Right Idea is a combination of this all!
See what you think.
Thanks!
Richard
On Sep 8, 2015, at 3:52 AM, Simon Peyton Jones
| And to | be honest, I'm not sure we need arbitrary data types in Unlifted; | Force (which would be primitive) might be enough.
That's an interesting thought. But presumably you'd have to use 'suspend' (a terrible name) a lot:
type StrictList a = Force (StrictList' a) data StrictList' a = Nil | Cons !a (StrictList a)
mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f xs = mapStrict' f (suspend xs)
mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b mapStrict' f Nil = Nil mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs)
That doesn't look terribly convenient.
| ensure that threads don't simply | pass thunks between each other. But, if you have unlifted types, then | you can have: | | data UMVar (a :: Unlifted) | | and then the type rules out the possibility of passing thunks through | a reference (at least at the top level).
Really? Presumably UMVar is a new primitive? With a family of operations like MVar? If so can't we just define newtype UMVar a = UMV (MVar a) putUMVar :: UMVar a -> a -> IO () putUMVar (UMVar v) x = x `seq` putMVar v x
I don't see Force helping here.
Simon _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

I think the wiki page is imprecise when it says:
data unlifted UBool = UTrue | UFalse
Intuitively, if you have x :: UBool in scope, you are guaranteed to have UTrue or UFalse, and not bottom.
But I still can say: foo :: UBool foo = foo and now foo contains bottom. I know that any attempt to use foo will lead to its immediate evaluation, but that is not exactly the same as "not containing a bottom". Or am I missing something here? Janek Dnia wtorek, 8 września 2015, Richard Eisenberg napisał:
I have put up an alternate set of proposals on
https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes
These sidestep around `Force` and `suspend` but probably have other problems. They make heavy use of levity polymorphism.
Back story: this all was developed in a late-evening Haskell Symposium session that took place in the hotel bar. It seems Edward and I walked away with quite different understandings of what had taken place. I've written up my understanding. Most likely, the Right Idea is a combination of this all!
See what you think.
Thanks! Richard
On Sep 8, 2015, at 3:52 AM, Simon Peyton Jones
wrote: | And to | be honest, I'm not sure we need arbitrary data types in Unlifted; | Force (which would be primitive) might be enough.
That's an interesting thought. But presumably you'd have to use 'suspend' (a terrible name) a lot:
type StrictList a = Force (StrictList' a) data StrictList' a = Nil | Cons !a (StrictList a)
mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f xs = mapStrict' f (suspend xs)
mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b mapStrict' f Nil = Nil mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs)
That doesn't look terribly convenient.
| ensure that threads don't simply | pass thunks between each other. But, if you have unlifted types, then | you can have: | | data UMVar (a :: Unlifted) | | and then the type rules out the possibility of passing thunks through | a reference (at least at the top level).
Really? Presumably UMVar is a new primitive? With a family of operations like MVar? If so can't we just define newtype UMVar a = UMV (MVar a) putUMVar :: UMVar a -> a -> IO () putUMVar (UMVar v) x = x `seq` putMVar v x
I don't see Force helping here.
Simon _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

On Sep 8, 2015, at 10:15 AM, Jan Stolarek
But I still can say:
foo :: UBool foo = foo
... Or am I missing something here?
I'm afraid you are. Top-level variables may not have an unlifted type, for exactly this reason. If you were to do this on a local let, your program would loop when it hits the let, so there's no problem there. Richard

I would say, by the way, that your question still makes some sense.
When discussing strict evaluation, one can think of _values_ of a type
and _expressions_ of a type. The (denotational) semantics of values
would be unlifted, while expressions are lifted. And functions take
values to expressions. So:
f :: Int# -> Int#
f x = f x
is a valid definition, even though the result of the function is
bottom, but has type Int#. And you are allowed:
let i :: Int#
i = error "whoa"
in ...
just not the same definition at the top level, since it'd be annoying
for GHC to have to worry about evaluating all possible strict errors
in every reachable module eagerly on program start up (I assume that's
the/a reason).
That may not be the only way to think about it, but it's how I tend to.
-- Dan
On Tue, Sep 8, 2015 at 11:26 AM, Jan Stolarek
Top-level variables may not have an unlifted type Ah, that makes much more sense now. Thanks.
Janek _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

| > data unlifted UBool = UTrue | UFalse | > | > Intuitively, if you have x :: UBool in scope, you are guaranteed to | > have UTrue or UFalse, and not bottom. | | But I still can say: | | foo :: UBool | foo = foo | | and now foo contains bottom. You definitely CANNOT have a top-level declaration for a value of an unlifted type, any more than you can have for an Int# or unboxed tuple today. That should resolve your question. Simon

I like your suggestion!
I think it'd be better to have
TYPE :: TypeShape -> *
data TypeShape = Unboxed | Boxed Levity
data Levity = Lifted | Unlifted
Now the awkward "unboxed/lifted" combination doesn't arise.
Now, 'error' is "TypeShape-polymorphic":
error :: forall (ts :: TypeShape) (a :: TYPE ts). String -> a
What functions (if any) are "Levity-polymorphic". That is, they have identical code regardless of whether their (boxed) type args are lifted or unlifted? Answer: data constructors. As Richard says
Cons :: forall (v1::Levity) (v2::Levity) (a::TYPE (Boxed v1)).
a -> List v1 v2 a -> List v1 v2 a
Why can it be levity-polymorphic? Because data constructors guarantee to construct no intermediate values (which they would be unsure whether or not to evaluate).
Typically, though, functions over lists would not be levity-polymorphic, for reasons previously discussed.
The awkward spot is the runtime system. Currently it goes to some lengths to ensure that it never introduces an indirection for a boxed-but-unlifted type. Simon Marlow would know exactly where. So I suspect that we WOULD need two versions of each (levity-polymorphic) data constructor, alas. And we'd need to know which version to use at every call site, which means the code would need to be levity-monomorphic. So we really would get very little levity polymorphism ineed. I think.
All this fits nicely with Dan Doel's point about making ! a first class type constructor.
Simon
| -----Original Message-----
| From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of
| Richard Eisenberg
| Sent: 08 September 2015 14:46
| To: ghc-devs
| Subject: Re: Unlifted data types
|
| I have put up an alternate set of proposals on
|
| https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes
|
| These sidestep around `Force` and `suspend` but probably have other
| problems. They make heavy use of levity polymorphism.
|
| Back story: this all was developed in a late-evening Haskell Symposium
| session that took place in the hotel bar. It seems Edward and I walked
| away with quite different understandings of what had taken place. I've
| written up my understanding. Most likely, the Right Idea is a
| combination of this all!
|
| See what you think.
|
| Thanks!
| Richard
|
| On Sep 8, 2015, at 3:52 AM, Simon Peyton Jones

On Sep 9, 2015, at 8:28 AM, Simon Peyton Jones
I think it'd be better to have
TYPE :: TypeShape -> *
data TypeShape = Unboxed | Boxed Levity data Levity = Lifted | Unlifted
Yes, of course.
So we really would get very little levity polymorphism ineed. I think.
That's right. The levity polymorphism is, essentially, only to have a nice type inference story. Once the code gets passed to the back end, the polymorphism would have to be removed. My idea was to use it to allow users to gloss (somewhat) over the ! vs. no-! distinction by having the compiler to the Right Thing during inference. Richard

| That's right. The levity polymorphism is, essentially, only to have a
| nice type inference story. Once the code gets passed to the back end,
| the polymorphism would have to be removed. My idea was to use it to
| allow users to gloss (somewhat) over the ! vs. no-! distinction by
| having the compiler to the Right Thing during inference.
Can you be more specific? What does "gloss over" mean?
S
| -----Original Message-----
| From: Richard Eisenberg [mailto:eir@cis.upenn.edu]
| Sent: 09 September 2015 13:35
| To: Simon Peyton Jones
| Cc: ghc-devs
| Subject: Re: Unlifted data types
|
|
| On Sep 9, 2015, at 8:28 AM, Simon Peyton Jones

On Sep 9, 2015, at 8:40 AM, Simon Peyton Jones
Can you be more specific? What does "gloss over" mean?
I mean that, for example, `length` will work over both strict lists and lazy lists. It will infer the strictness of its argument through ordinary type inference. So users have to be aware of strictness, but they will be able to use the same functions in both cases. Richard

| I mean that, for example, `length` will work over both strict lists | and lazy lists. It will infer the strictness of its argument through | ordinary type inference. So users have to be aware of strictness, but | they will be able to use the same functions in both cases. I didn't understand that. You mean that 'length' will be levity-polymorphic, but 'map' will not? What are the rules for determining which is which? (Which functions, exactly, can be levity-polymorphic???

On Sep 9, 2015, at 8:57 AM, Simon Peyton Jones
| I mean that, for example, `length` will work over both strict lists | and lazy lists. It will infer the strictness of its argument through | ordinary type inference. So users have to be aware of strictness, but | they will be able to use the same functions in both cases.
I didn't understand that. You mean that 'length' will be levity-polymorphic, but 'map' will not? What are the rules for determining which is which? (Which functions, exactly, can be levity-polymorphic???
No functions (excepting `error` and friends) are truly levity polymorphic. We use levity polymorphism in the types to get GHC to use its existing type inference to infer strictness. By the time type inference is done, we must ensure that no levity polymorphism remains, because the code generator won't be able to deal with it. For example, `map :: (a -> b) -> [a] -> [b]` has 4 places where levity polymorphism comes into play: the type a, the type b, the source list, and the result list. Any of these could be lazy or strict. And, I believe, all the combinations make sense. This means that we have one source Haskell declaration -- map -- that corresponds to 16 compiled functions. Obviously we wish to optimize somehow, and perhaps if we can't, this idea is bogus. But this is the basic idea. To be clear, nothing is different about `length` than `map` -- the fact that length is a consumer is utterly irrelevant in my example. Richard

On Wed, Sep 9, 2015 at 9:03 AM, Richard Eisenberg
No functions (excepting `error` and friends) are truly levity polymorphic.
I was talking with Ed Kmett about this yesterday, and he pointed out that this isn't true. There are a significant array of levity polymorphic functions having to do with reference types. They simply shuffle around pointers with the right calling convention, and don't really care what levity their arguments are, because they're just operating uniformly either way. So if we had: MVar# :: forall (l :: Levity). * -> TYPE (Boxed l) -> TYPE (Boxed Unlifted) then: takeMVar :: forall s (l :: Levity) (a :: TYPE (Boxed l)). MVar# s l a -> State# s -> (# State# s, a #) putMVar :: forall s (l :: Levity) (a :: Type (Boxed l)). MVar# s l a -> a -> State# s -> State# s are genuinely parametric in l. And the same is true for MutVar#, Array#, MutableArray#, etc. I think data type constructors are actually parametric, too (ignoring data with ! in them for the moment; the underlying constructors of those). Using a constructor just puts the pointers for the fields in the type, and matching on a constructor gives them back. They don't need to care whether their fields are lifted or not, they just preserve whatever the case is. But this:
We use levity polymorphism in the types to get GHC to use its existing type inference to infer strictness. By the time type inference is done, we must ensure that no levity polymorphism remains, because the code generator won't be able to deal with it.
Is not parametric polymorphism; it is ad-hoc polymorphism. It even has the defaulting step from type classes. Except the ad-hoc has been given the same notation as the genuinely parametric, so you can no longer identify the latter. (I'm not sure I'm a great fan of the ad-hoc part anyway, to be honest.) -- Dan

I think ultimately the two views of levity that we've been talking diverge
along the same lines as the pi vs forall discussion from your Levity
polymorphism talk.
I've been focused entirely on situations where forall suffices, and no
distinction is needed in how you compile for both levities.
Maybe could be polymorphic using a mere forall in the levity of the boxed
argument it carries as it doesn't care what it is, it never forces it,
pattern matching on it just gives it back when you pattern match on it.
Eq or Ord could just as easily work over anything boxed. The particular Eq
_instance_ needs to care about the levity.
Most of the combinators for working with Maybe do need to care about that
levity however.
e.g. consider fmap in Functor, the particular instances would care. Because
you ultimately wind up using fmap to build 'f a' values and those need to
know how the let binding should work. There seems to be a pi at work there.
Correct operational behavior would depend on the levity.
But if we look at what inference should probably grab for the levity of
Functor:
you'd get:
class Functor (l : Levity) (l' : Levity') (f :: GC l -> GC l') where
fmap :: forall a b. (a :: GC l) (b :: GC l). (a -> b) -> f a -> f b
Baed on the notion that given current practices, f would cause us to pick a
common kind for a and b, and the results of 'f'. Depending on how and if we
decided to default to * unless annotated in various situations would drive
this closer and closer to the existing Functor by default.
These are indeed distinct functors with distinct operational behavior, and
we could implement each of them by supplying separate instances, as the
levity would take part in the instance resolution like any other kind
argument.
Whether we could expect an average Haskeller to be willing to do so is an
entirely different matter.
-Edward
On Wed, Sep 9, 2015 at 12:44 PM, Dan Doel
No functions (excepting `error` and friends) are truly levity
On Wed, Sep 9, 2015 at 9:03 AM, Richard Eisenberg
wrote: polymorphic. I was talking with Ed Kmett about this yesterday, and he pointed out that this isn't true. There are a significant array of levity polymorphic functions having to do with reference types. They simply shuffle around pointers with the right calling convention, and don't really care what levity their arguments are, because they're just operating uniformly either way. So if we had:
MVar# :: forall (l :: Levity). * -> TYPE (Boxed l) -> TYPE (Boxed Unlifted)
then:
takeMVar :: forall s (l :: Levity) (a :: TYPE (Boxed l)). MVar# s l a -> State# s -> (# State# s, a #) putMVar :: forall s (l :: Levity) (a :: Type (Boxed l)). MVar# s l a -> a -> State# s -> State# s
are genuinely parametric in l. And the same is true for MutVar#, Array#, MutableArray#, etc.
I think data type constructors are actually parametric, too (ignoring data with ! in them for the moment; the underlying constructors of those). Using a constructor just puts the pointers for the fields in the type, and matching on a constructor gives them back. They don't need to care whether their fields are lifted or not, they just preserve whatever the case is.
But this:
We use levity polymorphism in the types to get GHC to use its existing type inference to infer strictness. By the time type inference is done, we must ensure that no levity polymorphism remains, because the code generator won't be able to deal with it.
Is not parametric polymorphism; it is ad-hoc polymorphism. It even has the defaulting step from type classes. Except the ad-hoc has been given the same notation as the genuinely parametric, so you can no longer identify the latter. (I'm not sure I'm a great fan of the ad-hoc part anyway, to be honest.)
-- Dan _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

These observations from Ed and Dan are quite helpful. Could one of you put them on the wiki page? I hadn't considered the possibility of truly parametric levity polymorphism.
Thanks!
Richard
On Sep 9, 2015, at 3:30 PM, Edward Kmett
I think ultimately the two views of levity that we've been talking diverge along the same lines as the pi vs forall discussion from your Levity polymorphism talk.
I've been focused entirely on situations where forall suffices, and no distinction is needed in how you compile for both levities.
Maybe could be polymorphic using a mere forall in the levity of the boxed argument it carries as it doesn't care what it is, it never forces it, pattern matching on it just gives it back when you pattern match on it.
Eq or Ord could just as easily work over anything boxed. The particular Eq _instance_ needs to care about the levity.
Most of the combinators for working with Maybe do need to care about that levity however.
e.g. consider fmap in Functor, the particular instances would care. Because you ultimately wind up using fmap to build 'f a' values and those need to know how the let binding should work. There seems to be a pi at work there. Correct operational behavior would depend on the levity.
But if we look at what inference should probably grab for the levity of Functor:
you'd get:
class Functor (l : Levity) (l' : Levity') (f :: GC l -> GC l') where fmap :: forall a b. (a :: GC l) (b :: GC l). (a -> b) -> f a -> f b
Baed on the notion that given current practices, f would cause us to pick a common kind for a and b, and the results of 'f'. Depending on how and if we decided to default to * unless annotated in various situations would drive this closer and closer to the existing Functor by default.
These are indeed distinct functors with distinct operational behavior, and we could implement each of them by supplying separate instances, as the levity would take part in the instance resolution like any other kind argument.
Whether we could expect an average Haskeller to be willing to do so is an entirely different matter.
-Edward
On Wed, Sep 9, 2015 at 12:44 PM, Dan Doel
wrote: On Wed, Sep 9, 2015 at 9:03 AM, Richard Eisenberg wrote: No functions (excepting `error` and friends) are truly levity polymorphic.
I was talking with Ed Kmett about this yesterday, and he pointed out that this isn't true. There are a significant array of levity polymorphic functions having to do with reference types. They simply shuffle around pointers with the right calling convention, and don't really care what levity their arguments are, because they're just operating uniformly either way. So if we had:
MVar# :: forall (l :: Levity). * -> TYPE (Boxed l) -> TYPE (Boxed Unlifted)
then:
takeMVar :: forall s (l :: Levity) (a :: TYPE (Boxed l)). MVar# s l a -> State# s -> (# State# s, a #) putMVar :: forall s (l :: Levity) (a :: Type (Boxed l)). MVar# s l a -> a -> State# s -> State# s
are genuinely parametric in l. And the same is true for MutVar#, Array#, MutableArray#, etc.
I think data type constructors are actually parametric, too (ignoring data with ! in them for the moment; the underlying constructors of those). Using a constructor just puts the pointers for the fields in the type, and matching on a constructor gives them back. They don't need to care whether their fields are lifted or not, they just preserve whatever the case is.
But this:
We use levity polymorphism in the types to get GHC to use its existing type inference to infer strictness. By the time type inference is done, we must ensure that no levity polymorphism remains, because the code generator won't be able to deal with it.
Is not parametric polymorphism; it is ad-hoc polymorphism. It even has the defaulting step from type classes. Except the ad-hoc has been given the same notation as the genuinely parametric, so you can no longer identify the latter. (I'm not sure I'm a great fan of the ad-hoc part anyway, to be honest.)
-- Dan _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

I've added a section on parametric levity polymorphism to the wiki.
Sorry it took so long.
I might add some thoughts about first-class `!a` being the only
semantic hole in our current strict data type situation later if I
remember to do so.
On Thu, Sep 10, 2015 at 10:26 AM, Richard Eisenberg
These observations from Ed and Dan are quite helpful. Could one of you put them on the wiki page? I hadn't considered the possibility of truly parametric levity polymorphism.
Thanks! Richard
On Sep 9, 2015, at 3:30 PM, Edward Kmett
wrote: I think ultimately the two views of levity that we've been talking diverge along the same lines as the pi vs forall discussion from your Levity polymorphism talk.
I've been focused entirely on situations where forall suffices, and no distinction is needed in how you compile for both levities.
Maybe could be polymorphic using a mere forall in the levity of the boxed argument it carries as it doesn't care what it is, it never forces it, pattern matching on it just gives it back when you pattern match on it.
Eq or Ord could just as easily work over anything boxed. The particular Eq _instance_ needs to care about the levity.
Most of the combinators for working with Maybe do need to care about that levity however.
e.g. consider fmap in Functor, the particular instances would care. Because you ultimately wind up using fmap to build 'f a' values and those need to know how the let binding should work. There seems to be a pi at work there. Correct operational behavior would depend on the levity.
But if we look at what inference should probably grab for the levity of Functor:
you'd get:
class Functor (l : Levity) (l' : Levity') (f :: GC l -> GC l') where fmap :: forall a b. (a :: GC l) (b :: GC l). (a -> b) -> f a -> f b
Baed on the notion that given current practices, f would cause us to pick a common kind for a and b, and the results of 'f'. Depending on how and if we decided to default to * unless annotated in various situations would drive this closer and closer to the existing Functor by default.
These are indeed distinct functors with distinct operational behavior, and we could implement each of them by supplying separate instances, as the levity would take part in the instance resolution like any other kind argument.
Whether we could expect an average Haskeller to be willing to do so is an entirely different matter.
-Edward
On Wed, Sep 9, 2015 at 12:44 PM, Dan Doel
wrote: On Wed, Sep 9, 2015 at 9:03 AM, Richard Eisenberg
wrote: No functions (excepting `error` and friends) are truly levity polymorphic.
I was talking with Ed Kmett about this yesterday, and he pointed out that this isn't true. There are a significant array of levity polymorphic functions having to do with reference types. They simply shuffle around pointers with the right calling convention, and don't really care what levity their arguments are, because they're just operating uniformly either way. So if we had:
MVar# :: forall (l :: Levity). * -> TYPE (Boxed l) -> TYPE (Boxed Unlifted)
then:
takeMVar :: forall s (l :: Levity) (a :: TYPE (Boxed l)). MVar# s l a -> State# s -> (# State# s, a #) putMVar :: forall s (l :: Levity) (a :: Type (Boxed l)). MVar# s l a -> a -> State# s -> State# s
are genuinely parametric in l. And the same is true for MutVar#, Array#, MutableArray#, etc.
I think data type constructors are actually parametric, too (ignoring data with ! in them for the moment; the underlying constructors of those). Using a constructor just puts the pointers for the fields in the type, and matching on a constructor gives them back. They don't need to care whether their fields are lifted or not, they just preserve whatever the case is.
But this:
We use levity polymorphism in the types to get GHC to use its existing type inference to infer strictness. By the time type inference is done, we must ensure that no levity polymorphism remains, because the code generator won't be able to deal with it.
Is not parametric polymorphism; it is ad-hoc polymorphism. It even has the defaulting step from type classes. Except the ad-hoc has been given the same notation as the genuinely parametric, so you can no longer identify the latter. (I'm not sure I'm a great fan of the ad-hoc part anyway, to be honest.)
-- Dan _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

| I've added a section on parametric levity polymorphism to the wiki. | Sorry it took so long. What's the wiki page? Simon | | I might add some thoughts about first-class `!a` being the only | semantic hole in our current strict data type situation later if I | remember to do so.

On Oct 8, 2015, at 6:02 AM, Simon Peyton Jones
What's the wiki page?

Hello,
I've added a section with my notes on the minimal semantics required
to address what Haskell lacks with respect to strict types.
Ed Kmett pointed me to some stuff that I think may fix all the
problems with the !T sort of solution. It builds on the new constraint
being considered for handling impredicativity. The quick sketch goes
like this. Given the declaration:
data Nat = Z | S !Nat
then:
Nat :: *
!Nat :: Unlifted
S :: Nat -> Nat
But we also have:
!Nat <~ Nat
and the witness of this is just an identity function, because all
values of type !Nat are legitimate values of type Nat. Then we can
have:
case n of
S m -> ...
Z -> ...
where m has type !Nat, but we can still call `S m` and the like,
because !Nat <~ Nat. If we do use `S m`, the S call will do some
unnecessary evaluation of m, but this can (hopefully) be fixed with an
optimization based on knowing that m has type !Nat, which we are
weakening to Nat.
Thoughts?
-- Dan
On Thu, Oct 8, 2015 at 8:36 AM, Richard Eisenberg
On Oct 8, 2015, at 6:02 AM, Simon Peyton Jones
wrote: What's the wiki page?

The idea of treating !S as a subtype of S and then relying on the potential
for new impredicativity machinery to let us just talk about how !S <= S
makes me really happy.
data Nat = Z | S !Nat
Pattern matching on S could give back the tighter type !Nat rather than Nat
for the argument, and if we ever have to show that to a user, the
'approximation' machinery would show it to users as Nat, concealing this
implementation detail. Similarly matching with an as-pattern as part of a
pattern that evaluates could do the same.
The constructor is a bit messier. It should really give back S :: Nat ->
Nat as what the constructor should behave as rather than S :: !Nat -> Nat,
because this will match existing behavior. Then the exposed constructor
would force the argument before storing it away, just like we do today and
we could recover via a sort of peephole optimization the elimination of the
jump into the closure to evaluate when it is fed something known to be of
type !Nat by some kind of "case/(!)-coercion" rule in the optimizer.
I'm partial to those parts of the idea and think it works pretty well.
I'm not sure how well it mixes with all the other discussions on levity
polymorphism though. Notably: Trying to get to having !Nat live in an
Unlifted kind, while Nat has a different kind seems likely to cause all
sorts of headaches. =/
-Edward
On Tue, Oct 27, 2015 at 7:42 PM, Dan Doel
Hello,
I've added a section with my notes on the minimal semantics required to address what Haskell lacks with respect to strict types.
Ed Kmett pointed me to some stuff that I think may fix all the problems with the !T sort of solution. It builds on the new constraint being considered for handling impredicativity. The quick sketch goes like this. Given the declaration:
data Nat = Z | S !Nat
then:
Nat :: * !Nat :: Unlifted S :: Nat -> Nat
But we also have:
!Nat <~ Nat
and the witness of this is just an identity function, because all values of type !Nat are legitimate values of type Nat. Then we can have:
case n of S m -> ... Z -> ...
where m has type !Nat, but we can still call `S m` and the like, because !Nat <~ Nat. If we do use `S m`, the S call will do some unnecessary evaluation of m, but this can (hopefully) be fixed with an optimization based on knowing that m has type !Nat, which we are weakening to Nat.
Thoughts?
-- Dan
On Thu, Oct 8, 2015 at 8:36 AM, Richard Eisenberg
wrote: On Oct 8, 2015, at 6:02 AM, Simon Peyton Jones
wrote:
What's the wiki page?
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

I'm out of bandwidth at the moment, but let me just remark that this is swampy territory. It's easy to mess up.
A particular challenge is polymorphism:
map :: forall a b. (a->b) -> [a] -> [b]
map f (x:xs) = (f x) : (map f xs)
In the compiled code for map, is a thunk built for (f x), or is it evaluated eagerly. Well, if you can instantiate the 'b' with !Int, say, then it should be evaluated eagerly. But if you instantiate with Int, then build a thunk. So, we really need two compiled versions of 'map'. Or perhaps four if we take 'b' into account. In general an exponential number.
That's one reason that GHC doesn't let you instantiate a polymorphic type variable with an unlifted type, even if it is boxed.
That is the reason that in "Types are calling conventions" we turn the entire intermediate language around, to make it call-by-value; and add a lazy type constructor. Not Int and (Strict Int), but rather Int and (Lazy Int).
Another big issue is that *any* mixture of subtyping and (Haskell-style) parametric polymorphism gets very complicated very fast. Especially when you add higher kinds. (Then you need variance annotations, and before long you want variance polymorphism.) I'm extremely dubious about adding subtyping to Haskell. That's one reason Scala is so complicated.
As a way to keep the discussion sane, I'd suggest focusing initially on the *intermediate language*, ignoring the source-language issues. That's what we did in "Types as calling conventions". There is a chance of writing out a complete syntax, a complete type system, and a complete operational semantics; and proving soundness etc. Once that foundation is done we can think about the source language.
Finally, there may be low-hanging fruit to be had. Maybe we can get a lot more benefit without completely re-imagining GHC! Let's do that first. (I think that some of the proposals on the wiki page were in that direction.) But even for the low hanging fruit, getting the intermediate language changes (Core) nailed first would be good.
Bur re-imagining GHC is good too. Swampy territory it may be, but it's also important, and there really *ought* to be a more seamless of combining strictness and laziness.
Simon
| -----Original Message-----
| From: Dan Doel [mailto:dan.doel@gmail.com]
| Sent: 27 October 2015 23:42
| To: Richard Eisenberg
| Cc: Simon Peyton Jones; ghc-devs
| Subject: Re: Unlifted data types
|
| Hello,
|
| I've added a section with my notes on the minimal semantics required to
| address what Haskell lacks with respect to strict types.
|
| Ed Kmett pointed me to some stuff that I think may fix all the problems with
| the !T sort of solution. It builds on the new constraint being considered
| for handling impredicativity. The quick sketch goes like this. Given the
| declaration:
|
| data Nat = Z | S !Nat
|
| then:
|
| Nat :: *
| !Nat :: Unlifted
| S :: Nat -> Nat
|
| But we also have:
|
| !Nat <~ Nat
|
| and the witness of this is just an identity function, because all values of
| type !Nat are legitimate values of type Nat. Then we can
| have:
|
| case n of
| S m -> ...
| Z -> ...
|
| where m has type !Nat, but we can still call `S m` and the like, because
| !Nat <~ Nat. If we do use `S m`, the S call will do some unnecessary
| evaluation of m, but this can (hopefully) be fixed with an optimization
| based on knowing that m has type !Nat, which we are weakening to Nat.
|
| Thoughts?
|
| -- Dan
|
|
| On Thu, Oct 8, 2015 at 8:36 AM, Richard Eisenberg

On Wed, Oct 28, 2015 at 5:05 AM, Simon Peyton Jones
I'm out of bandwidth at the moment, but let me just remark that this is swampy territory. It's easy to mess up.
A particular challenge is polymorphism:
map :: forall a b. (a->b) -> [a] -> [b] map f (x:xs) = (f x) : (map f xs)
In the compiled code for map, is a thunk built for (f x), or is it evaluated eagerly. Well, if you can instantiate the 'b' with !Int, say, then it should be evaluated eagerly. But if you instantiate with Int, then build a thunk. So, we really need two compiled versions of 'map'. Or perhaps four if we take 'b' into account. In general an exponential number.
That's one reason that GHC doesn't let you instantiate a polymorphic type
variable with an unlifted type, even if it is boxed.
This is one of the things we'd like to be able to fix. Right now I have a small explosion of code going on that is being duplicated over and over to parameterize over different unlifted types. In the discussions about levity/lifting so far Dan and I have been trying to tease apart what cases can be handled "forall" style rather than "pi" style to borrow the split from Richard's presentation, just to get at a sense of what really could be talked about without needing different calling conventions, despite lifting. There are situations where we are truly polymorphic in lifting, e.g. (==) from Eq and compare from Ord don't care if the arguments of type 'a' are lifted or not. Until you go to write a function application that returns a value of that type. If all you do is rearrange them then that machinery can be parametric in the choice. `map` on the other hand, cares about the selection because of the `f x` application. (Similarly, `min` and `max` from Ord do care about the convention on hand.) One could see a world wherein you could parameterize such an instance on levity explicitly, but it is pretty exhausting to think about.
Another big issue is that *any* mixture of subtyping and (Haskell-style) parametric polymorphism gets very complicated very fast. Especially when you add higher kinds. (Then you need variance annotations, and before long you want variance polymorphism.) I'm extremely dubious about adding subtyping to Haskell. That's one reason Scala is so complicated.
I was actually quite surprised to see a subtyping relationship rear its head in: https://ghc.haskell.org/trac/ghc/attachment/wiki/ImpredicativePolymorphism/I... But re-imagining GHC is good too. Swampy territory it may be, but it's
also important, and there really *ought* to be a more seamless of combining strictness and laziness.
I'm somewhat dubious of most approaches that try to mix strictness and laziness under one umbrella. That is why trying to tease out the small handful of cases where we are truly parametric in levity seems interesting. Finding out some situations existed where we really don't care if a type is lifted or not was eye opening to me personally, at least. -Edward
| -----Original Message----- | From: Dan Doel [mailto:dan.doel@gmail.com] | Sent: 27 October 2015 23:42 | To: Richard Eisenberg | Cc: Simon Peyton Jones; ghc-devs | Subject: Re: Unlifted data types | | Hello, | | I've added a section with my notes on the minimal semantics required to | address what Haskell lacks with respect to strict types. | | Ed Kmett pointed me to some stuff that I think may fix all the problems with | the !T sort of solution. It builds on the new constraint being considered | for handling impredicativity. The quick sketch goes like this. Given the | declaration: | | data Nat = Z | S !Nat | | then: | | Nat :: * | !Nat :: Unlifted | S :: Nat -> Nat | | But we also have: | | !Nat <~ Nat | | and the witness of this is just an identity function, because all values of | type !Nat are legitimate values of type Nat. Then we can | have: | | case n of | S m -> ... | Z -> ... | | where m has type !Nat, but we can still call `S m` and the like, because | !Nat <~ Nat. If we do use `S m`, the S call will do some unnecessary | evaluation of m, but this can (hopefully) be fixed with an optimization | based on knowing that m has type !Nat, which we are weakening to Nat. | | Thoughts? | | -- Dan | | | On Thu, Oct 8, 2015 at 8:36 AM, Richard Eisenberg
wrote: | > | > On Oct 8, 2015, at 6:02 AM, Simon Peyton Jones | wrote: | > | >> What's the wiki page? | > | > https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

I don't have terribly much to add, but I do want to counter one point:
On Oct 28, 2015, at 5:48 AM, Edward Kmett
There are situations where we are truly polymorphic in lifting, e.g. (==) from Eq and compare from Ord don't care if the arguments of type 'a' are lifted or not.
But these do, I think. In running code, if (==) is operating over a lazy type, it has to check if the pointer points to a thunk. If (==) is operating over a strict one, it can skip the check. This is not a big difference, but it *is* a difference. A little more thinking about this has led here: The distinction isn't really forall vs. pi. That is, in the cases where the levity matters, we don't really need to pi-quantify. Instead, it's exactly like type classes. Imagine, for a moment, that we have an alternate universe where strict is the default, and we have
data Lazy a = Thunk (() -> a) | WHNF a
The WHNF is a bit of a lie, because this representation would mean that the contents of a WHNF are fully evaluated. But let's not get hung up on that point. Then, we have
type family BaseType a where BaseType (Lazy a) = a BaseType a = a
class Forceable a where force :: a -> BaseType a
instance Forceable a where force = id
instance Forceable (Lazy a) where force (Thunk f) = f () force (WHNF a) = a
Things that need to behave differently depending on strictness just take a dictionary of the Forceable class. Equivalently, they make a runtime decision of whether to force or not. So we don't need an exponential number of maps. We need a map that makes some runtime decisions. And the optimizer can be taught that specializing these decisions may be helpful. Of course, none of this would be exposed. And I'm not suggesting that we need to make the core language strict-by-default. But it clarified things, for me at least, when I realized this is regular old ad-hoc polymorphism, not something fancier. Richard

On Wed, Oct 28, 2015 at 9:19 AM, Richard Eisenberg
I don't have terribly much to add, but I do want to counter one point:
On Oct 28, 2015, at 5:48 AM, Edward Kmett
wrote: There are situations where we are truly polymorphic in lifting, e.g. (==) from Eq and compare from Ord don't care if the arguments of type 'a' are lifted or not.
But these do, I think. In running code, if (==) is operating over a lazy type, it has to check if the pointer points to a thunk. If (==) is operating over a strict one, it can skip the check. This is not a big difference, but it *is* a difference.
Yes, but this is the job of the particular instance. Remember the instance gets to know the type it is working at, and its corresponding levity. class Eq (l :: Levity) (t :: Type l) where (==) :: a -> a -> Bool instance Eq @Unlifted (SmallMutableArray s a) where (==) = sameSmallMutableArray instance Eq @Lifted [] where (==) = ... Your objection arises for things like instance Eq @l (Foo @l) Where the same code has to execute with different levities, but if I can't even case or seq on a value with polymorphic levity, and can't construct such a value but merely pass it around then such code is still sound. It isn't safe to write functions that return values of polymorphic levity. I can however hand them back as (# a #). This is how we handle indexing into a array today. If we had a Maybe that was levity polymorphic in its argument Maybe :: forall (l :: Levity). Type l -> Lifted instance Eq @l a => Eq @Lifted (Maybe @l a) where Just a == Just b = a == b _ == _ = False is still okay under these rules, it never case analyzes a value of polymorphic levity, never seq's it. Neither of those things is legal because you can't 'enter' the closure. If it was levity polymorphic in the result type Maybe :: forall (i :: Levity) (j :: Levity). Type i -> Type j then your objection comes true. I can't naively write: instance Eq @i a => Eq j (Maybe @i @j a) where Just a == Just b = a == b _ == _ = False without compiling the same code twice, because of the act of case analysis. If we don't have real 'strict data types' in Lifted this situation never arises though. Even if we do I can write separate: instance Eq @i a => Eq Lifted (Maybe @i Lifted a) instance Eq @i a => Eq Unlifted (Maybe @i Unlifted a) instances, unless we can do better by writing a class constraint on the levity that we can use in a clever way here. I'm mostly concerned with the middle case where we don't overload data types on their levity, and try to recover the ability to talk about strict data types by other more explicit means, but rather permit them to accept arguments of different levities. There none of the code I care about actually needs to act differently based on levity. Anything that results in a function space there has to care about levity, but until a type occurs on the right hand side of an (->) or I go to seq a value of that type or case analyze it, I don't need to care about if its lifted or unlifted. With Dan's (!S) then things get more complicated in ways I don't fully understand the ramifications of yet, as you might be able to lift some of those restrictions. A little more thinking about this has led here: The distinction isn't
really forall vs. pi. That is, in the cases where the levity matters, we don't really need to pi-quantify. Instead, it's exactly like type classes.
In many ways pi comes down to doing typeclass like things, you're tracking information from the type system. The vehicle we have for doing that today is typeclasses. I've been thinking about anything that i have that actually needs the "pi" there as a form of "constraint" passing all along, with the constraint being whatever introspection you need to allow on the type to carry on. Imagine, for a moment, that we have an alternate universe where strict is
the default, and we have
data Lazy a = Thunk (() -> a) | WHNF a
The WHNF is a bit of a lie, because this representation would mean that the contents of a WHNF are fully evaluated. But let's not get hung up on that point.
Then, we have
type family BaseType a where BaseType (Lazy a) = a BaseType a = a
class Forceable a where force :: a -> BaseType a
instance Forceable a where force = id
instance Forceable (Lazy a) where force (Thunk f) = f () force (WHNF a) = a
Things that need to behave differently depending on strictness just take a dictionary of the Forceable class. Equivalently, they make a runtime decision of whether to force or not. So we don't need an exponential number of maps. We need a map that makes some runtime decisions. And the optimizer can be taught that specializing these decisions may be helpful.
Of course, none of this would be exposed. And I'm not suggesting that we need to make the core language strict-by-default. But it clarified things, for me at least, when I realized this is regular old ad-hoc polymorphism, not something fancier.
Yep. -Edward

On Wed, Oct 28, 2015 at 5:05 AM, Simon Peyton Jones
I'm out of bandwidth at the moment, but let me just remark that this is swampy territory. It's easy to mess up.
A particular challenge is polymorphism:
map :: forall a b. (a->b) -> [a] -> [b] map f (x:xs) = (f x) : (map f xs)
In the compiled code for map, is a thunk built for (f x), or is it evaluated eagerly. Well, if you can instantiate the 'b' with !Int, say, then it should be evaluated eagerly. But if you instantiate with Int, then build a thunk. So, we really need two compiled versions of 'map'. Or perhaps four if we take 'b' into account. In general an exponential number.
You can't insantiate `b` to `!Int` without map quantifying over levity, which is somewhat orthogonal (although it would have interactions). Without that, b has kind *, and !Int has kind Unlifted, so this is a kind error. Or, map is explicitly written having b :: Unlifted, in which case it already has the right calling convention.
That's one reason that GHC doesn't let you instantiate a polymorphic type variable with an unlifted type, even if it is boxed.
It's a reason to not put boxed, unlifted types in *. I don't think it's a reason not to allow quantifying over a separate kind of uniformly represented, unlifted types (i.e. doesn't include Int#, Double#, etc.).
Another big issue is that *any* mixture of subtyping and (Haskell-style) parametric polymorphism gets very complicated very fast. Especially when you add higher kinds. (Then you need variance annotations, and before long you want variance polymorphism.) I'm extremely dubious about adding subtyping to Haskell. That's one reason Scala is so complicated.
I'm not sure how much of this would be required. But how are you saved from this in the impredicativity proposal, which is already attempting to handle more of the induced subtyping of higher-rank types? I would guess it's mostly a matter of, "we're not going to support all that." Which might be fine for the unlifted stuff as well. Richard wrote:
But these do, I think. In running code, if (==) is operating over a lazy type, it has to check if the pointer points to a thunk. If (==) is operating over a strict one, it can skip the check. This is not a big difference, but it *is* a difference.
Yeah. It's not that (==) is parametric really. It's that Eq might be able to be parameterized on levity. But this is very useful, since it allows you to instantiate Eq on a bunch of unlifted reference types instead of having to use one-off functions like sameMutableArray#.
A little more thinking about this has led here: The distinction isn't really forall vs. pi. That is, in the cases where the levity matters, we don't really need to pi-quantify. Instead, it's exactly like type classes.
Yes, I think so, too. It is ad-hoc polymorphism that can be handled by type classes. There is no reason for an explicit value-level witness of levity that can be cased on, which is presumably what pi would allow.

https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes
On Thu, Oct 8, 2015 at 6:02 AM, Simon Peyton Jones
| I've added a section on parametric levity polymorphism to the wiki. | Sorry it took so long.
What's the wiki page?
Simon | | I might add some thoughts about first-class `!a` being the only | semantic hole in our current strict data type situation later if I | remember to do so. _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Would this allow having a strict monoid instance for maybe, given the right
hinting at the use site?
On Wednesday, September 9, 2015, Edward Kmett
I think ultimately the two views of levity that we've been talking diverge along the same lines as the pi vs forall discussion from your Levity polymorphism talk.
I've been focused entirely on situations where forall suffices, and no distinction is needed in how you compile for both levities.
Maybe could be polymorphic using a mere forall in the levity of the boxed argument it carries as it doesn't care what it is, it never forces it, pattern matching on it just gives it back when you pattern match on it.
Eq or Ord could just as easily work over anything boxed. The particular Eq _instance_ needs to care about the levity.
Most of the combinators for working with Maybe do need to care about that levity however.
e.g. consider fmap in Functor, the particular instances would care. Because you ultimately wind up using fmap to build 'f a' values and those need to know how the let binding should work. There seems to be a pi at work there. Correct operational behavior would depend on the levity.
But if we look at what inference should probably grab for the levity of Functor:
you'd get:
class Functor (l : Levity) (l' : Levity') (f :: GC l -> GC l') where fmap :: forall a b. (a :: GC l) (b :: GC l). (a -> b) -> f a -> f b
Baed on the notion that given current practices, f would cause us to pick a common kind for a and b, and the results of 'f'. Depending on how and if we decided to default to * unless annotated in various situations would drive this closer and closer to the existing Functor by default.
These are indeed distinct functors with distinct operational behavior, and we could implement each of them by supplying separate instances, as the levity would take part in the instance resolution like any other kind argument.
Whether we could expect an average Haskeller to be willing to do so is an entirely different matter.
-Edward
On Wed, Sep 9, 2015 at 12:44 PM, Dan Doel
javascript:_e(%7B%7D,'cvml','dan.doel@gmail.com');> wrote: No functions (excepting `error` and friends) are truly levity
On Wed, Sep 9, 2015 at 9:03 AM, Richard Eisenberg
javascript:_e(%7B%7D,'cvml','eir@cis.upenn.edu');> wrote: polymorphic. I was talking with Ed Kmett about this yesterday, and he pointed out that this isn't true. There are a significant array of levity polymorphic functions having to do with reference types. They simply shuffle around pointers with the right calling convention, and don't really care what levity their arguments are, because they're just operating uniformly either way. So if we had:
MVar# :: forall (l :: Levity). * -> TYPE (Boxed l) -> TYPE (Boxed Unlifted)
then:
takeMVar :: forall s (l :: Levity) (a :: TYPE (Boxed l)). MVar# s l a -> State# s -> (# State# s, a #) putMVar :: forall s (l :: Levity) (a :: Type (Boxed l)). MVar# s l a -> a -> State# s -> State# s
are genuinely parametric in l. And the same is true for MutVar#, Array#, MutableArray#, etc.
I think data type constructors are actually parametric, too (ignoring data with ! in them for the moment; the underlying constructors of those). Using a constructor just puts the pointers for the fields in the type, and matching on a constructor gives them back. They don't need to care whether their fields are lifted or not, they just preserve whatever the case is.
But this:
We use levity polymorphism in the types to get GHC to use its existing type inference to infer strictness. By the time type inference is done, we must ensure that no levity polymorphism remains, because the code generator won't be able to deal with it.
Is not parametric polymorphism; it is ad-hoc polymorphism. It even has the defaulting step from type classes. Except the ad-hoc has been given the same notation as the genuinely parametric, so you can no longer identify the latter. (I'm not sure I'm a great fan of the ad-hoc part anyway, to be honest.)
-- Dan _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org javascript:_e(%7B%7D,'cvml','ghc-devs@haskell.org'); http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

On 11/09/15 06:22, Carter Schonwald wrote:
Would this allow having a strict monoid instance for maybe, given the right hinting at the use site?
That's a fantastic idea, especially if it could be generalized to Applicative functors, where the problem of "inner laziness" is pervasive. But that'd be tricky, because functions have the Lifted kind, and so <*> would have to be crazily levity-polymorphic. (Or is this not crazy?) Roman

On Sep 11, 2015, at 4:28 AM, Roman Cheplyaka
On 11/09/15 06:22, Carter Schonwald wrote:
Would this allow having a strict monoid instance for maybe, given the right hinting at the use site?
That's a fantastic idea, especially if it could be generalized to Applicative functors, where the problem of "inner laziness" is pervasive.
But that'd be tricky, because functions have the Lifted kind, and so <*> would have to be crazily levity-polymorphic. (Or is this not crazy?)
No more crazy than other things. Right now, we have (<*>) :: forall (a :: *) (b :: *) (f :: * -> *). Applicative f => f (a -> b) -> f a -> f b Under this proposal, we would have (ignore the Boxity stuff) (<*>) :: forall (v1 :: Levity) (v2 :: Levity) (v3 :: Levity) (a :: TYPE v1) (b :: TYPE v2) (f :: forall (v4 :: Levity). TYPE v4 -> TYPE v3). Applicative f => f @'Lifted (a -> b) -> f @v1 a -> f @v2 b The higher-rank levity-polymorphism is necessary in case `a` and `b` have different levities. This may be getting wildly out-of-hand, but I don't think it's actually breaking. I would like to point out that using forall here is really quite wrong. As others have pointed out, levity polymorphism is ad-hoc polymorphism, not parametric. Using 'pi' would be much closer to it, but it implies the existence of more dependent types than we really need for this. Richard
Roman
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

I'm not so sure how useful an observation this is, but Dunfield had a paper at this very ICFP "Elaborating Evaluation-Order Polymorphism". He argues that polymorphism over evaluation order should be thought of as a form of intersection type. Edward Excerpts from Richard Eisenberg's message of 2015-09-14 08:59:36 -0700:
On Sep 11, 2015, at 4:28 AM, Roman Cheplyaka
wrote: On 11/09/15 06:22, Carter Schonwald wrote:
Would this allow having a strict monoid instance for maybe, given the right hinting at the use site?
That's a fantastic idea, especially if it could be generalized to Applicative functors, where the problem of "inner laziness" is pervasive.
But that'd be tricky, because functions have the Lifted kind, and so <*> would have to be crazily levity-polymorphic. (Or is this not crazy?)
No more crazy than other things. Right now, we have
(<*>) :: forall (a :: *) (b :: *) (f :: * -> *). Applicative f => f (a -> b) -> f a -> f b
Under this proposal, we would have (ignore the Boxity stuff)
(<*>) :: forall (v1 :: Levity) (v2 :: Levity) (v3 :: Levity) (a :: TYPE v1) (b :: TYPE v2) (f :: forall (v4 :: Levity). TYPE v4 -> TYPE v3). Applicative f => f @'Lifted (a -> b) -> f @v1 a -> f @v2 b
The higher-rank levity-polymorphism is necessary in case `a` and `b` have different levities. This may be getting wildly out-of-hand, but I don't think it's actually breaking.
I would like to point out that using forall here is really quite wrong. As others have pointed out, levity polymorphism is ad-hoc polymorphism, not parametric. Using 'pi' would be much closer to it, but it implies the existence of more dependent types than we really need for this.
Richard
Roman
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

On 09/09/2015 13:28, Simon Peyton Jones wrote:
The awkward spot is the runtime system. Currently it goes to some lengths to ensure that it never introduces an indirection for a boxed-but-unlifted type. Simon Marlow would know exactly where. So I suspect that we WOULD need two versions of each (levity-polymorphic) data constructor, alas. And we'd need to know which version to use at every call site, which means the code would need to be levity-monomorphic. So we really would get very little levity polymorphism ineed. I think.
I *think* we're ok here. The RTS doesn't have any special machinery to avoid indirections to unlifted things that I'm aware of. Did you have a particular problem in mind? Indirections appear in various ways, but always as the result of a thunk being there in the first place - updates, selector thunks, and suspending duplicate work (during parallel evaluation) are all thunk-related. So does that mean data constructors can be levity-polymorphic? I don't see why not, but maybe I'm missing something. Cheers Simon
All this fits nicely with Dan Doel's point about making ! a first class type constructor.
Simon
| -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of | Richard Eisenberg | Sent: 08 September 2015 14:46 | To: ghc-devs | Subject: Re: Unlifted data types | | I have put up an alternate set of proposals on | | https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes | | These sidestep around `Force` and `suspend` but probably have other | problems. They make heavy use of levity polymorphism. | | Back story: this all was developed in a late-evening Haskell Symposium | session that took place in the hotel bar. It seems Edward and I walked | away with quite different understandings of what had taken place. I've | written up my understanding. Most likely, the Right Idea is a | combination of this all! | | See what you think. | | Thanks! | Richard | | On Sep 8, 2015, at 3:52 AM, Simon Peyton Jones
| wrote: | | > | And to | > | be honest, I'm not sure we need arbitrary data types in Unlifted; | > | Force (which would be primitive) might be enough. | > | > That's an interesting thought. But presumably you'd have to use | 'suspend' (a terrible name) a lot: | > | > type StrictList a = Force (StrictList' a) data StrictList' a = Nil | | > Cons !a (StrictList a) | > | > mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f xs | = | > mapStrict' f (suspend xs) | > | > mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b mapStrict' | f | > Nil = Nil mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs) | > | > | > That doesn't look terribly convenient. | > | > | ensure that threads don't simply | > | pass thunks between each other. But, if you have unlifted types, | > | then you can have: | > | | > | data UMVar (a :: Unlifted) | > | | > | and then the type rules out the possibility of passing thunks | > | through a reference (at least at the top level). | > | > Really? Presumably UMVar is a new primitive? With a family of | operations like MVar? If so can't we just define | > newtype UMVar a = UMV (MVar a) | > putUMVar :: UMVar a -> a -> IO () | > putUMVar (UMVar v) x = x `seq` putMVar v x | > | > I don't see Force helping here. | > | > Simon | > _______________________________________________ | > ghc-devs mailing list | > ghc-devs@haskell.org | > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs | | _______________________________________________ | ghc-devs mailing list | ghc-devs@haskell.org | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

| > The awkward spot is the runtime system. Currently it goes to some | > lengths to ensure that it never introduces an indirection for a | > boxed-but-unlifted type. Simon Marlow would know exactly where. So | | I *think* we're ok here. The RTS doesn't have any special machinery | to avoid indirections to unlifted things that I'm aware of. Did you | have a particular problem in mind? Well I can't point to anything very specific. I just recall that in various places, if a pointer was to an Array# we would immediately, eagerly, recurse in the GC rather than add the Array# to the queue for later processing. Maybe that is no longer true. Maybe it was never true. It shouldn't be hard to find out. If true, there would be a run-time system test that returns true for boxed-but-unlifted heap objects. I think it would be worth a look because if I'm right it could have a significant impact on the design. Simon

On 10/09/2015 10:41, Simon Peyton Jones wrote:
| > The awkward spot is the runtime system. Currently it goes to some | > lengths to ensure that it never introduces an indirection for a | > boxed-but-unlifted type. Simon Marlow would know exactly where. So | | I *think* we're ok here. The RTS doesn't have any special machinery | to avoid indirections to unlifted things that I'm aware of. Did you | have a particular problem in mind?
Well I can't point to anything very specific. I just recall that in various places, if a pointer was to an Array# we would immediately, eagerly, recurse in the GC rather than add the Array# to the queue for later processing. Maybe that is no longer true. Maybe it was never true.
Maybe that was an earlier variant of the GC. The current GC just treats unlifted objects like other objects in a breadth-first way, and it never introduces indirections except when reducing a selector thunk. So I think this is ok. Cheers Simon
It shouldn't be hard to find out. If true, there would be a run-time system test that returns true for boxed-but-unlifted heap objects.
I think it would be worth a look because if I'm right it could have a significant impact on the design. > Simon

On Tue, Sep 8, 2015 at 3:52 AM, Simon Peyton Jones
| And to | be honest, I'm not sure we need arbitrary data types in Unlifted; | Force (which would be primitive) might be enough.
That's an interesting thought. But presumably you'd have to use 'suspend' (a terrible name) a lot:
type StrictList a = Force (StrictList' a) data StrictList' a = Nil | Cons !a (StrictList a)
mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f xs = mapStrict' f (suspend xs)
mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b mapStrict' f Nil = Nil mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs)
That doesn't look terribly convenient.
It's missing the part that makes it convenient. type StrictList a = Force (StrictList' a) data StrictList' a = Nil' | Cons' !a (StrictList a) pattern Nil = Force Nil' pattern Cons x xs = Force (Cons' x xs) mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f Nil = Nil mapStrict f (Cons x xs) = Cons (f x) (mapStrict f xs) But, really, my point is that we already almost have StrictList _today_: data StrictList a = Nil | Cons !a !(StrictList a) The only difference between this and the previous definition (denotationally, at least) is the outer-most level. That's why I liked the original proposal (which probably disappeared too fast for most people to read it), which was more like being able to talk about `!a` as a thing in itself. It's the only semantic gap in being able to define totally unlifted data types right now. So maybe it's also the only operational gap that needs to be plugged, as well. But that was vetoed because `!a` in a data declaration doesn't make a constructor with type `!a -> ...`, but `a -> ...` which evaluates.
Really? Presumably UMVar is a new primitive? With a family of operations like MVar? If so can't we just define newtype UMVar a = UMV (MVar a) putUMVar :: UMVar a -> a -> IO () putUMVar (UMVar v) x = x `seq` putMVar v x
I don't see Force helping here.
Yes, true. It only helps ensure that the implementation is correct, rather than enabling a previously impossible construction. Kind of like certain uses of GADTs vs. phantom types. But the ArrayArray people already want UMVar (and the like) anyway, because it cuts out a layer of indirection for types that are already unlifted. -- Dan

| (denotationally, at least) is the outer-most level. That's why I liked
| the original proposal (which probably disappeared too fast for most
| people to read it), which was more like being able to talk about `!a`
| as a thing in itself. It's the only semantic gap in being able to
That's another interesting idea. Currently ! is an annotation on a data constructor argument (only). It is not a type-former; so that !Int or !(Int ->Int) are not types. I think your point is that if we spell "Force" as "!" then it becomes a first-class type former. Let's try that:
* t and !t are distinct types
* You get from one to the other by using ! as a term-level data constructor, so you can use it in terms and in pattern matching. Thus
- if (e::t) then (!e :: !t)
- if (e::!t) then case of !x -> blah
here (x::t)
* As with newtype constructors, pattern-matching on !t doesn't imply evaluation; after all, it's already evaluated.
* In Richard's notation, the type constructor (!) has kind
(!) :: TYPE 'Boxed l -> TYPE 'Boxed Unlifted
The wrinkle for data types is this. The declaration
data T = MkT !Int !a
produces a data constructor with type
MkT :: forall a. Int -> a -> T
But if (!t) is a first-class type, then you'd expect to get a data constructor with type
MkT :: forall a. !Int -> !a -> T
and that in turn would force many calls to look like (MkT !e1 !e2).
But we could re-cast the special treatment for the top-level bang on data constructor arguments, to say that we get a worker/wrapper story so that the programmer's eye view is indeed that MkT :: forall a. Int -> a -> T, and the compiler generates the eval code to match things up. (Which is what happens now.)
But if you wrote, say,
data S = MkS (!Int, !a)
then this magic would not happen and you really would get
MkS :: (!Int, !a) -> S
Interesting.
Simon
| -----Original Message-----
| From: Dan Doel [mailto:dan.doel@gmail.com]
| Sent: 09 September 2015 03:44
| To: Simon Peyton Jones
| Cc: Edward Z. Yang; ghc-devs
| Subject: Re: Unlifted data types
|
| On Tue, Sep 8, 2015 at 3:52 AM, Simon Peyton Jones
|

Hello Simon,
There are several distinct things being mixed up.
I've split the document into three (four?) distinct subproposals. Proposals 1 and 2 stand alone.
(1) First, a proposal to allow a data type to be declared to be unlifted. On its own, this is a pretty simple proposal: [snip]
I would really like to see this articulated as a stand-alone proposal. It makes sense by itself, and is really pretty simple.
This is now "Proposal 1".
(2) Second, we cannot expect levity polymorphism. Consider map f (x:xs) = f x : map f xs Is the (f x) a thunk or is it evaluated strictly? Unless you are going to clone the code for map (which levity polymorphism is there to avoid), we can't answer "it depends on the type of (f x)". So, no, I think levity polymorphism is out.
So I vote against splitting # into two: plain will do just fine.
Levity polymorphism will not work without generating two copies of 'map', but plain polymorphism over 'Unlifted' is useful (as Dan has also pointed out.) In any case, I've extracted this out into a separate subproposal "Proposal 1.1". https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes#Proposal1.1:Polymorp... (reordering here.)
(4) Fourth, you don't mention a related suggestion, namely to allow newtype T = MkT Int# with T getting kind #. I see no difficulty here. We do have (T ~R Int#). It's just a useful way of wrapping a newtype around an unlifted type.
This is now "Proposal 2".
(3) Third, the stuff about Force and suspend. Provided you do no more than write library code that uses the above new features I'm fine. But there seems to be lots of stuff that dances around the hope that (Force a) is represented the same way as 'a'. I don't' know how to make this fly. Is there a coercion in FC? If so then (a ~R Force a). And that seems very doubtful since we must do some evaluation.
I agree that we can't introduce a coercion between 'Force a' and 'a', for the reason you mentioned. (there's also a second reason which is that 'a ~R Force a' is not well-typed; 'a' and 'Force a' have different kinds.) I've imagined that we might be able to just continue representing Force explicitly in Core, and somehow "compile it away" at STG time, but I am definitely fuzzy about how this is supposed to work. Perhaps Force should not actually be a data type, and we should have 'force# :: a -> Force a' and 'unforce# :: Force a -> a' (the latter of which compiles to a no-op.) Cheers, Edward

| I've split the document into three (four?) distinct subproposals. | Proposals 1 and 2 stand alone. I've re-numbered them 1,2,3,4, since 1.1 is (to me) a pretty major deal, stands in its own right, and certainly isn't a sub-proposal of (1). Under (new) 2, I'm very dubious about "Boxed levity polymorphism in types (and functions with extra code generation)". It's certainly true that we could generate two copied of the code for every function; but by generating three, or perhaps four copies we could also deal with Int# and Float#. Maybe one more for Double#. .NET does this on the fly, incidentally. Where do you stop? Also remember it's not just an issue of GC pointers. The semantics of the function changes, because things that are thunks for the lifted version become strict in the unlifted version. Your 'umap' is a bit more convincing. But for now (2) would be low on my priority list, until we encounter user pressure which (note) we have not encountered so far despite the range of boxed but unlifted types. Why is now the right time? (1) and (3) seem solid. I'll leave (4) for another message. Simon | -----Original Message----- | From: Edward Z. Yang [mailto:ezyang@mit.edu] | Sent: 07 September 2015 22:36 | To: Simon Peyton Jones | Cc: ghc-devs | Subject: RE: Unlifted data types | | Hello Simon, | | > There are several distinct things being mixed up. | | I've split the document into three (four?) distinct subproposals. | Proposals 1 and 2 stand alone. | | > (1) First, a proposal to allow a data type to be declared to be | unlifted. On its own, this is a pretty simple proposal: [snip] | > | > I would really like to see this articulated as a stand-alone proposal. | It makes sense by itself, and is really pretty simple. | | This is now "Proposal 1". | | > (2) Second, we cannot expect levity polymorphism. Consider | > map f (x:xs) = f x : map f xs | > Is the (f x) a thunk or is it evaluated strictly? Unless you are going | to clone the code for map (which levity polymorphism is there to avoid), | we can't answer "it depends on the type of (f x)". So, no, I think | levity polymorphism is out. | > | > So I vote against splitting # into two: plain will do just fine. | | Levity polymorphism will not work without generating two copies of | 'map', but plain polymorphism over 'Unlifted' is useful (as Dan has | also pointed out.) In any case, I've extracted this out into a | separate subproposal "Proposal 1.1". | https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes#Proposal1.1:Polym | orphismoveranewUnliftedkind | | (reordering here.) | | > (4) Fourth, you don't mention a related suggestion, namely to allow | > newtype T = MkT Int# | > with T getting kind #. I see no difficulty here. We do have (T ~R | Int#). It's just a useful way of wrapping a newtype around an unlifted | type. | | This is now "Proposal 2". | | > (3) Third, the stuff about Force and suspend. Provided you do no more | than write library code that uses the above new features I'm fine. But | there seems to be lots of stuff that dances around the hope that (Force | a) is represented the same way as 'a'. I don't' know how to make this | fly. Is there a coercion in FC? If so then (a ~R Force a). And that | seems very doubtful since we must do some evaluation. | | I agree that we can't introduce a coercion between 'Force a' and | 'a', for the reason you mentioned. (there's also a second reason which | is that 'a ~R Force a' is not well-typed; 'a' and 'Force a' have | different kinds.) | | I've imagined that we might be able to just continue representing | Force explicitly in Core, and somehow "compile it away" at STG time, | but I am definitely fuzzy about how this is supposed to work. Perhaps | Force should not actually be a data type, and we should have | 'force# :: a -> Force a' and 'unforce# :: Force a -> a' (the latter | of which compiles to a no-op.) | | Cheers, | Edward

| I agree that we can't introduce a coercion between 'Force a' and | 'a', for the reason you mentioned. (there's also a second reason which | is that 'a ~R Force a' is not well-typed; 'a' and 'Force a' have | different kinds.) | | I've imagined that we might be able to just continue representing | Force explicitly in Core, and somehow "compile it away" at STG time, | but I am definitely fuzzy about how this is supposed to work. Perhaps | Force should not actually be a data type, and we should have | 'force# :: a -> Force a' and 'unforce# :: Force a -> a' (the latter | of which compiles to a no-op.) I'm still doubtful. What is the problem you are trying to solve here? How does Force help us? Note that a singleton unboxed tuple (# e #) has the effect of suspending; e.g. f x = (# x+1 #) return immediately, returning a pointer to a thunk for (x+1). I'm not sure if that is relevant. Simon

Excerpts from Simon Peyton Jones's message of 2015-09-07 14:55:09 -0700:
I'm still doubtful. What is the problem you are trying to solve here? How does Force help us?
The problem 'Force' is trying to solve is the fact that Haskell currently has many existing lifted data types, and they all have ~essentially identical unlifted versions. But for a user to write the lifted and unlifted version, they have to copy paste their code or use 'Force'.
Note that a singleton unboxed tuple (# e #) has the effect of suspending; e.g. f x = (# x+1 #) return immediately, returning a pointer to a thunk for (x+1). I'm not sure if that is relevant.
I don't think so? Unboxed tuples take a computation with kind * and represent it in kind #. But 'suspend' takes a computation in kind # and represents in kind *. Edward

| The problem 'Force' is trying to solve is the fact that Haskell | currently has many existing lifted data types, and they all have | ~essentially identical unlifted versions. But for a user to write the | lifted and unlifted version, they have to copy paste their code or use | 'Force'. But (Force [a]) will only be head-strict. You still have to make an essentially-identical version if you want a strict list. Ditto all components of a data structure. Is the gain (of head-strictness) really worth it? Incidentally, on the Unlifted-vs-# discussion, I'm not against making the distinction. I can see advantages in carving out a strict subset of Haskell, which this would help to do. Simon

Hi devs,
Apologies if I'm missing context.
Is it more better that "Unlifted" kind is other name, such as
"BoxedUnlifted", "Bul", "%",... ?
I'm studying Unlifted data types [1].
It's good proposals to optimize code by users.
But I worry that newcomers may be confusing about its kind name.
In my understanding [2], "unlifted" types may include "unboxed, unlifted"
type.
But "unlifted" kind does not include "unboxed, unlifted" types.
[1]: https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes
[2]:
http://takenobu-hs.github.io/downloads/haskell_lazy_evaluation.pdf#page=190
(sorry, large)
I wish you a happy new year.
Regards,
Takenobu
2015-09-04 17:03 GMT+09:00 Edward Z. Yang
Hello friends,
After many discussions and beers at ICFP, I've written up my current best understanding of the unlifted data types proposal:
https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes
Many thanks to Richard, Iavor, Ryan, Simon, Duncan, George, Paul, Edward Kmett, and any others who I may have forgotten for crystallizing this proposal.
Cheers, Edward _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
participants (12)
-
Carter Schonwald
-
Dan Doel
-
Edward Kmett
-
Edward Z. Yang
-
Eric Seidel
-
Jan Stolarek
-
Richard Eisenberg
-
Roman Cheplyaka
-
Ryan Yates
-
Simon Marlow
-
Simon Peyton Jones
-
Takenobu Tani