Unlifted data types

older
Re: [GHC] #11239: GHC HEAD: cannot...

Edward Z. Yang

4 Sep 2015 4 Sep '15

8:03 a.m.

Hello friends, After many discussions and beers at ICFP, I've written up my current best understanding of the unlifted data types proposal: https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes Many thanks to Richard, Iavor, Ryan, Simon, Duncan, George, Paul, Edward Kmett, and any others who I may have forgotten for crystallizing this proposal. Cheers, Edward

Show replies by date

Eric Seidel

4 Sep 4 Sep

3:29 p.m.

You mention NFData in the motivation but then say that !Maybe !Int is not allowed. This leads me to wonder what the semantics of foo :: !Maybe Int -> !Maybe Int foo x = x bar = foo (Just undefined) are. Based on the FAQ it sounds like foo would *not* force the undefined, is that correct? Also, there's a clear connection between these UnliftedTypes and BangPatterns, but as I understand it the ! is essentially a new type constructor. So while foo1 :: !Int -> !Int foo1 x = x and foo2 :: Int -> Int foo2 !x = x have the same runtime behavior, they have different types, so you can't pass a regular Int to foo1. Is that desirable? Eric On Fri, Sep 4, 2015, at 01:03, Edward Z. Yang wrote:

...

Hello friends,

After many discussions and beers at ICFP, I've written up my current best understanding of the unlifted data types proposal:

https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes

Many thanks to Richard, Iavor, Ryan, Simon, Duncan, George, Paul, Edward Kmett, and any others who I may have forgotten for crystallizing this proposal.

Cheers, Edward _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Edward Z. Yang

3:43 p.m.

Excerpts from Eric Seidel's message of 2015-09-04 08:29:59 -0700:

...

You mention NFData in the motivation but then say that !Maybe !Int is not allowed. This leads me to wonder what the semantics of

foo :: !Maybe Int -> !Maybe Int foo x = x

bar = foo (Just undefined)

are. Based on the FAQ it sounds like foo would *not* force the undefined, is that correct?

Yes. So maybe NFData is a *bad* example!

...

Also, there's a clear connection between these UnliftedTypes and BangPatterns, but as I understand it the ! is essentially a new type constructor. So while

foo1 :: !Int -> !Int foo1 x = x

and

foo2 :: Int -> Int foo2 !x = x

have the same runtime behavior, they have different types, so you can't pass a regular Int to foo1. Is that desirable?

Yes. Actually, you have a good point that we'd like to have functions 'force :: Int -> !Int' and 'suspend :: !Int -> Int'. Unfortunately, we can't generate 'Coercible' instances for these types unless Coercible becomes polykinded. Perhaps we can make a new type class, or just magic polymorphic functions. Edward

Edward Z. Yang

3:45 p.m.

Excerpts from Edward Z. Yang's message of 2015-09-04 08:43:48 -0700:

...

Yes. Actually, you have a good point that we'd like to have functions 'force :: Int -> !Int' and 'suspend :: !Int -> Int'. Unfortunately, we can't generate 'Coercible' instances for these types unless Coercible becomes polykinded. Perhaps we can make a new type class, or just magic polymorphic functions.

Michael Greenberg points out on Twitter that suspend must be a special form, just like lambda abstraction. Edward

Simon Peyton Jones

7 Sep 7 Sep

7:41 p.m.

| Michael Greenberg points out on Twitter that suspend must be a special | form, just like lambda abstraction. This isn't reflected on the wiki. Simon | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of Edward | Z. Yang | Sent: 04 September 2015 16:46 | To: Eric Seidel; ghc-devs | Subject: Re: Unlifted data types | | Excerpts from Edward Z. Yang's message of 2015-09-04 08:43:48 -0700: | > Yes. Actually, you have a good point that we'd like to have functions | > 'force :: Int -> !Int' and 'suspend :: !Int -> Int'. Unfortunately, we | > can't generate 'Coercible' instances for these types unless Coercible | becomes | > polykinded. Perhaps we can make a new type class, or just magic | > polymorphic functions. | | Michael Greenberg points out on Twitter that suspend must be a special | form, just like lambda abstraction. | | Edward | _______________________________________________ | ghc-devs mailing list | ghc-devs@haskell.org | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Eric Seidel

4 Sep 4 Sep

4:06 p.m.

Another good example would be foo :: ![Int] -> ![Int] Does this force just the first constructor or the whole spine? My guess would be the latter. On Fri, Sep 4, 2015, at 08:43, Edward Z. Yang wrote:

...

Excerpts from Eric Seidel's message of 2015-09-04 08:29:59 -0700:

...
You mention NFData in the motivation but then say that !Maybe !Int is not allowed. This leads me to wonder what the semantics of

foo :: !Maybe Int -> !Maybe Int foo x = x

bar = foo (Just undefined)

are. Based on the FAQ it sounds like foo would *not* force the undefined, is that correct?

Yes. So maybe NFData is a *bad* example!

...
Also, there's a clear connection between these UnliftedTypes and BangPatterns, but as I understand it the ! is essentially a new type constructor. So while

foo1 :: !Int -> !Int foo1 x = x

and

foo2 :: Int -> Int foo2 !x = x

have the same runtime behavior, they have different types, so you can't pass a regular Int to foo1. Is that desirable?

Yes. Actually, you have a good point that we'd like to have functions 'force :: Int -> !Int' and 'suspend :: !Int -> Int'. Unfortunately, we can't generate 'Coercible' instances for these types unless Coercible becomes polykinded. Perhaps we can make a new type class, or just magic polymorphic functions.

Edward

Edward Z. Yang

6:14 p.m.

Hello Eric, You can't tell; the head not withstanding, `[a]` is still a lazy list, so you would need to look at the function body to see if any extra forcing goes on. `Force` does not induce `seq`ing: it is an obligation for the call-site. (Added it to the FAQ). Edward Excerpts from Eric Seidel's message of 2015-09-04 09:06:15 -0700:

...

Another good example would be

foo :: ![Int] -> ![Int]

Does this force just the first constructor or the whole spine? My guess would be the latter.

On Fri, Sep 4, 2015, at 08:43, Edward Z. Yang wrote:

...
Excerpts from Eric Seidel's message of 2015-09-04 08:29:59 -0700:

...
You mention NFData in the motivation but then say that !Maybe !Int is not allowed. This leads me to wonder what the semantics of

foo :: !Maybe Int -> !Maybe Int foo x = x

bar = foo (Just undefined)

are. Based on the FAQ it sounds like foo would *not* force the undefined, is that correct?

Yes. So maybe NFData is a *bad* example!

...
Also, there's a clear connection between these UnliftedTypes and BangPatterns, but as I understand it the ! is essentially a new type constructor. So while

foo1 :: !Int -> !Int foo1 x = x

and

foo2 :: Int -> Int foo2 !x = x

have the same runtime behavior, they have different types, so you can't pass a regular Int to foo1. Is that desirable?

Yes. Actually, you have a good point that we'd like to have functions 'force :: Int -> !Int' and 'suspend :: !Int -> Int'. Unfortunately, we can't generate 'Coercible' instances for these types unless Coercible becomes polykinded. Perhaps we can make a new type class, or just magic polymorphic functions.

Edward

Dan Doel

4:57 p.m.

All your examples are non-recursive types. So, if I have: data Nat = Zero | Suc Nat what is !Nat? Does it just have the outer-most part unlifted? Is the intention to make the !a in data type declarations first-class, so that when we say: data Nat = Zero | Suc !Nat the !Nat part is now an entity in itself, and it is, for this declaration, the set of naturals, whereas Nat is the flat domain? On Fri, Sep 4, 2015 at 4:03 AM, Edward Z. Yang wrote:

...

Hello friends,

After many discussions and beers at ICFP, I've written up my current best understanding of the unlifted data types proposal:

https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes

Many thanks to Richard, Iavor, Ryan, Simon, Duncan, George, Paul, Edward Kmett, and any others who I may have forgotten for crystallizing this proposal.

Cheers, Edward _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Edward Z. Yang

6:12 p.m.

Excerpts from Dan Doel's message of 2015-09-04 09:57:42 -0700:

...

All your examples are non-recursive types. So, if I have:

data Nat = Zero | Suc Nat

what is !Nat? Does it just have the outer-most part unlifted?

Just the outermost part.

...

Is the intention to make the !a in data type declarations first-class, so that when we say:

data Nat = Zero | Suc !Nat

the !Nat part is now an entity in itself, and it is, for this declaration, the set of naturals, whereas Nat is the flat domain?

No, in fact, there is a semantic difference between this and strict fields (which Paul pointed out to me.) There's now an updated proposal on the Trac which partially solves this problem. Edward

Dan Doel

8:09 p.m.

Okay. That answers another question I had, which was whether MutVar# and such would go in the new kind. So now we have partial, extended natural numbers: data PNat :: * where PZero :: PNat PSuc :: PNat -> PNat A flat domain of natural numbers: data FNat :: * where FZero :: FNat FSuc :: !FNat -> FNat And two sets of natural numbers: Force FNat :: Unlifted data UNat :: Unlifted where UZero :: UNat USuc :: UNat -> UNat And really perhaps two flat domains (and three sets), if you use Force instead of !, which would differ on who ensures the evaluation. That's kind of a lot of incompatible definitions of essentially the same thing (PNat being the significantly different thing). I was kind of more enthused about first class !a. For instance, if you think about the opening quote by Bob Harper, he's basically wrong. The flat domain FNat is the natural numbers (existing in an overall lazy language), and has the reasoning properties he wants to teach students about with very little complication. It'd be satisfying to recognize that unlifting the outer-most part gets you exactly there, with whatever performance characteristics that implies. Or to get rid of ! and use Unlifted definitions instead. Maybe backwards compatibility mandates the duplication, but it'd be nice if some synthesis could be reached. ---- It'd also be good to think about/specify how this is going to interact with unpacked/unboxed sums. On Fri, Sep 4, 2015 at 2:12 PM, Edward Z. Yang wrote:

...

Excerpts from Dan Doel's message of 2015-09-04 09:57:42 -0700:

...
All your examples are non-recursive types. So, if I have:

data Nat = Zero | Suc Nat

what is !Nat? Does it just have the outer-most part unlifted?

Just the outermost part.

...
Is the intention to make the !a in data type declarations first-class, so that when we say:

data Nat = Zero | Suc !Nat

the !Nat part is now an entity in itself, and it is, for this declaration, the set of naturals, whereas Nat is the flat domain?

No, in fact, there is a semantic difference between this and strict fields (which Paul pointed out to me.) There's now an updated proposal on the Trac which partially solves this problem.

Edward

Edward Z. Yang

9:23 p.m.

Excerpts from Dan Doel's message of 2015-09-04 13:09:26 -0700:

...

Okay. That answers another question I had, which was whether MutVar# and such would go in the new kind.

So now we have partial, extended natural numbers:

data PNat :: * where PZero :: PNat PSuc :: PNat -> PNat

A flat domain of natural numbers:

data FNat :: * where FZero :: FNat FSuc :: !FNat -> FNat

And two sets of natural numbers:

Force FNat :: Unlifted

data UNat :: Unlifted where UZero :: UNat USuc :: UNat -> UNat

And really perhaps two flat domains (and three sets), if you use Force instead of !, which would differ on who ensures the evaluation. That's kind of a lot of incompatible definitions of essentially the same thing (PNat being the significantly different thing).

I was kind of more enthused about first class !a. For instance, if you think about the opening quote by Bob Harper, he's basically wrong. The flat domain FNat is the natural numbers (existing in an overall lazy language), and has the reasoning properties he wants to teach students about with very little complication. It'd be satisfying to recognize that unlifting the outer-most part gets you exactly there, with whatever performance characteristics that implies. Or to get rid of ! and use Unlifted definitions instead.

Maybe backwards compatibility mandates the duplication, but it'd be nice if some synthesis could be reached.

I would certainly agree that in terms of the data that is representable, there is not much difference; but there is a lot of difference for the client between Force and a strict field. If I write: let x = undefined y = Strict x in True No error occurs with: data Strict = Strict !a But an error occurs with: data Strict = Strict (Force a) One possibility for how to reconcile the difference for BC is to posit that there are just two different constructors: Strict :: a -> Strict a Strict! :: Force a -> Strict a But this kind of special handling is a bit bothersome. Consider: data SPair a b = SPair (!a, !b) The constructor has what type? Probably SPair :: (Force a, Force b) -> SPair a and not: SPair :: (a, b) -> SPair a

...

It'd also be good to think about/specify how this is going to interact with unpacked/unboxed sums.

I don't think it interacts any differently than with unpacked/unboxed products today. Edward

Roman Cheplyaka

9:41 p.m.

On 05/09/15 00:23, Edward Z. Yang wrote:

...

I would certainly agree that in terms of the data that is representable, there is not much difference; but there is a lot of difference for the client between Force and a strict field. If I write:

let x = undefined y = Strict x in True

No error occurs with:

data Strict = Strict !a

But an error occurs with:

data Strict = Strict (Force a)

At what point does the error occur here? When evaluating True? What about the following two expressions? const False (let x = undefined y = Strict x in True) let x = undefined y = const False (Strict x) in True Roman

Roman Cheplyaka

9:43 p.m.

On 05/09/15 00:41, Roman Cheplyaka wrote:

...

On 05/09/15 00:23, Edward Z. Yang wrote:

...
I would certainly agree that in terms of the data that is representable, there is not much difference; but there is a lot of difference for the client between Force and a strict field. If I write:

let x = undefined y = Strict x in True

No error occurs with:

data Strict = Strict !a

But an error occurs with:

data Strict = Strict (Force a)

At what point does the error occur here? When evaluating True?

What about the following two expressions?

const False (let x = undefined y = Strict x in True)

let x = undefined y = const False (Strict x) in True

On second though, the second one shouldn't even compile because of the kind error, right?

Dan Doel

9:56 p.m.

If x :: t, and t :: Unlifted, then let x = e in e' has a value that depends on evaluating e regardless of its use in e' (or other things in the let, if they exist). It would be like writing let !x = e in e' today. -- Dan On Fri, Sep 4, 2015 at 5:41 PM, Roman Cheplyaka wrote:

...

On 05/09/15 00:23, Edward Z. Yang wrote:

...
I would certainly agree that in terms of the data that is representable, there is not much difference; but there is a lot of difference for the client between Force and a strict field. If I write:

let x = undefined y = Strict x in True

No error occurs with:

data Strict = Strict !a

But an error occurs with:

data Strict = Strict (Force a)

At what point does the error occur here? When evaluating True?

What about the following two expressions?

const False (let x = undefined y = Strict x in True)

let x = undefined y = const False (Strict x) in True

Roman

_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Dan Doel

9:48 p.m.

On Fri, Sep 4, 2015 at 5:23 PM, Edward Z. Yang wrote:

...

But this kind of special handling is a bit bothersome. Consider:

data SPair a b = SPair (!a, !b)

The constructor has what type? Probably

SPair :: (Force a, Force b) -> SPair a

and not:

SPair :: (a, b) -> SPair a

I don't really understand what this example is showing. I don't think SPair is a legal declaration in any scenario. - In current Haskell it's illegal; you can only put ! directly on fields - If !a :: Unlifted, then (,) (!a) is a kind error (same with Force a)

...

I don't think it interacts any differently than with unpacked/unboxed products today.

I meant like: If T :: Unlifted, then am I allowed to do: data U = MkU {-# UNPACK #-} T ... and what are its semantics? If T is a sum, presumably it's related to the unpacked sums proposal from a couple days ago. Does stuff from this proposal make that proposal simpler? Should they reference things in one another? Will there be optimizations that turn: data E a b :: Unlifted where L :: a -> E a b R :: b -> E a b into |# a , b #| (or whatever the agreed upon syntax is)? Presumably yes. -- Dan

Edward Z. Yang

5 Sep 5 Sep

3:38 a.m.

Excerpts from Dan Doel's message of 2015-09-04 14:48:49 -0700:

...

I don't really understand what this example is showing. I don't think SPair is a legal declaration in any scenario.

- In current Haskell it's illegal; you can only put ! directly on fields - If !a :: Unlifted, then (,) (!a) is a kind error (same with Force a)

This is true. Perhaps it should be possible to define data types which are levity polymorphic, so SPair can kind as * -> * -> *, Unlifted -> Unlifted -> *, etc.

...

...
I don't think it interacts any differently than with unpacked/unboxed products today.

I meant like:

If T :: Unlifted, then am I allowed to do:

data U = MkU {-# UNPACK #-} T ...

and what are its semantics? If T is a sum, presumably it's related to the unpacked sums proposal from a couple days ago. Does stuff from this proposal make that proposal simpler? Should they reference things in one another?

Ah, this is a good question. I think you can just directly UNPACK unlifted types, without a strict bang pattern. I've added a note to the proposal.

...

Will there be optimizations that turn:

data E a b :: Unlifted where L :: a -> E a b R :: b -> E a b

into |# a , b #| (or whatever the agreed upon syntax is)? Presumably yes.

Yes, it should follow the same rules as https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes#Unpacking Edward

Dan Doel

1:21 a.m.

Here are some additional thoughts. If we examine an analogue of some of your examples: data MutVar a = MV (MutVar# RealWorld a) main = do let mv# = undefined let mv = MV mv# putStrLn "Okay." The above is illegal. Instead we _must_ write: let !mv# = undefined which signals that evaluation is occurring. So it is impossible to accidentally go from: main = do let mv = MV undefined putStrLn "Okay." which prints "Okay.", to something that throws an exception, without having a pretty good indication that you're doing so. I would guess this is desirable, so perhaps it should be mandated for Unlifted as well. ---- However, the above point confuses me with respect to another example. The proposal says that: data Id :: * -> Unlifted where Id :: a -> Id a could/should be compiled with no overhead over `a`, like a newtype. However, if Unlifted things have operational semantics like #, what does the following do: let x :: Id a !x = Id undefined The ! should evaluate to the Id constructor, but we're not representing it, so it actually doesn't evaluate anything? But: let x :: Id a !x = undefined throws an exception? Whereas for newtypes, both throw exceptions with a !x definition, or don't with an x definition? Is it actually possible to make Id behave this way without any representational overhead? I'm a little skeptical. I think that only Force (and Box) might be able to have no representational overhead. -- Dan On Fri, Sep 4, 2015 at 5:23 PM, Edward Z. Yang wrote:

...

Excerpts from Dan Doel's message of 2015-09-04 13:09:26 -0700:

...
Okay. That answers another question I had, which was whether MutVar# and such would go in the new kind.

So now we have partial, extended natural numbers:

data PNat :: * where PZero :: PNat PSuc :: PNat -> PNat

A flat domain of natural numbers:

data FNat :: * where FZero :: FNat FSuc :: !FNat -> FNat

And two sets of natural numbers:

Force FNat :: Unlifted

data UNat :: Unlifted where UZero :: UNat USuc :: UNat -> UNat

And really perhaps two flat domains (and three sets), if you use Force instead of !, which would differ on who ensures the evaluation. That's kind of a lot of incompatible definitions of essentially the same thing (PNat being the significantly different thing).

I was kind of more enthused about first class !a. For instance, if you think about the opening quote by Bob Harper, he's basically wrong. The flat domain FNat is the natural numbers (existing in an overall lazy language), and has the reasoning properties he wants to teach students about with very little complication. It'd be satisfying to recognize that unlifting the outer-most part gets you exactly there, with whatever performance characteristics that implies. Or to get rid of ! and use Unlifted definitions instead.

Maybe backwards compatibility mandates the duplication, but it'd be nice if some synthesis could be reached.

I would certainly agree that in terms of the data that is representable, there is not much difference; but there is a lot of difference for the client between Force and a strict field. If I write:

let x = undefined y = Strict x in True

No error occurs with:

data Strict = Strict !a

But an error occurs with:

data Strict = Strict (Force a)

One possibility for how to reconcile the difference for BC is to posit that there are just two different constructors:

Strict :: a -> Strict a Strict! :: Force a -> Strict a

But this kind of special handling is a bit bothersome. Consider:

data SPair a b = SPair (!a, !b)

The constructor has what type? Probably

SPair :: (Force a, Force b) -> SPair a

and not:

SPair :: (a, b) -> SPair a

...
It'd also be good to think about/specify how this is going to interact with unpacked/unboxed sums.

I don't think it interacts any differently than with unpacked/unboxed products today.

Edward

Edward Z. Yang

7:06 a.m.

Excerpts from Dan Doel's message of 2015-09-04 18:21:29 -0700:

...

Here are some additional thoughts.

If we examine an analogue of some of your examples:

data MutVar a = MV (MutVar# RealWorld a)

main = do let mv# = undefined let mv = MV mv# putStrLn "Okay."

The above is illegal. Instead we _must_ write:

This doesn't typecheck, but for a different reason: undefined :: a where a :: *, so you can't match up the kinds. error is actually extremely special in this case: it lives in OpenKind and matches both * and #. But let's suppose that we s/undefined/error "foo"/...

...

let !mv# = undefined

which signals that evaluation is occurring.

Also not true. Because error "foo" is inferred to have kind #, the bang pattern happens implicitly.

...

So it is impossible to accidentally go from:

main = do let mv = MV undefined putStrLn "Okay."

which prints "Okay.", to something that throws an exception, without having a pretty good indication that you're doing so. I would guess this is desirable, so perhaps it should be mandated for Unlifted as well.

Nope, if you just float the error call out of MV, you will go from "Okay." to an exception. Notice that *data constructors* are what are used to induce suspension. This is why we don't have a 'suspend' special form; instead, 'Box' is used directly.

...

However, the above point confuses me with respect to another example. The proposal says that:

data Id :: * -> Unlifted where Id :: a -> Id a

could/should be compiled with no overhead over `a`, like a newtype. However, if Unlifted things have operational semantics like #, what does the following do:

let x :: Id a !x = Id undefined

The ! should evaluate to the Id constructor, but we're not representing it, so it actually doesn't evaluate anything? But:

That's correct. Id is a box containing a lifted value. The box is unlifted, but the inner value can be lifted.

...

let x :: Id a !x = undefined

throws an exception?

Yes, exactly.

...

Whereas for newtypes, both throw exceptions with a !x definition, or don't with an x definition?

Also correct. They key thing is to distinguish error in kind * and error in kind #. You can make a table: | Id (error "foo") | error "foo" | ---------------------+-----------------------+-------------------+ newtype Id :: * -> * | error "foo" :: * | error "foo" :: * | data Id :: * -> # | Id (error "foo" :: *) | error "foo" :: # |

...

Is it actually possible to make Id behave this way without any representational overhead?

Yes. The reason is that an error "foo" :: # *immediately fails* (rather than attempt to allocate an Id). So the outer level of error doesn't ever need to be represented on the heap, so we can just represent the inner liftedness. Here's another way of looking at it: error in kind # is not a bottom at all. It's just a way of bailing immediately. HOWEVER...

...

I'm a little skeptical. I think that only Force (and Box) might be able to have no representational overhead.

It seems like it might be easier to explain if just Force and Box get optimized, and we don't bother with others; I only really care about those two operators being optimized. Edward

Dan Doel

5:35 p.m.

On Sat, Sep 5, 2015 at 3:06 AM, Edward Z. Yang wrote:

...

...
If we examine an analogue of some of your examples:

data MutVar a = MV (MutVar# RealWorld a)

main = do let mv# = undefined let mv = MV mv# putStrLn "Okay."

The above is illegal. Instead we _must_ write:

This doesn't typecheck, but for a different reason: undefined :: a where a :: *, so you can't match up the kinds.

error is actually extremely special in this case: it lives in OpenKind and matches both * and #. But let's suppose that we s/undefined/error "foo"/...

...
let !mv# = undefined

which signals that evaluation is occurring.

Also not true. Because error "foo" is inferred to have kind #, the bang pattern happens implicitly.

I tried with `error` first, and it worked exactly the way I described. But I guess it's a type inference weirdness. If I annotate mv# with MutVar# it will work, whereas otherwise it will be inferred that mv# :: a where a :: *, instead of #. Whereas !x is a pattern which requires monomorphism of x, and so it figures out mv# :: MutVar# .... Kind of an odd corner case where breaking cycles causes things _not_ to type check, due to open kinds not being first class. I thought I remembered that at some point it was decided that `let` bindings of unboxed things should be required to have bangs on the bindings, to indicate the evaluation order. Maybe I'm thinking of something else (was it that it was originally required and we got rid of it?).

...

Nope, if you just float the error call out of MV, you will go from "Okay." to an exception. Notice that *data constructors* are what are used to induce suspension. This is why we don't have a 'suspend' special form; instead, 'Box' is used directly.

I know that it's the floating that makes a difference, not the bang pattern. The point would be to make the syntax require the bang pattern to give a visual indication of when it happens, and make it illegal to look like you're doing a normal let that doesn't change the value (although having it actually be a bang pattern would be bad, because it'd restrict polymorphism of the definition). Also, the constructor isn't exactly relevant, so much as whether the unlifted error occurs inside the definition of a lifted thing. For instance, we can go from: let mv = MutVar undefined to: let mv = let mv# :: MutVar# RealWorld a ; mv# = undefined in MutVar mv# and the result is the same, because it is the definition of mv that is lazy. Constructors in complex expressions---and all subexpressions for that matter---just get compiled this way. E.G. let f :: MutVar# RealWorld a -> MutVar a f mv# = f mv# in flip const (f undefined) $ putStrLn "okay" No constructors involved, but no error.

...

...
Is it actually possible to make Id behave this way without any representational overhead?

Yes. The reason is that an error "foo" :: # *immediately fails* (rather than attempt to allocate an Id). So the outer level of error doesn't ever need to be represented on the heap, so we can just represent the inner liftedness.

Okay. So, there isn't representational overhead, but there is overhead, where you call a function or something (which will just return its argument), whereas newtype constructors end up not having any cost whatsoever? -- Dan

Dan Doel

6 Sep 6 Sep

8:56 p.m.

On Sat, Sep 5, 2015 at 1:35 PM, Dan Doel wrote:

...

Also, the constructor isn't exactly relevant, so much as whether the unlifted error occurs inside the definition of a lifted thing.

So, in light of this, `Box` is not necessary to define `suspend`. We can simply write: suspend :: Force a -> a suspend (Force x) = x and the fact that `a` has kind * means that `suspend undefined` only throws an exception if you inspect it. `Box` as currently defined (not the previous GADT definition) is novel in that it allows you to suspend unlifted types that weren't derived from `Force`. And it would probably be useful to have coercions between `Box (Force a)` and `a`, and `Force (Box u)` and `u`. But (I think) it is not necessary for mediating between `Force a` and `a`. -- Dan

Edward Z. Yang

7 Sep 7 Sep

5:57 p.m.

Yes, I think you are right. I've restructured the spec so that 'Box' is an optional extension. Excerpts from Dan Doel's message of 2015-09-06 13:56:35 -0700:

...

On Sat, Sep 5, 2015 at 1:35 PM, Dan Doel wrote:

...
Also, the constructor isn't exactly relevant, so much as whether the unlifted error occurs inside the definition of a lifted thing.

So, in light of this, `Box` is not necessary to define `suspend`. We can simply write:

suspend :: Force a -> a suspend (Force x) = x

and the fact that `a` has kind * means that `suspend undefined` only throws an exception if you inspect it.

`Box` as currently defined (not the previous GADT definition) is novel in that it allows you to suspend unlifted types that weren't derived from `Force`. And it would probably be useful to have coercions between `Box (Force a)` and `a`, and `Force (Box u)` and `u`. But (I think) it is not necessary for mediating between `Force a` and `a`.

-- Dan

Edward Z. Yang

6:13 p.m.

Excerpts from Dan Doel's message of 2015-09-05 10:35:44 -0700:

...

I tried with `error` first, and it worked exactly the way I described.

But I guess it's a type inference weirdness. If I annotate mv# with MutVar# it will work, whereas otherwise it will be inferred that mv# :: a where a :: *, instead of #. Whereas !x is a pattern which requires monomorphism of x, and so it figures out mv# :: MutVar# .... Kind of an odd corner case where breaking cycles causes things _not_ to type check, due to open kinds not being first class.

I thought I remembered that at some point it was decided that `let` bindings of unboxed things should be required to have bangs on the bindings, to indicate the evaluation order. Maybe I'm thinking of something else (was it that it was originally required and we got rid of it?).

Ah yes, I added an explicit type signature, which is why I didn't see your problem. As for requiring bang, I think probably you are thinking of: commit 831a35dd00faff195cf938659c2dd736192b865f Author: Ian Lynagh Date: Fri Apr 24 12:47:54 2009 +0000 Require a bang pattern when unlifted types are where/let bound; #3182 For now we only get a warning, rather than an error, because the alex and happy templates don't follow the new rules yet. But Simon eventually made it be less chatty: commit 67157c5c25c8044b54419470b5e8cc677be060c3 Author: simonpj@microsoft.com <unknown> Date: Tue Nov 16 17:18:43 2010 +0000 Warn a bit less often about unlifted bindings. Warn when (a) a pattern bindings binds unlifted values (b) it has no top-level bang (c) the RHS has a *lifted* type Clause (c) is new, argued for by Simon M Eg x# = 4# + 4# -- No warning (# a,b #) = blah -- No warning I# x = blah -- Warning Since in our cases the RHS is not lifted, no warning occurs.

...

...
Nope, if you just float the error call out of MV, you will go from "Okay." to an exception. Notice that *data constructors* are what are used to induce suspension. This is why we don't have a 'suspend' special form; instead, 'Box' is used directly.

I know that it's the floating that makes a difference, not the bang pattern. The point would be to make the syntax require the bang pattern to give a visual indication of when it happens, and make it illegal to look like you're doing a normal let that doesn't change the value (although having it actually be a bang pattern would be bad, because it'd restrict polymorphism of the definition).

I think this is a reasonable thing to ask for. I also think, with the commit set above, this very discussion happened in 2010, and was resolved in favor of not warning in this case for unboxed types. Maybe the situation is different with unlifted data types; it's hard for me to tell.

...

Also, the constructor isn't exactly relevant, so much as whether the unlifted error occurs inside the definition of a lifted thing. For instance, we can go from:

let mv = MutVar undefined

to:

let mv = let mv# :: MutVar# RealWorld a ; mv# = undefined in MutVar mv#

and the result is the same, because it is the definition of mv that is lazy. Constructors in complex expressions---and all subexpressions for that matter---just get compiled this way. E.G.

let f :: MutVar# RealWorld a -> MutVar a f mv# = f mv# in flip const (f undefined) $ putStrLn "okay"

No constructors involved, but no error.

Yes, you are right. I incorrectly surmised that a suspension function would have to be special form, but in fact, it does not need to be.

...

Okay. So, there isn't representational overhead, but there is overhead, where you call a function or something (which will just return its argument), whereas newtype constructors end up not having any cost whatsoever?

You might hope that it can get inlined away. But yes, a coercion would be best. Edward

Simon Peyton Jones

8 p.m.

| After many discussions and beers at ICFP, I've written up my current | best understanding of the unlifted data types proposal: | | https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes Too many beers! I like some, but not all, of this. There are several distinct things being mixed up. (1) First, a proposal to allow a data type to be declared to be unlifted. On its own, this is a pretty simple proposal: * Data types are always boxed, and so would unlifted data types * But an unlifted data type does not include bottom, and operationally is always represented by a pointer to the value. Just like Array#. * ALL the evaluation rules are IDENTICAL to those for other unlifted types such as Int# and Array#. Lets are strict, and cannot be recursive, function arguments are evaluated before the call. Literally nothing new here. * The code generator can generate more efficient case expressions, because the pointer always points to a value, never to a thunk or (I believe) an indirection. I think there are some special cases in GC and the RTS to ensure that this invariant holds. And that's it. Syntax: I'd suggest something more prominent than an Unlifted return kind, such as data unlifted T a = L a | R a but I would not die for this. I would really like to see this articulated as a stand-alone proposal. It makes sense by itself, and is really pretty simple. (2) Second, we cannot expect levity polymorphism. Consider map f (x:xs) = f x : map f xs Is the (f x) a thunk or is it evaluated strictly? Unless you are going to clone the code for map (which levity polymorphism is there to avoid), we can't answer "it depends on the type of (f x)". So, no, I think levity polymorphism is out. So I vote against splitting # into two: plain will do just fine. (3) Third, the stuff about Force and suspend. Provided you do no more than write library code that uses the above new features I'm fine. But there seems to be lots of stuff that dances around the hope that (Force a) is represented the same way as 'a'. I don't' know how to make this fly. Is there a coercion in FC? If so then (a ~R Force a). And that seems very doubtful since we must do some evaluation. I got lost in all the traffic about it. (4) Fourth, you don't mention a related suggestion, namely to allow newtype T = MkT Int# with T getting kind #. I see no difficulty here. We do have (T ~R Int#). It's just a useful way of wrapping a newtype around an unlifted type. My suggestion: let's nail down (1), including a boxed version of Force an suspend as plain library code, if you want, and perhaps (4); and only THEN tackle the trickiness of unboxing Force. Simon | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of Edward | Z. Yang | Sent: 04 September 2015 09:04 | To: ghc-devs | Subject: Unlifted data types | | Hello friends, | | After many discussions and beers at ICFP, I've written up my current | best understanding of the unlifted data types proposal: | | https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes | | Many thanks to Richard, Iavor, Ryan, Simon, Duncan, George, Paul, | Edward Kmett, and any others who I may have forgotten for crystallizing | this proposal. | | Cheers, | Edward | _______________________________________________ | ghc-devs mailing list | ghc-devs@haskell.org | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Dan Doel

8:19 p.m.

On Mon, Sep 7, 2015 at 4:00 PM, Simon Peyton Jones wrote:

...

(2) Second, we cannot expect levity polymorphism. Consider map f (x:xs) = f x : map f xs Is the (f x) a thunk or is it evaluated strictly? Unless you are going to clone the code for map (which levity polymorphism is there to avoid), we can't answer "it depends on the type of (f x)". So, no, I think levity polymorphism is out.

So I vote against splitting # into two: plain will do just fine.

I don't understand how that last bit follows from the previous stuff (or, I don't understand the sentence). Splitting # into two kinds is useful even if functions can't be levity polymorphic. # contains a bunch of types that aren't represented uniformly. Int# might be 32 bits while Double# is 64, etc. But Unlifted would contain only types that are uniformly represented as pointers, so you could write functions that are polymorphic over types of kind Unlifted. This is not true for Unboxed/# (unless we implement C++ style polymorphism-as-code-generation). ---- Also, with regard to the previous mail, it's not true that `suspend` has to be a special form. All expressions with types of kind * are 'special forms' in the necessary sense. -- Dan

Simon Peyton Jones

9:37 p.m.

| Splitting # into two kinds is useful even if functions can't be levity | polymorphic. # contains a bunch of types that aren't represented | uniformly. Int# might be 32 bits while Double# is 64, etc. But | Unlifted would contain only types that are uniformly represented as | pointers, so you could write functions that are polymorphic over types | of kind Unlifted. Yes, I agree that's true, provided they are *not* also polymorphic over things of kind *. But it's an orthogonal proposal. What you say is already true of Array# and IORef#. Perhaps there are functions that are usefully polymorphic over boxed-but-unlifted things. But our users have not been crying out for this polymorphism despite the existence of a menagerie of existing such types, including Array# and IORef# Let's tackle things one at a time, with separate proposals and separate motivation. Simon | C++ style polymorphism-as-code-generation). | | ---- | | Also, with regard to the previous mail, it's not true that `suspend` | has to be a special form. All expressions with types of kind * are | 'special forms' in the necessary sense. | | -- Dan

Dan Doel

11:09 p.m.

On Mon, Sep 7, 2015 at 5:37 PM, Simon Peyton Jones wrote:

...

But it's an orthogonal proposal. What you say is already true of Array# and IORef#. Perhaps there are functions that are usefully polymorphic over boxed-but-unlifted things. But our users have not been crying out for this polymorphism despite the existence of a menagerie of existing such types, including Array# and IORef#

Well, evidently people over in the ArrayArray thread want it, for one. But also, if general unlifted types get accepted, there are many possible uses. For instance, people working with concurrency have to worry about work being done in the correct thread, and there are functions on MVars and whatnot that ensure that threads don't simply pass thunks between each other. But, if you have unlifted types, then you can have: data UMVar (a :: Unlifted) and then the type rules out the possibility of passing thunks through a reference (at least at the top level). But this requires polymorphism to avoid having to create a separate type for each unlifted type. This is also a use case of `Force`, since it is likely that we want to put ordinary data types in the MVars, just ensure that we aren't passing thunks with delayed work. ---- I'm kind of down on being polymorphic over choice of evaluation order, as well. At least without any further motivation. ---- Also, I'd still like to synthesize some of the redundancies introduced by the proposal. Perhaps it could be done by making `Force` a more primitive building block than !. I.E. data Nat = Zero | Suc !Nat can be, under this proposal, considered sugar for something like: data Nat = Zero | Suc_INTERNAL# (Force Nat) pattern Suc x = Suc_INTERNAL# (Force x) and all stipulations about what you can UNPACK are actually about Unlifted fields, rather than ! fields (which just inherit them from Force). That still leaves `Force LiftedSum` vs. `UnliftedSum`, though. And to be honest, I'm not sure we need arbitrary data types in Unlifted; Force (which would be primitive) might be enough. -- Dan

Simon Peyton Jones

8 Sep 8 Sep

7:52 a.m.

| And to | be honest, I'm not sure we need arbitrary data types in Unlifted; | Force (which would be primitive) might be enough. That's an interesting thought. But presumably you'd have to use 'suspend' (a terrible name) a lot: type StrictList a = Force (StrictList' a) data StrictList' a = Nil | Cons !a (StrictList a) mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f xs = mapStrict' f (suspend xs) mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b mapStrict' f Nil = Nil mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs) That doesn't look terribly convenient. | ensure that threads don't simply | pass thunks between each other. But, if you have unlifted types, then | you can have: | | data UMVar (a :: Unlifted) | | and then the type rules out the possibility of passing thunks through | a reference (at least at the top level). Really? Presumably UMVar is a new primitive? With a family of operations like MVar? If so can't we just define newtype UMVar a = UMV (MVar a) putUMVar :: UMVar a -> a -> IO () putUMVar (UMVar v) x = x `seq` putMVar v x I don't see Force helping here. Simon

Richard Eisenberg

1:45 p.m.

I have put up an alternate set of proposals on https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes These sidestep around `Force` and `suspend` but probably have other problems. They make heavy use of levity polymorphism. Back story: this all was developed in a late-evening Haskell Symposium session that took place in the hotel bar. It seems Edward and I walked away with quite different understandings of what had taken place. I've written up my understanding. Most likely, the Right Idea is a combination of this all! See what you think. Thanks! Richard On Sep 8, 2015, at 3:52 AM, Simon Peyton Jones wrote:

...

| And to | be honest, I'm not sure we need arbitrary data types in Unlifted; | Force (which would be primitive) might be enough.

That's an interesting thought. But presumably you'd have to use 'suspend' (a terrible name) a lot:

type StrictList a = Force (StrictList' a) data StrictList' a = Nil | Cons !a (StrictList a)

mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f xs = mapStrict' f (suspend xs)

mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b mapStrict' f Nil = Nil mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs)

That doesn't look terribly convenient.

| ensure that threads don't simply | pass thunks between each other. But, if you have unlifted types, then | you can have: | | data UMVar (a :: Unlifted) | | and then the type rules out the possibility of passing thunks through | a reference (at least at the top level).

Really? Presumably UMVar is a new primitive? With a family of operations like MVar? If so can't we just define newtype UMVar a = UMV (MVar a) putUMVar :: UMVar a -> a -> IO () putUMVar (UMVar v) x = x `seq` putMVar v x

I don't see Force helping here.

Simon _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Jan Stolarek

2:15 p.m.

I think the wiki page is imprecise when it says:

...

data unlifted UBool = UTrue | UFalse

Intuitively, if you have x :: UBool in scope, you are guaranteed to have UTrue or UFalse, and not bottom.

But I still can say: foo :: UBool foo = foo and now foo contains bottom. I know that any attempt to use foo will lead to its immediate evaluation, but that is not exactly the same as "not containing a bottom". Or am I missing something here? Janek Dnia wtorek, 8 września 2015, Richard Eisenberg napisał:

...

I have put up an alternate set of proposals on

https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes

These sidestep around `Force` and `suspend` but probably have other problems. They make heavy use of levity polymorphism.

Back story: this all was developed in a late-evening Haskell Symposium session that took place in the hotel bar. It seems Edward and I walked away with quite different understandings of what had taken place. I've written up my understanding. Most likely, the Right Idea is a combination of this all!

See what you think.

Thanks! Richard

On Sep 8, 2015, at 3:52 AM, Simon Peyton Jones wrote:

...
| And to | be honest, I'm not sure we need arbitrary data types in Unlifted; | Force (which would be primitive) might be enough.

That's an interesting thought. But presumably you'd have to use 'suspend' (a terrible name) a lot:

type StrictList a = Force (StrictList' a) data StrictList' a = Nil | Cons !a (StrictList a)

mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f xs = mapStrict' f (suspend xs)

mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b mapStrict' f Nil = Nil mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs)

That doesn't look terribly convenient.

| ensure that threads don't simply | pass thunks between each other. But, if you have unlifted types, then | you can have: | | data UMVar (a :: Unlifted) | | and then the type rules out the possibility of passing thunks through | a reference (at least at the top level).

Really? Presumably UMVar is a new primitive? With a family of operations like MVar? If so can't we just define newtype UMVar a = UMV (MVar a) putUMVar :: UMVar a -> a -> IO () putUMVar (UMVar v) x = x `seq` putMVar v x

I don't see Force helping here.

Simon _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Richard Eisenberg

2:56 p.m.

On Sep 8, 2015, at 10:15 AM, Jan Stolarek wrote:

...

But I still can say:

foo :: UBool foo = foo

... Or am I missing something here?

I'm afraid you are. Top-level variables may not have an unlifted type, for exactly this reason. If you were to do this on a local let, your program would loop when it hits the let, so there's no problem there. Richard

Jan Stolarek

3:26 p.m.

...

Top-level variables may not have an unlifted type Ah, that makes much more sense now. Thanks.

Janek

Dan Doel

9 Sep 9 Sep

2:57 a.m.

I would say, by the way, that your question still makes some sense. When discussing strict evaluation, one can think of _values_ of a type and _expressions_ of a type. The (denotational) semantics of values would be unlifted, while expressions are lifted. And functions take values to expressions. So: f :: Int# -> Int# f x = f x is a valid definition, even though the result of the function is bottom, but has type Int#. And you are allowed: let i :: Int# i = error "whoa" in ... just not the same definition at the top level, since it'd be annoying for GHC to have to worry about evaluating all possible strict errors in every reachable module eagerly on program start up (I assume that's the/a reason). That may not be the only way to think about it, but it's how I tend to. -- Dan On Tue, Sep 8, 2015 at 11:26 AM, Jan Stolarek wrote:

...

...
Top-level variables may not have an unlifted type Ah, that makes much more sense now. Thanks.

Janek _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Simon Peyton Jones

8 Sep 8 Sep

2:58 p.m.

| > data unlifted UBool = UTrue | UFalse | > | > Intuitively, if you have x :: UBool in scope, you are guaranteed to | > have UTrue or UFalse, and not bottom. | | But I still can say: | | foo :: UBool | foo = foo | | and now foo contains bottom. You definitely CANNOT have a top-level declaration for a value of an unlifted type, any more than you can have for an Int# or unboxed tuple today. That should resolve your question. Simon

Simon Peyton Jones

9 Sep 9 Sep

12:28 p.m.

I like your suggestion! I think it'd be better to have TYPE :: TypeShape -> * data TypeShape = Unboxed | Boxed Levity data Levity = Lifted | Unlifted Now the awkward "unboxed/lifted" combination doesn't arise. Now, 'error' is "TypeShape-polymorphic": error :: forall (ts :: TypeShape) (a :: TYPE ts). String -> a What functions (if any) are "Levity-polymorphic". That is, they have identical code regardless of whether their (boxed) type args are lifted or unlifted? Answer: data constructors. As Richard says Cons :: forall (v1::Levity) (v2::Levity) (a::TYPE (Boxed v1)). a -> List v1 v2 a -> List v1 v2 a Why can it be levity-polymorphic? Because data constructors guarantee to construct no intermediate values (which they would be unsure whether or not to evaluate). Typically, though, functions over lists would not be levity-polymorphic, for reasons previously discussed. The awkward spot is the runtime system. Currently it goes to some lengths to ensure that it never introduces an indirection for a boxed-but-unlifted type. Simon Marlow would know exactly where. So I suspect that we WOULD need two versions of each (levity-polymorphic) data constructor, alas. And we'd need to know which version to use at every call site, which means the code would need to be levity-monomorphic. So we really would get very little levity polymorphism ineed. I think. All this fits nicely with Dan Doel's point about making ! a first class type constructor. Simon | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of | Richard Eisenberg | Sent: 08 September 2015 14:46 | To: ghc-devs | Subject: Re: Unlifted data types | | I have put up an alternate set of proposals on | | https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes | | These sidestep around `Force` and `suspend` but probably have other | problems. They make heavy use of levity polymorphism. | | Back story: this all was developed in a late-evening Haskell Symposium | session that took place in the hotel bar. It seems Edward and I walked | away with quite different understandings of what had taken place. I've | written up my understanding. Most likely, the Right Idea is a | combination of this all! | | See what you think. | | Thanks! | Richard | | On Sep 8, 2015, at 3:52 AM, Simon Peyton Jones | wrote: | | > | And to | > | be honest, I'm not sure we need arbitrary data types in Unlifted; | > | Force (which would be primitive) might be enough. | > | > That's an interesting thought. But presumably you'd have to use | 'suspend' (a terrible name) a lot: | > | > type StrictList a = Force (StrictList' a) data StrictList' a = Nil | | > Cons !a (StrictList a) | > | > mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f xs | = | > mapStrict' f (suspend xs) | > | > mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b mapStrict' | f | > Nil = Nil mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs) | > | > | > That doesn't look terribly convenient. | > | > | ensure that threads don't simply | > | pass thunks between each other. But, if you have unlifted types, | > | then you can have: | > | | > | data UMVar (a :: Unlifted) | > | | > | and then the type rules out the possibility of passing thunks | > | through a reference (at least at the top level). | > | > Really? Presumably UMVar is a new primitive? With a family of | operations like MVar? If so can't we just define | > newtype UMVar a = UMV (MVar a) | > putUMVar :: UMVar a -> a -> IO () | > putUMVar (UMVar v) x = x `seq` putMVar v x | > | > I don't see Force helping here. | > | > Simon | > _______________________________________________ | > ghc-devs mailing list | > ghc-devs@haskell.org | > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs | | _______________________________________________ | ghc-devs mailing list | ghc-devs@haskell.org | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Richard Eisenberg

12:35 p.m.

On Sep 9, 2015, at 8:28 AM, Simon Peyton Jones wrote:

...

I think it'd be better to have

TYPE :: TypeShape -> *

data TypeShape = Unboxed | Boxed Levity data Levity = Lifted | Unlifted

Yes, of course.

...

So we really would get very little levity polymorphism ineed. I think.

That's right. The levity polymorphism is, essentially, only to have a nice type inference story. Once the code gets passed to the back end, the polymorphism would have to be removed. My idea was to use it to allow users to gloss (somewhat) over the ! vs. no-! distinction by having the compiler to the Right Thing during inference. Richard

Simon Peyton Jones

12:40 p.m.

| That's right. The levity polymorphism is, essentially, only to have a | nice type inference story. Once the code gets passed to the back end, | the polymorphism would have to be removed. My idea was to use it to | allow users to gloss (somewhat) over the ! vs. no-! distinction by | having the compiler to the Right Thing during inference. Can you be more specific? What does "gloss over" mean? S | -----Original Message----- | From: Richard Eisenberg [mailto:eir@cis.upenn.edu] | Sent: 09 September 2015 13:35 | To: Simon Peyton Jones | Cc: ghc-devs | Subject: Re: Unlifted data types | | | On Sep 9, 2015, at 8:28 AM, Simon Peyton Jones | wrote: | | > I think it'd be better to have | > | > TYPE :: TypeShape -> * | > | > data TypeShape = Unboxed | Boxed Levity data Levity = Lifted | | > Unlifted | > | | Yes, of course. | | > So we really would get very little levity polymorphism ineed. I | think. | | That's right. The levity polymorphism is, essentially, only to have a | nice type inference story. Once the code gets passed to the back end, | the polymorphism would have to be removed. My idea was to use it to | allow users to gloss (somewhat) over the ! vs. no-! distinction by | having the compiler to the Right Thing during inference. | | Richard

Richard Eisenberg

12:44 p.m.

On Sep 9, 2015, at 8:40 AM, Simon Peyton Jones wrote:

...

Can you be more specific? What does "gloss over" mean?

I mean that, for example, `length` will work over both strict lists and lazy lists. It will infer the strictness of its argument through ordinary type inference. So users have to be aware of strictness, but they will be able to use the same functions in both cases. Richard

Simon Peyton Jones

12:57 p.m.

| I mean that, for example, `length` will work over both strict lists | and lazy lists. It will infer the strictness of its argument through | ordinary type inference. So users have to be aware of strictness, but | they will be able to use the same functions in both cases. I didn't understand that. You mean that 'length' will be levity-polymorphic, but 'map' will not? What are the rules for determining which is which? (Which functions, exactly, can be levity-polymorphic???

Richard Eisenberg

1:03 p.m.

On Sep 9, 2015, at 8:57 AM, Simon Peyton Jones wrote:

...

| I mean that, for example, `length` will work over both strict lists | and lazy lists. It will infer the strictness of its argument through | ordinary type inference. So users have to be aware of strictness, but | they will be able to use the same functions in both cases.

I didn't understand that. You mean that 'length' will be levity-polymorphic, but 'map' will not? What are the rules for determining which is which? (Which functions, exactly, can be levity-polymorphic???

No functions (excepting `error` and friends) are truly levity polymorphic. We use levity polymorphism in the types to get GHC to use its existing type inference to infer strictness. By the time type inference is done, we must ensure that no levity polymorphism remains, because the code generator won't be able to deal with it. For example, `map :: (a -> b) -> [a] -> [b]` has 4 places where levity polymorphism comes into play: the type a, the type b, the source list, and the result list. Any of these could be lazy or strict. And, I believe, all the combinations make sense. This means that we have one source Haskell declaration -- map -- that corresponds to 16 compiled functions. Obviously we wish to optimize somehow, and perhaps if we can't, this idea is bogus. But this is the basic idea. To be clear, nothing is different about `length` than `map` -- the fact that length is a consumer is utterly irrelevant in my example. Richard

Dan Doel

4:44 p.m.

On Wed, Sep 9, 2015 at 9:03 AM, Richard Eisenberg wrote:

...

No functions (excepting `error` and friends) are truly levity polymorphic.

I was talking with Ed Kmett about this yesterday, and he pointed out that this isn't true. There are a significant array of levity polymorphic functions having to do with reference types. They simply shuffle around pointers with the right calling convention, and don't really care what levity their arguments are, because they're just operating uniformly either way. So if we had: MVar# :: forall (l :: Levity). * -> TYPE (Boxed l) -> TYPE (Boxed Unlifted) then: takeMVar :: forall s (l :: Levity) (a :: TYPE (Boxed l)). MVar# s l a -> State# s -> (# State# s, a #) putMVar :: forall s (l :: Levity) (a :: Type (Boxed l)). MVar# s l a -> a -> State# s -> State# s are genuinely parametric in l. And the same is true for MutVar#, Array#, MutableArray#, etc. I think data type constructors are actually parametric, too (ignoring data with ! in them for the moment; the underlying constructors of those). Using a constructor just puts the pointers for the fields in the type, and matching on a constructor gives them back. They don't need to care whether their fields are lifted or not, they just preserve whatever the case is. But this:

...

We use levity polymorphism in the types to get GHC to use its existing type inference to infer strictness. By the time type inference is done, we must ensure that no levity polymorphism remains, because the code generator won't be able to deal with it.

Is not parametric polymorphism; it is ad-hoc polymorphism. It even has the defaulting step from type classes. Except the ad-hoc has been given the same notation as the genuinely parametric, so you can no longer identify the latter. (I'm not sure I'm a great fan of the ad-hoc part anyway, to be honest.) -- Dan

Edward Kmett

7:30 p.m.

I think ultimately the two views of levity that we've been talking diverge along the same lines as the pi vs forall discussion from your Levity polymorphism talk. I've been focused entirely on situations where forall suffices, and no distinction is needed in how you compile for both levities. Maybe could be polymorphic using a mere forall in the levity of the boxed argument it carries as it doesn't care what it is, it never forces it, pattern matching on it just gives it back when you pattern match on it. Eq or Ord could just as easily work over anything boxed. The particular Eq _instance_ needs to care about the levity. Most of the combinators for working with Maybe do need to care about that levity however. e.g. consider fmap in Functor, the particular instances would care. Because you ultimately wind up using fmap to build 'f a' values and those need to know how the let binding should work. There seems to be a pi at work there. Correct operational behavior would depend on the levity. But if we look at what inference should probably grab for the levity of Functor: you'd get: class Functor (l : Levity) (l' : Levity') (f :: GC l -> GC l') where fmap :: forall a b. (a :: GC l) (b :: GC l). (a -> b) -> f a -> f b Baed on the notion that given current practices, f would cause us to pick a common kind for a and b, and the results of 'f'. Depending on how and if we decided to default to * unless annotated in various situations would drive this closer and closer to the existing Functor by default. These are indeed distinct functors with distinct operational behavior, and we could implement each of them by supplying separate instances, as the levity would take part in the instance resolution like any other kind argument. Whether we could expect an average Haskeller to be willing to do so is an entirely different matter. -Edward On Wed, Sep 9, 2015 at 12:44 PM, Dan Doel wrote:

...

...
No functions (excepting `error` and friends) are truly levity

On Wed, Sep 9, 2015 at 9:03 AM, Richard Eisenberg wrote: polymorphic.

I was talking with Ed Kmett about this yesterday, and he pointed out that this isn't true. There are a significant array of levity polymorphic functions having to do with reference types. They simply shuffle around pointers with the right calling convention, and don't really care what levity their arguments are, because they're just operating uniformly either way. So if we had:

MVar# :: forall (l :: Levity). * -> TYPE (Boxed l) -> TYPE (Boxed Unlifted)

then:

takeMVar :: forall s (l :: Levity) (a :: TYPE (Boxed l)). MVar# s l a -> State# s -> (# State# s, a #) putMVar :: forall s (l :: Levity) (a :: Type (Boxed l)). MVar# s l a -> a -> State# s -> State# s

are genuinely parametric in l. And the same is true for MutVar#, Array#, MutableArray#, etc.

I think data type constructors are actually parametric, too (ignoring data with ! in them for the moment; the underlying constructors of those). Using a constructor just puts the pointers for the fields in the type, and matching on a constructor gives them back. They don't need to care whether their fields are lifted or not, they just preserve whatever the case is.

But this:

...
We use levity polymorphism in the types to get GHC to use its existing type inference to infer strictness. By the time type inference is done, we must ensure that no levity polymorphism remains, because the code generator won't be able to deal with it.

Is not parametric polymorphism; it is ad-hoc polymorphism. It even has the defaulting step from type classes. Except the ad-hoc has been given the same notation as the genuinely parametric, so you can no longer identify the latter. (I'm not sure I'm a great fan of the ad-hoc part anyway, to be honest.)

-- Dan _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Richard Eisenberg

10 Sep 10 Sep

2:26 p.m.

These observations from Ed and Dan are quite helpful. Could one of you put them on the wiki page? I hadn't considered the possibility of truly parametric levity polymorphism. Thanks! Richard On Sep 9, 2015, at 3:30 PM, Edward Kmett wrote:

...

I think ultimately the two views of levity that we've been talking diverge along the same lines as the pi vs forall discussion from your Levity polymorphism talk.

I've been focused entirely on situations where forall suffices, and no distinction is needed in how you compile for both levities.

Maybe could be polymorphic using a mere forall in the levity of the boxed argument it carries as it doesn't care what it is, it never forces it, pattern matching on it just gives it back when you pattern match on it.

Eq or Ord could just as easily work over anything boxed. The particular Eq _instance_ needs to care about the levity.

Most of the combinators for working with Maybe do need to care about that levity however.

e.g. consider fmap in Functor, the particular instances would care. Because you ultimately wind up using fmap to build 'f a' values and those need to know how the let binding should work. There seems to be a pi at work there. Correct operational behavior would depend on the levity.

But if we look at what inference should probably grab for the levity of Functor:

you'd get:

class Functor (l : Levity) (l' : Levity') (f :: GC l -> GC l') where fmap :: forall a b. (a :: GC l) (b :: GC l). (a -> b) -> f a -> f b

Baed on the notion that given current practices, f would cause us to pick a common kind for a and b, and the results of 'f'. Depending on how and if we decided to default to * unless annotated in various situations would drive this closer and closer to the existing Functor by default.

These are indeed distinct functors with distinct operational behavior, and we could implement each of them by supplying separate instances, as the levity would take part in the instance resolution like any other kind argument.

Whether we could expect an average Haskeller to be willing to do so is an entirely different matter.

-Edward

On Wed, Sep 9, 2015 at 12:44 PM, Dan Doel wrote: On Wed, Sep 9, 2015 at 9:03 AM, Richard Eisenberg wrote:

...
No functions (excepting `error` and friends) are truly levity polymorphic.

I was talking with Ed Kmett about this yesterday, and he pointed out that this isn't true. There are a significant array of levity polymorphic functions having to do with reference types. They simply shuffle around pointers with the right calling convention, and don't really care what levity their arguments are, because they're just operating uniformly either way. So if we had:

MVar# :: forall (l :: Levity). * -> TYPE (Boxed l) -> TYPE (Boxed Unlifted)

then:

takeMVar :: forall s (l :: Levity) (a :: TYPE (Boxed l)). MVar# s l a -> State# s -> (# State# s, a #) putMVar :: forall s (l :: Levity) (a :: Type (Boxed l)). MVar# s l a -> a -> State# s -> State# s

are genuinely parametric in l. And the same is true for MutVar#, Array#, MutableArray#, etc.

I think data type constructors are actually parametric, too (ignoring data with ! in them for the moment; the underlying constructors of those). Using a constructor just puts the pointers for the fields in the type, and matching on a constructor gives them back. They don't need to care whether their fields are lifted or not, they just preserve whatever the case is.

But this:

...
We use levity polymorphism in the types to get GHC to use its existing type inference to infer strictness. By the time type inference is done, we must ensure that no levity polymorphism remains, because the code generator won't be able to deal with it.

Is not parametric polymorphism; it is ad-hoc polymorphism. It even has the defaulting step from type classes. Except the ad-hoc has been given the same notation as the genuinely parametric, so you can no longer identify the latter. (I'm not sure I'm a great fan of the ad-hoc part anyway, to be honest.)

-- Dan _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Dan Doel

8 Oct 8 Oct

1:20 a.m.

I've added a section on parametric levity polymorphism to the wiki. Sorry it took so long. I might add some thoughts about first-class `!a` being the only semantic hole in our current strict data type situation later if I remember to do so. On Thu, Sep 10, 2015 at 10:26 AM, Richard Eisenberg wrote:

...

These observations from Ed and Dan are quite helpful. Could one of you put them on the wiki page? I hadn't considered the possibility of truly parametric levity polymorphism.

Thanks! Richard

On Sep 9, 2015, at 3:30 PM, Edward Kmett wrote:

I think ultimately the two views of levity that we've been talking diverge along the same lines as the pi vs forall discussion from your Levity polymorphism talk.

I've been focused entirely on situations where forall suffices, and no distinction is needed in how you compile for both levities.

Maybe could be polymorphic using a mere forall in the levity of the boxed argument it carries as it doesn't care what it is, it never forces it, pattern matching on it just gives it back when you pattern match on it.

Eq or Ord could just as easily work over anything boxed. The particular Eq _instance_ needs to care about the levity.

Most of the combinators for working with Maybe do need to care about that levity however.

e.g. consider fmap in Functor, the particular instances would care. Because you ultimately wind up using fmap to build 'f a' values and those need to know how the let binding should work. There seems to be a pi at work there. Correct operational behavior would depend on the levity.

But if we look at what inference should probably grab for the levity of Functor:

you'd get:

class Functor (l : Levity) (l' : Levity') (f :: GC l -> GC l') where fmap :: forall a b. (a :: GC l) (b :: GC l). (a -> b) -> f a -> f b

Baed on the notion that given current practices, f would cause us to pick a common kind for a and b, and the results of 'f'. Depending on how and if we decided to default to * unless annotated in various situations would drive this closer and closer to the existing Functor by default.

These are indeed distinct functors with distinct operational behavior, and we could implement each of them by supplying separate instances, as the levity would take part in the instance resolution like any other kind argument.

Whether we could expect an average Haskeller to be willing to do so is an entirely different matter.

-Edward

On Wed, Sep 9, 2015 at 12:44 PM, Dan Doel wrote:

...
On Wed, Sep 9, 2015 at 9:03 AM, Richard Eisenberg wrote:

...
No functions (excepting `error` and friends) are truly levity polymorphic.

I was talking with Ed Kmett about this yesterday, and he pointed out that this isn't true. There are a significant array of levity polymorphic functions having to do with reference types. They simply shuffle around pointers with the right calling convention, and don't really care what levity their arguments are, because they're just operating uniformly either way. So if we had:

MVar# :: forall (l :: Levity). * -> TYPE (Boxed l) -> TYPE (Boxed Unlifted)

then:

takeMVar :: forall s (l :: Levity) (a :: TYPE (Boxed l)). MVar# s l a -> State# s -> (# State# s, a #) putMVar :: forall s (l :: Levity) (a :: Type (Boxed l)). MVar# s l a -> a -> State# s -> State# s

are genuinely parametric in l. And the same is true for MutVar#, Array#, MutableArray#, etc.

I think data type constructors are actually parametric, too (ignoring data with ! in them for the moment; the underlying constructors of those). Using a constructor just puts the pointers for the fields in the type, and matching on a constructor gives them back. They don't need to care whether their fields are lifted or not, they just preserve whatever the case is.

But this:

...
We use levity polymorphism in the types to get GHC to use its existing type inference to infer strictness. By the time type inference is done, we must ensure that no levity polymorphism remains, because the code generator won't be able to deal with it.

Is not parametric polymorphism; it is ad-hoc polymorphism. It even has the defaulting step from type classes. Except the ad-hoc has been given the same notation as the genuinely parametric, so you can no longer identify the latter. (I'm not sure I'm a great fan of the ad-hoc part anyway, to be honest.)

-- Dan _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Simon Peyton Jones

10:02 a.m.

| I've added a section on parametric levity polymorphism to the wiki. | Sorry it took so long. What's the wiki page? Simon | | I might add some thoughts about first-class `!a` being the only | semantic hole in our current strict data type situation later if I | remember to do so.

Richard Eisenberg

12:36 p.m.

On Oct 8, 2015, at 6:02 AM, Simon Peyton Jones wrote:

...

What's the wiki page?

https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes

Dan Doel

27 Oct 27 Oct

11:42 p.m.

Hello, I've added a section with my notes on the minimal semantics required to address what Haskell lacks with respect to strict types. Ed Kmett pointed me to some stuff that I think may fix all the problems with the !T sort of solution. It builds on the new constraint being considered for handling impredicativity. The quick sketch goes like this. Given the declaration: data Nat = Z | S !Nat then: Nat :: * !Nat :: Unlifted S :: Nat -> Nat But we also have: !Nat <~ Nat and the witness of this is just an identity function, because all values of type !Nat are legitimate values of type Nat. Then we can have: case n of S m -> ... Z -> ... where m has type !Nat, but we can still call `S m` and the like, because !Nat <~ Nat. If we do use `S m`, the S call will do some unnecessary evaluation of m, but this can (hopefully) be fixed with an optimization based on knowing that m has type !Nat, which we are weakening to Nat. Thoughts? -- Dan On Thu, Oct 8, 2015 at 8:36 AM, Richard Eisenberg wrote:

...

On Oct 8, 2015, at 6:02 AM, Simon Peyton Jones wrote:

...
What's the wiki page?

https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes

Edward Kmett

28 Oct 28 Oct

3:53 a.m.

The idea of treating !S as a subtype of S and then relying on the potential for new impredicativity machinery to let us just talk about how !S <= S makes me really happy. data Nat = Z | S !Nat Pattern matching on S could give back the tighter type !Nat rather than Nat for the argument, and if we ever have to show that to a user, the 'approximation' machinery would show it to users as Nat, concealing this implementation detail. Similarly matching with an as-pattern as part of a pattern that evaluates could do the same. The constructor is a bit messier. It should really give back S :: Nat -> Nat as what the constructor should behave as rather than S :: !Nat -> Nat, because this will match existing behavior. Then the exposed constructor would force the argument before storing it away, just like we do today and we could recover via a sort of peephole optimization the elimination of the jump into the closure to evaluate when it is fed something known to be of type !Nat by some kind of "case/(!)-coercion" rule in the optimizer. I'm partial to those parts of the idea and think it works pretty well. I'm not sure how well it mixes with all the other discussions on levity polymorphism though. Notably: Trying to get to having !Nat live in an Unlifted kind, while Nat has a different kind seems likely to cause all sorts of headaches. =/ -Edward On Tue, Oct 27, 2015 at 7:42 PM, Dan Doel wrote:

...

Hello,

I've added a section with my notes on the minimal semantics required to address what Haskell lacks with respect to strict types.

Ed Kmett pointed me to some stuff that I think may fix all the problems with the !T sort of solution. It builds on the new constraint being considered for handling impredicativity. The quick sketch goes like this. Given the declaration:

data Nat = Z | S !Nat

then:

Nat :: * !Nat :: Unlifted S :: Nat -> Nat

But we also have:

!Nat <~ Nat

and the witness of this is just an identity function, because all values of type !Nat are legitimate values of type Nat. Then we can have:

case n of S m -> ... Z -> ...

where m has type !Nat, but we can still call `S m` and the like, because !Nat <~ Nat. If we do use `S m`, the S call will do some unnecessary evaluation of m, but this can (hopefully) be fixed with an optimization based on knowing that m has type !Nat, which we are weakening to Nat.

Thoughts?

-- Dan

On Thu, Oct 8, 2015 at 8:36 AM, Richard Eisenberg wrote:

...
On Oct 8, 2015, at 6:02 AM, Simon Peyton Jones

wrote:

...
...
What's the wiki page?

https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes

_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Simon Peyton Jones

9:05 a.m.

I'm out of bandwidth at the moment, but let me just remark that this is swampy territory. It's easy to mess up. A particular challenge is polymorphism: map :: forall a b. (a->b) -> [a] -> [b] map f (x:xs) = (f x) : (map f xs) In the compiled code for map, is a thunk built for (f x), or is it evaluated eagerly. Well, if you can instantiate the 'b' with !Int, say, then it should be evaluated eagerly. But if you instantiate with Int, then build a thunk. So, we really need two compiled versions of 'map'. Or perhaps four if we take 'b' into account. In general an exponential number. That's one reason that GHC doesn't let you instantiate a polymorphic type variable with an unlifted type, even if it is boxed. That is the reason that in "Types are calling conventions" we turn the entire intermediate language around, to make it call-by-value; and add a lazy type constructor. Not Int and (Strict Int), but rather Int and (Lazy Int). Another big issue is that *any* mixture of subtyping and (Haskell-style) parametric polymorphism gets very complicated very fast. Especially when you add higher kinds. (Then you need variance annotations, and before long you want variance polymorphism.) I'm extremely dubious about adding subtyping to Haskell. That's one reason Scala is so complicated. As a way to keep the discussion sane, I'd suggest focusing initially on the *intermediate language*, ignoring the source-language issues. That's what we did in "Types as calling conventions". There is a chance of writing out a complete syntax, a complete type system, and a complete operational semantics; and proving soundness etc. Once that foundation is done we can think about the source language. Finally, there may be low-hanging fruit to be had. Maybe we can get a lot more benefit without completely re-imagining GHC! Let's do that first. (I think that some of the proposals on the wiki page were in that direction.) But even for the low hanging fruit, getting the intermediate language changes (Core) nailed first would be good. Bur re-imagining GHC is good too. Swampy territory it may be, but it's also important, and there really *ought* to be a more seamless of combining strictness and laziness. Simon | -----Original Message----- | From: Dan Doel [mailto:dan.doel@gmail.com] | Sent: 27 October 2015 23:42 | To: Richard Eisenberg | Cc: Simon Peyton Jones; ghc-devs | Subject: Re: Unlifted data types | | Hello, | | I've added a section with my notes on the minimal semantics required to | address what Haskell lacks with respect to strict types. | | Ed Kmett pointed me to some stuff that I think may fix all the problems with | the !T sort of solution. It builds on the new constraint being considered | for handling impredicativity. The quick sketch goes like this. Given the | declaration: | | data Nat = Z | S !Nat | | then: | | Nat :: * | !Nat :: Unlifted | S :: Nat -> Nat | | But we also have: | | !Nat <~ Nat | | and the witness of this is just an identity function, because all values of | type !Nat are legitimate values of type Nat. Then we can | have: | | case n of | S m -> ... | Z -> ... | | where m has type !Nat, but we can still call `S m` and the like, because | !Nat <~ Nat. If we do use `S m`, the S call will do some unnecessary | evaluation of m, but this can (hopefully) be fixed with an optimization | based on knowing that m has type !Nat, which we are weakening to Nat. | | Thoughts? | | -- Dan | | | On Thu, Oct 8, 2015 at 8:36 AM, Richard Eisenberg wrote: | > | > On Oct 8, 2015, at 6:02 AM, Simon Peyton Jones | wrote: | > | >> What's the wiki page? | > | > https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes

Edward Kmett

9:48 a.m.

On Wed, Oct 28, 2015 at 5:05 AM, Simon Peyton Jones wrote:

...

I'm out of bandwidth at the moment, but let me just remark that this is swampy territory. It's easy to mess up.

A particular challenge is polymorphism:

map :: forall a b. (a->b) -> [a] -> [b] map f (x:xs) = (f x) : (map f xs)

In the compiled code for map, is a thunk built for (f x), or is it evaluated eagerly. Well, if you can instantiate the 'b' with !Int, say, then it should be evaluated eagerly. But if you instantiate with Int, then build a thunk. So, we really need two compiled versions of 'map'. Or perhaps four if we take 'b' into account. In general an exponential number.

That's one reason that GHC doesn't let you instantiate a polymorphic type

...

variable with an unlifted type, even if it is boxed.

This is one of the things we'd like to be able to fix. Right now I have a small explosion of code going on that is being duplicated over and over to parameterize over different unlifted types. In the discussions about levity/lifting so far Dan and I have been trying to tease apart what cases can be handled "forall" style rather than "pi" style to borrow the split from Richard's presentation, just to get at a sense of what really could be talked about without needing different calling conventions, despite lifting. There are situations where we are truly polymorphic in lifting, e.g. (==) from Eq and compare from Ord don't care if the arguments of type 'a' are lifted or not. Until you go to write a function application that returns a value of that type. If all you do is rearrange them then that machinery can be parametric in the choice. `map` on the other hand, cares about the selection because of the `f x` application. (Similarly, `min` and `max` from Ord do care about the convention on hand.) One could see a world wherein you could parameterize such an instance on levity explicitly, but it is pretty exhausting to think about.

...

Another big issue is that *any* mixture of subtyping and (Haskell-style) parametric polymorphism gets very complicated very fast. Especially when you add higher kinds. (Then you need variance annotations, and before long you want variance polymorphism.) I'm extremely dubious about adding subtyping to Haskell. That's one reason Scala is so complicated.

I was actually quite surprised to see a subtyping relationship rear its head in: https://ghc.haskell.org/trac/ghc/attachment/wiki/ImpredicativePolymorphism/I... But re-imagining GHC is good too. Swampy territory it may be, but it's

...

also important, and there really *ought* to be a more seamless of combining strictness and laziness.

I'm somewhat dubious of most approaches that try to mix strictness and laziness under one umbrella. That is why trying to tease out the small handful of cases where we are truly parametric in levity seems interesting. Finding out some situations existed where we really don't care if a type is lifted or not was eye opening to me personally, at least. -Edward

...

| -----Original Message----- | From: Dan Doel [mailto:dan.doel@gmail.com] | Sent: 27 October 2015 23:42 | To: Richard Eisenberg | Cc: Simon Peyton Jones; ghc-devs | Subject: Re: Unlifted data types | | Hello, | | I've added a section with my notes on the minimal semantics required to | address what Haskell lacks with respect to strict types. | | Ed Kmett pointed me to some stuff that I think may fix all the problems with | the !T sort of solution. It builds on the new constraint being considered | for handling impredicativity. The quick sketch goes like this. Given the | declaration: | | data Nat = Z | S !Nat | | then: | | Nat :: * | !Nat :: Unlifted | S :: Nat -> Nat | | But we also have: | | !Nat <~ Nat | | and the witness of this is just an identity function, because all values of | type !Nat are legitimate values of type Nat. Then we can | have: | | case n of | S m -> ... | Z -> ... | | where m has type !Nat, but we can still call `S m` and the like, because | !Nat <~ Nat. If we do use `S m`, the S call will do some unnecessary | evaluation of m, but this can (hopefully) be fixed with an optimization | based on knowing that m has type !Nat, which we are weakening to Nat. | | Thoughts? | | -- Dan | | | On Thu, Oct 8, 2015 at 8:36 AM, Richard Eisenberg wrote: | > | > On Oct 8, 2015, at 6:02 AM, Simon Peyton Jones
...
| wrote: | > | >> What's the wiki page? | > | > https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Richard Eisenberg

1:19 p.m.

I don't have terribly much to add, but I do want to counter one point: On Oct 28, 2015, at 5:48 AM, Edward Kmett wrote:

...

There are situations where we are truly polymorphic in lifting, e.g. (==) from Eq and compare from Ord don't care if the arguments of type 'a' are lifted or not.

But these do, I think. In running code, if (==) is operating over a lazy type, it has to check if the pointer points to a thunk. If (==) is operating over a strict one, it can skip the check. This is not a big difference, but it *is* a difference. A little more thinking about this has led here: The distinction isn't really forall vs. pi. That is, in the cases where the levity matters, we don't really need to pi-quantify. Instead, it's exactly like type classes. Imagine, for a moment, that we have an alternate universe where strict is the default, and we have

...

data Lazy a = Thunk (() -> a) | WHNF a

The WHNF is a bit of a lie, because this representation would mean that the contents of a WHNF are fully evaluated. But let's not get hung up on that point. Then, we have

...

type family BaseType a where BaseType (Lazy a) = a BaseType a = a

class Forceable a where force :: a -> BaseType a

instance Forceable a where force = id

instance Forceable (Lazy a) where force (Thunk f) = f () force (WHNF a) = a

Things that need to behave differently depending on strictness just take a dictionary of the Forceable class. Equivalently, they make a runtime decision of whether to force or not. So we don't need an exponential number of maps. We need a map that makes some runtime decisions. And the optimizer can be taught that specializing these decisions may be helpful. Of course, none of this would be exposed. And I'm not suggesting that we need to make the core language strict-by-default. But it clarified things, for me at least, when I realized this is regular old ad-hoc polymorphism, not something fancier. Richard

Edward Kmett

3:49 p.m.

On Wed, Oct 28, 2015 at 9:19 AM, Richard Eisenberg wrote:

...

I don't have terribly much to add, but I do want to counter one point:

On Oct 28, 2015, at 5:48 AM, Edward Kmett wrote:

...
There are situations where we are truly polymorphic in lifting, e.g. (==) from Eq and compare from Ord don't care if the arguments of type 'a' are lifted or not.

But these do, I think. In running code, if (==) is operating over a lazy type, it has to check if the pointer points to a thunk. If (==) is operating over a strict one, it can skip the check. This is not a big difference, but it *is* a difference.

Yes, but this is the job of the particular instance. Remember the instance gets to know the type it is working at, and its corresponding levity. class Eq (l :: Levity) (t :: Type l) where (==) :: a -> a -> Bool instance Eq @Unlifted (SmallMutableArray s a) where (==) = sameSmallMutableArray instance Eq @Lifted [] where (==) = ... Your objection arises for things like instance Eq @l (Foo @l) Where the same code has to execute with different levities, but if I can't even case or seq on a value with polymorphic levity, and can't construct such a value but merely pass it around then such code is still sound. It isn't safe to write functions that return values of polymorphic levity. I can however hand them back as (# a #). This is how we handle indexing into a array today. If we had a Maybe that was levity polymorphic in its argument Maybe :: forall (l :: Levity). Type l -> Lifted instance Eq @l a => Eq @Lifted (Maybe @l a) where Just a == Just b = a == b _ == _ = False is still okay under these rules, it never case analyzes a value of polymorphic levity, never seq's it. Neither of those things is legal because you can't 'enter' the closure. If it was levity polymorphic in the result type Maybe :: forall (i :: Levity) (j :: Levity). Type i -> Type j then your objection comes true. I can't naively write: instance Eq @i a => Eq j (Maybe @i @j a) where Just a == Just b = a == b _ == _ = False without compiling the same code twice, because of the act of case analysis. If we don't have real 'strict data types' in Lifted this situation never arises though. Even if we do I can write separate: instance Eq @i a => Eq Lifted (Maybe @i Lifted a) instance Eq @i a => Eq Unlifted (Maybe @i Unlifted a) instances, unless we can do better by writing a class constraint on the levity that we can use in a clever way here. I'm mostly concerned with the middle case where we don't overload data types on their levity, and try to recover the ability to talk about strict data types by other more explicit means, but rather permit them to accept arguments of different levities. There none of the code I care about actually needs to act differently based on levity. Anything that results in a function space there has to care about levity, but until a type occurs on the right hand side of an (->) or I go to seq a value of that type or case analyze it, I don't need to care about if its lifted or unlifted. With Dan's (!S) then things get more complicated in ways I don't fully understand the ramifications of yet, as you might be able to lift some of those restrictions. A little more thinking about this has led here: The distinction isn't

...

really forall vs. pi. That is, in the cases where the levity matters, we don't really need to pi-quantify. Instead, it's exactly like type classes.

In many ways pi comes down to doing typeclass like things, you're tracking information from the type system. The vehicle we have for doing that today is typeclasses. I've been thinking about anything that i have that actually needs the "pi" there as a form of "constraint" passing all along, with the constraint being whatever introspection you need to allow on the type to carry on. Imagine, for a moment, that we have an alternate universe where strict is

...

the default, and we have

...
data Lazy a = Thunk (() -> a) | WHNF a

The WHNF is a bit of a lie, because this representation would mean that the contents of a WHNF are fully evaluated. But let's not get hung up on that point.

Then, we have

...
type family BaseType a where BaseType (Lazy a) = a BaseType a = a

class Forceable a where force :: a -> BaseType a

instance Forceable a where force = id

instance Forceable (Lazy a) where force (Thunk f) = f () force (WHNF a) = a

Things that need to behave differently depending on strictness just take a dictionary of the Forceable class. Equivalently, they make a runtime decision of whether to force or not. So we don't need an exponential number of maps. We need a map that makes some runtime decisions. And the optimizer can be taught that specializing these decisions may be helpful.

Of course, none of this would be exposed. And I'm not suggesting that we need to make the core language strict-by-default. But it clarified things, for me at least, when I realized this is regular old ad-hoc polymorphism, not something fancier.

Yep. -Edward

Dan Doel

3:03 p.m.

On Wed, Oct 28, 2015 at 5:05 AM, Simon Peyton Jones wrote:

...

I'm out of bandwidth at the moment, but let me just remark that this is swampy territory. It's easy to mess up.

A particular challenge is polymorphism:

map :: forall a b. (a->b) -> [a] -> [b] map f (x:xs) = (f x) : (map f xs)

In the compiled code for map, is a thunk built for (f x), or is it evaluated eagerly. Well, if you can instantiate the 'b' with !Int, say, then it should be evaluated eagerly. But if you instantiate with Int, then build a thunk. So, we really need two compiled versions of 'map'. Or perhaps four if we take 'b' into account. In general an exponential number.

You can't insantiate `b` to `!Int` without map quantifying over levity, which is somewhat orthogonal (although it would have interactions). Without that, b has kind *, and !Int has kind Unlifted, so this is a kind error. Or, map is explicitly written having b :: Unlifted, in which case it already has the right calling convention.

...

That's one reason that GHC doesn't let you instantiate a polymorphic type variable with an unlifted type, even if it is boxed.

It's a reason to not put boxed, unlifted types in *. I don't think it's a reason not to allow quantifying over a separate kind of uniformly represented, unlifted types (i.e. doesn't include Int#, Double#, etc.).

...

Another big issue is that *any* mixture of subtyping and (Haskell-style) parametric polymorphism gets very complicated very fast. Especially when you add higher kinds. (Then you need variance annotations, and before long you want variance polymorphism.) I'm extremely dubious about adding subtyping to Haskell. That's one reason Scala is so complicated.

I'm not sure how much of this would be required. But how are you saved from this in the impredicativity proposal, which is already attempting to handle more of the induced subtyping of higher-rank types? I would guess it's mostly a matter of, "we're not going to support all that." Which might be fine for the unlifted stuff as well. Richard wrote:

...

But these do, I think. In running code, if (==) is operating over a lazy type, it has to check if the pointer points to a thunk. If (==) is operating over a strict one, it can skip the check. This is not a big difference, but it *is* a difference.

Yeah. It's not that (==) is parametric really. It's that Eq might be able to be parameterized on levity. But this is very useful, since it allows you to instantiate Eq on a bunch of unlifted reference types instead of having to use one-off functions like sameMutableArray#.

...

A little more thinking about this has led here: The distinction isn't really forall vs. pi. That is, in the cases where the levity matters, we don't really need to pi-quantify. Instead, it's exactly like type classes.

Yes, I think so, too. It is ad-hoc polymorphism that can be handled by type classes. There is no reason for an explicit value-level witness of levity that can be cased on, which is presumably what pi would allow.

Ryan Yates

8 Oct 8 Oct

1:07 p.m.

https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes On Thu, Oct 8, 2015 at 6:02 AM, Simon Peyton Jones wrote:

...

| I've added a section on parametric levity polymorphism to the wiki. | Sorry it took so long.

What's the wiki page?

Simon | | I might add some thoughts about first-class `!a` being the only | semantic hole in our current strict data type situation later if I | remember to do so. _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Carter Schonwald

11 Sep 11 Sep

3:22 a.m.

Would this allow having a strict monoid instance for maybe, given the right hinting at the use site? On Wednesday, September 9, 2015, Edward Kmett wrote:

...

I think ultimately the two views of levity that we've been talking diverge along the same lines as the pi vs forall discussion from your Levity polymorphism talk.

I've been focused entirely on situations where forall suffices, and no distinction is needed in how you compile for both levities.

Maybe could be polymorphic using a mere forall in the levity of the boxed argument it carries as it doesn't care what it is, it never forces it, pattern matching on it just gives it back when you pattern match on it.

Eq or Ord could just as easily work over anything boxed. The particular Eq _instance_ needs to care about the levity.

Most of the combinators for working with Maybe do need to care about that levity however.

e.g. consider fmap in Functor, the particular instances would care. Because you ultimately wind up using fmap to build 'f a' values and those need to know how the let binding should work. There seems to be a pi at work there. Correct operational behavior would depend on the levity.

But if we look at what inference should probably grab for the levity of Functor:

you'd get:

class Functor (l : Levity) (l' : Levity') (f :: GC l -> GC l') where fmap :: forall a b. (a :: GC l) (b :: GC l). (a -> b) -> f a -> f b

Baed on the notion that given current practices, f would cause us to pick a common kind for a and b, and the results of 'f'. Depending on how and if we decided to default to * unless annotated in various situations would drive this closer and closer to the existing Functor by default.

These are indeed distinct functors with distinct operational behavior, and we could implement each of them by supplying separate instances, as the levity would take part in the instance resolution like any other kind argument.

Whether we could expect an average Haskeller to be willing to do so is an entirely different matter.

-Edward

On Wed, Sep 9, 2015 at 12:44 PM, Dan Doel javascript:_e(%7B%7D,'cvml','dan.doel@gmail.com');> wrote:

...
...
No functions (excepting `error` and friends) are truly levity

On Wed, Sep 9, 2015 at 9:03 AM, Richard Eisenberg javascript:_e(%7B%7D,'cvml','eir@cis.upenn.edu');> wrote: polymorphic.

I was talking with Ed Kmett about this yesterday, and he pointed out that this isn't true. There are a significant array of levity polymorphic functions having to do with reference types. They simply shuffle around pointers with the right calling convention, and don't really care what levity their arguments are, because they're just operating uniformly either way. So if we had:

MVar# :: forall (l :: Levity). * -> TYPE (Boxed l) -> TYPE (Boxed Unlifted)

then:

takeMVar :: forall s (l :: Levity) (a :: TYPE (Boxed l)). MVar# s l a -> State# s -> (# State# s, a #) putMVar :: forall s (l :: Levity) (a :: Type (Boxed l)). MVar# s l a -> a -> State# s -> State# s

are genuinely parametric in l. And the same is true for MutVar#, Array#, MutableArray#, etc.

I think data type constructors are actually parametric, too (ignoring data with ! in them for the moment; the underlying constructors of those). Using a constructor just puts the pointers for the fields in the type, and matching on a constructor gives them back. They don't need to care whether their fields are lifted or not, they just preserve whatever the case is.

But this:

...
We use levity polymorphism in the types to get GHC to use its existing type inference to infer strictness. By the time type inference is done, we must ensure that no levity polymorphism remains, because the code generator won't be able to deal with it.

Is not parametric polymorphism; it is ad-hoc polymorphism. It even has the defaulting step from type classes. Except the ad-hoc has been given the same notation as the genuinely parametric, so you can no longer identify the latter. (I'm not sure I'm a great fan of the ad-hoc part anyway, to be honest.)

-- Dan _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org javascript:_e(%7B%7D,'cvml','ghc-devs@haskell.org'); http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Roman Cheplyaka

8:28 a.m.

On 11/09/15 06:22, Carter Schonwald wrote:

...

Would this allow having a strict monoid instance for maybe, given the right hinting at the use site?

That's a fantastic idea, especially if it could be generalized to Applicative functors, where the problem of "inner laziness" is pervasive. But that'd be tricky, because functions have the Lifted kind, and so <*> would have to be crazily levity-polymorphic. (Or is this not crazy?) Roman

Richard Eisenberg

14 Sep 14 Sep

3:59 p.m.

On Sep 11, 2015, at 4:28 AM, Roman Cheplyaka wrote:

...

On 11/09/15 06:22, Carter Schonwald wrote:

...
Would this allow having a strict monoid instance for maybe, given the right hinting at the use site?

That's a fantastic idea, especially if it could be generalized to Applicative functors, where the problem of "inner laziness" is pervasive.

But that'd be tricky, because functions have the Lifted kind, and so <*> would have to be crazily levity-polymorphic. (Or is this not crazy?)

No more crazy than other things. Right now, we have (<*>) :: forall (a :: *) (b :: *) (f :: * -> *). Applicative f => f (a -> b) -> f a -> f b Under this proposal, we would have (ignore the Boxity stuff) (<*>) :: forall (v1 :: Levity) (v2 :: Levity) (v3 :: Levity) (a :: TYPE v1) (b :: TYPE v2) (f :: forall (v4 :: Levity). TYPE v4 -> TYPE v3). Applicative f => f @'Lifted (a -> b) -> f @v1 a -> f @v2 b The higher-rank levity-polymorphism is necessary in case `a` and `b` have different levities. This may be getting wildly out-of-hand, but I don't think it's actually breaking. I would like to point out that using forall here is really quite wrong. As others have pointed out, levity polymorphism is ad-hoc polymorphism, not parametric. Using 'pi' would be much closer to it, but it implies the existence of more dependent types than we really need for this. Richard

...

Roman

_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Edward Z. Yang

5:46 p.m.

I'm not so sure how useful an observation this is, but Dunfield had a paper at this very ICFP "Elaborating Evaluation-Order Polymorphism". He argues that polymorphism over evaluation order should be thought of as a form of intersection type. Edward Excerpts from Richard Eisenberg's message of 2015-09-14 08:59:36 -0700:

...

On Sep 11, 2015, at 4:28 AM, Roman Cheplyaka wrote:

...
On 11/09/15 06:22, Carter Schonwald wrote:

...
Would this allow having a strict monoid instance for maybe, given the right hinting at the use site?

That's a fantastic idea, especially if it could be generalized to Applicative functors, where the problem of "inner laziness" is pervasive.

But that'd be tricky, because functions have the Lifted kind, and so <*> would have to be crazily levity-polymorphic. (Or is this not crazy?)

No more crazy than other things. Right now, we have

(<*>) :: forall (a :: *) (b :: *) (f :: * -> *). Applicative f => f (a -> b) -> f a -> f b

Under this proposal, we would have (ignore the Boxity stuff)

(<*>) :: forall (v1 :: Levity) (v2 :: Levity) (v3 :: Levity) (a :: TYPE v1) (b :: TYPE v2) (f :: forall (v4 :: Levity). TYPE v4 -> TYPE v3). Applicative f => f @'Lifted (a -> b) -> f @v1 a -> f @v2 b

The higher-rank levity-polymorphism is necessary in case `a` and `b` have different levities. This may be getting wildly out-of-hand, but I don't think it's actually breaking.

I would like to point out that using forall here is really quite wrong. As others have pointed out, levity polymorphism is ad-hoc polymorphism, not parametric. Using 'pi' would be much closer to it, but it implies the existence of more dependent types than we really need for this.

Richard

...
Roman

_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Simon Marlow

9 Sep 9 Sep

7:42 p.m.

On 09/09/2015 13:28, Simon Peyton Jones wrote:

...

The awkward spot is the runtime system. Currently it goes to some lengths to ensure that it never introduces an indirection for a boxed-but-unlifted type. Simon Marlow would know exactly where. So I suspect that we WOULD need two versions of each (levity-polymorphic) data constructor, alas. And we'd need to know which version to use at every call site, which means the code would need to be levity-monomorphic. So we really would get very little levity polymorphism ineed. I think.

I *think* we're ok here. The RTS doesn't have any special machinery to avoid indirections to unlifted things that I'm aware of. Did you have a particular problem in mind? Indirections appear in various ways, but always as the result of a thunk being there in the first place - updates, selector thunks, and suspending duplicate work (during parallel evaluation) are all thunk-related. So does that mean data constructors can be levity-polymorphic? I don't see why not, but maybe I'm missing something. Cheers Simon

...

All this fits nicely with Dan Doel's point about making ! a first class type constructor.

Simon

| -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of | Richard Eisenberg | Sent: 08 September 2015 14:46 | To: ghc-devs | Subject: Re: Unlifted data types | | I have put up an alternate set of proposals on | | https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes | | These sidestep around `Force` and `suspend` but probably have other | problems. They make heavy use of levity polymorphism. | | Back story: this all was developed in a late-evening Haskell Symposium | session that took place in the hotel bar. It seems Edward and I walked | away with quite different understandings of what had taken place. I've | written up my understanding. Most likely, the Right Idea is a | combination of this all! | | See what you think. | | Thanks! | Richard | | On Sep 8, 2015, at 3:52 AM, Simon Peyton Jones | wrote: | | > | And to | > | be honest, I'm not sure we need arbitrary data types in Unlifted; | > | Force (which would be primitive) might be enough. | > | > That's an interesting thought. But presumably you'd have to use | 'suspend' (a terrible name) a lot: | > | > type StrictList a = Force (StrictList' a) data StrictList' a = Nil | | > Cons !a (StrictList a) | > | > mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f xs | = | > mapStrict' f (suspend xs) | > | > mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b mapStrict' | f | > Nil = Nil mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs) | > | > | > That doesn't look terribly convenient. | > | > | ensure that threads don't simply | > | pass thunks between each other. But, if you have unlifted types, | > | then you can have: | > | | > | data UMVar (a :: Unlifted) | > | | > | and then the type rules out the possibility of passing thunks | > | through a reference (at least at the top level). | > | > Really? Presumably UMVar is a new primitive? With a family of | operations like MVar? If so can't we just define | > newtype UMVar a = UMV (MVar a) | > putUMVar :: UMVar a -> a -> IO () | > putUMVar (UMVar v) x = x `seq` putMVar v x | > | > I don't see Force helping here. | > | > Simon | > _______________________________________________ | > ghc-devs mailing list | > ghc-devs@haskell.org | > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs | | _______________________________________________ | ghc-devs mailing list | ghc-devs@haskell.org | http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Simon Peyton Jones

10 Sep 10 Sep

9:41 a.m.

| > The awkward spot is the runtime system. Currently it goes to some | > lengths to ensure that it never introduces an indirection for a | > boxed-but-unlifted type. Simon Marlow would know exactly where. So | | I *think* we're ok here. The RTS doesn't have any special machinery | to avoid indirections to unlifted things that I'm aware of. Did you | have a particular problem in mind? Well I can't point to anything very specific. I just recall that in various places, if a pointer was to an Array# we would immediately, eagerly, recurse in the GC rather than add the Array# to the queue for later processing. Maybe that is no longer true. Maybe it was never true. It shouldn't be hard to find out. If true, there would be a run-time system test that returns true for boxed-but-unlifted heap objects. I think it would be worth a look because if I'm right it could have a significant impact on the design. Simon

Simon Marlow

7:59 p.m.

On 10/09/2015 10:41, Simon Peyton Jones wrote:

...

| > The awkward spot is the runtime system. Currently it goes to some | > lengths to ensure that it never introduces an indirection for a | > boxed-but-unlifted type. Simon Marlow would know exactly where. So | | I *think* we're ok here. The RTS doesn't have any special machinery | to avoid indirections to unlifted things that I'm aware of. Did you | have a particular problem in mind?

Well I can't point to anything very specific. I just recall that in various places, if a pointer was to an Array# we would immediately, eagerly, recurse in the GC rather than add the Array# to the queue for later processing. Maybe that is no longer true. Maybe it was never true.

Maybe that was an earlier variant of the GC. The current GC just treats unlifted objects like other objects in a breadth-first way, and it never introduces indirections except when reducing a selector thunk. So I think this is ok. Cheers Simon

...

It shouldn't be hard to find out. If true, there would be a run-time system test that returns true for boxed-but-unlifted heap objects.

I think it would be worth a look because if I'm right it could have a significant impact on the design. > Simon

Dan Doel

9 Sep 9 Sep

2:43 a.m.

On Tue, Sep 8, 2015 at 3:52 AM, Simon Peyton Jones wrote:

...

| And to | be honest, I'm not sure we need arbitrary data types in Unlifted; | Force (which would be primitive) might be enough.

That's an interesting thought. But presumably you'd have to use 'suspend' (a terrible name) a lot:

type StrictList a = Force (StrictList' a) data StrictList' a = Nil | Cons !a (StrictList a)

mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f xs = mapStrict' f (suspend xs)

mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b mapStrict' f Nil = Nil mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs)

That doesn't look terribly convenient.

It's missing the part that makes it convenient. type StrictList a = Force (StrictList' a) data StrictList' a = Nil' | Cons' !a (StrictList a) pattern Nil = Force Nil' pattern Cons x xs = Force (Cons' x xs) mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f Nil = Nil mapStrict f (Cons x xs) = Cons (f x) (mapStrict f xs) But, really, my point is that we already almost have StrictList _today_: data StrictList a = Nil | Cons !a !(StrictList a) The only difference between this and the previous definition (denotationally, at least) is the outer-most level. That's why I liked the original proposal (which probably disappeared too fast for most people to read it), which was more like being able to talk about `!a` as a thing in itself. It's the only semantic gap in being able to define totally unlifted data types right now. So maybe it's also the only operational gap that needs to be plugged, as well. But that was vetoed because `!a` in a data declaration doesn't make a constructor with type `!a -> ...`, but `a -> ...` which evaluates.

...

Really? Presumably UMVar is a new primitive? With a family of operations like MVar? If so can't we just define newtype UMVar a = UMV (MVar a) putUMVar :: UMVar a -> a -> IO () putUMVar (UMVar v) x = x `seq` putMVar v x

I don't see Force helping here.

Yes, true. It only helps ensure that the implementation is correct, rather than enabling a previously impossible construction. Kind of like certain uses of GADTs vs. phantom types. But the ArrayArray people already want UMVar (and the like) anyway, because it cuts out a layer of indirection for types that are already unlifted. -- Dan

Simon Peyton Jones

11:31 a.m.

| (denotationally, at least) is the outer-most level. That's why I liked | the original proposal (which probably disappeared too fast for most | people to read it), which was more like being able to talk about `!a` | as a thing in itself. It's the only semantic gap in being able to That's another interesting idea. Currently ! is an annotation on a data constructor argument (only). It is not a type-former; so that !Int or !(Int ->Int) are not types. I think your point is that if we spell "Force" as "!" then it becomes a first-class type former. Let's try that: * t and !t are distinct types * You get from one to the other by using ! as a term-level data constructor, so you can use it in terms and in pattern matching. Thus - if (e::t) then (!e :: !t) - if (e::!t) then case of !x -> blah here (x::t) * As with newtype constructors, pattern-matching on !t doesn't imply evaluation; after all, it's already evaluated. * In Richard's notation, the type constructor (!) has kind (!) :: TYPE 'Boxed l -> TYPE 'Boxed Unlifted The wrinkle for data types is this. The declaration data T = MkT !Int !a produces a data constructor with type MkT :: forall a. Int -> a -> T But if (!t) is a first-class type, then you'd expect to get a data constructor with type MkT :: forall a. !Int -> !a -> T and that in turn would force many calls to look like (MkT !e1 !e2). But we could re-cast the special treatment for the top-level bang on data constructor arguments, to say that we get a worker/wrapper story so that the programmer's eye view is indeed that MkT :: forall a. Int -> a -> T, and the compiler generates the eval code to match things up. (Which is what happens now.) But if you wrote, say, data S = MkS (!Int, !a) then this magic would not happen and you really would get MkS :: (!Int, !a) -> S Interesting. Simon | -----Original Message----- | From: Dan Doel [mailto:dan.doel@gmail.com] | Sent: 09 September 2015 03:44 | To: Simon Peyton Jones | Cc: Edward Z. Yang; ghc-devs | Subject: Re: Unlifted data types | | On Tue, Sep 8, 2015 at 3:52 AM, Simon Peyton Jones | wrote: | > | And to | > | be honest, I'm not sure we need arbitrary data types in Unlifted; | > | Force (which would be primitive) might be enough. | > | > That's an interesting thought. But presumably you'd have to use | 'suspend' (a terrible name) a lot: | > | > type StrictList a = Force (StrictList' a) data StrictList' a = Nil | | > Cons !a (StrictList a) | > | > mapStrict :: (a -> b) -> StrictList a -> StrictList b mapStrict f xs | = | > mapStrict' f (suspend xs) | > | > mapStrict' :: (a -> b) -> StrictList' a -> StrictList' b mapStrict' | f | > Nil = Nil mapStrict' f (Cons x xs) = Cons (f x) (mapStrict f xs) | > | > | > That doesn't look terribly convenient. | | It's missing the part that makes it convenient. | | type StrictList a = Force (StrictList' a) | data StrictList' a = Nil' | Cons' !a (StrictList a) | pattern Nil = Force Nil' | pattern Cons x xs = Force (Cons' x xs) | | mapStrict :: (a -> b) -> StrictList a -> StrictList b | mapStrict f Nil = Nil | mapStrict f (Cons x xs) = Cons (f x) (mapStrict f xs) | | But, really, my point is that we already almost have StrictList | _today_: | | data StrictList a = Nil | Cons !a !(StrictList a) | | The only difference between this and the previous definition | (denotationally, at least) is the outer-most level. That's why I liked | the original proposal (which probably disappeared too fast for most | people to read it), which was more like being able to talk about `!a` | as a thing in itself. It's the only semantic gap in being able to | define totally unlifted data types right now. So maybe it's also the | only operational gap that needs to be plugged, as well. | | But that was vetoed because `!a` in a data declaration doesn't make a | constructor with type `!a -> ...`, but `a -> ...` which evaluates. | | > Really? Presumably UMVar is a new primitive? With a family of | operations like MVar? If so can't we just define | > newtype UMVar a = UMV (MVar a) | > putUMVar :: UMVar a -> a -> IO () | > putUMVar (UMVar v) x = x `seq` putMVar v x | > | > I don't see Force helping here. | | Yes, true. It only helps ensure that the implementation is correct, | rather than enabling a previously impossible construction. Kind of | like certain uses of GADTs vs. phantom types. | | But the ArrayArray people already want UMVar (and the like) anyway, | because it cuts out a layer of indirection for types that are already | unlifted. | | -- Dan

Edward Z. Yang

7 Sep 7 Sep

9:35 p.m.

Hello Simon,

...

There are several distinct things being mixed up.

I've split the document into three (four?) distinct subproposals. Proposals 1 and 2 stand alone.

...

(1) First, a proposal to allow a data type to be declared to be unlifted. On its own, this is a pretty simple proposal: [snip]

I would really like to see this articulated as a stand-alone proposal. It makes sense by itself, and is really pretty simple.

This is now "Proposal 1".

...

(2) Second, we cannot expect levity polymorphism. Consider map f (x:xs) = f x : map f xs Is the (f x) a thunk or is it evaluated strictly? Unless you are going to clone the code for map (which levity polymorphism is there to avoid), we can't answer "it depends on the type of (f x)". So, no, I think levity polymorphism is out.

So I vote against splitting # into two: plain will do just fine.

Levity polymorphism will not work without generating two copies of 'map', but plain polymorphism over 'Unlifted' is useful (as Dan has also pointed out.) In any case, I've extracted this out into a separate subproposal "Proposal 1.1". https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes#Proposal1.1:Polymorp... (reordering here.)

...

(4) Fourth, you don't mention a related suggestion, namely to allow newtype T = MkT Int# with T getting kind #. I see no difficulty here. We do have (T ~R Int#). It's just a useful way of wrapping a newtype around an unlifted type.

This is now "Proposal 2".

...

(3) Third, the stuff about Force and suspend. Provided you do no more than write library code that uses the above new features I'm fine. But there seems to be lots of stuff that dances around the hope that (Force a) is represented the same way as 'a'. I don't' know how to make this fly. Is there a coercion in FC? If so then (a ~R Force a). And that seems very doubtful since we must do some evaluation.

I agree that we can't introduce a coercion between 'Force a' and 'a', for the reason you mentioned. (there's also a second reason which is that 'a ~R Force a' is not well-typed; 'a' and 'Force a' have different kinds.) I've imagined that we might be able to just continue representing Force explicitly in Core, and somehow "compile it away" at STG time, but I am definitely fuzzy about how this is supposed to work. Perhaps Force should not actually be a data type, and we should have 'force# :: a -> Force a' and 'unforce# :: Force a -> a' (the latter of which compiles to a no-op.) Cheers, Edward

Simon Peyton Jones

9:52 p.m.

| I've split the document into three (four?) distinct subproposals. | Proposals 1 and 2 stand alone. I've re-numbered them 1,2,3,4, since 1.1 is (to me) a pretty major deal, stands in its own right, and certainly isn't a sub-proposal of (1). Under (new) 2, I'm very dubious about "Boxed levity polymorphism in types (and functions with extra code generation)". It's certainly true that we could generate two copied of the code for every function; but by generating three, or perhaps four copies we could also deal with Int# and Float#. Maybe one more for Double#. .NET does this on the fly, incidentally. Where do you stop? Also remember it's not just an issue of GC pointers. The semantics of the function changes, because things that are thunks for the lifted version become strict in the unlifted version. Your 'umap' is a bit more convincing. But for now (2) would be low on my priority list, until we encounter user pressure which (note) we have not encountered so far despite the range of boxed but unlifted types. Why is now the right time? (1) and (3) seem solid. I'll leave (4) for another message. Simon | -----Original Message----- | From: Edward Z. Yang [mailto:ezyang@mit.edu] | Sent: 07 September 2015 22:36 | To: Simon Peyton Jones | Cc: ghc-devs | Subject: RE: Unlifted data types | | Hello Simon, | | > There are several distinct things being mixed up. | | I've split the document into three (four?) distinct subproposals. | Proposals 1 and 2 stand alone. | | > (1) First, a proposal to allow a data type to be declared to be | unlifted. On its own, this is a pretty simple proposal: [snip] | > | > I would really like to see this articulated as a stand-alone proposal. | It makes sense by itself, and is really pretty simple. | | This is now "Proposal 1". | | > (2) Second, we cannot expect levity polymorphism. Consider | > map f (x:xs) = f x : map f xs | > Is the (f x) a thunk or is it evaluated strictly? Unless you are going | to clone the code for map (which levity polymorphism is there to avoid), | we can't answer "it depends on the type of (f x)". So, no, I think | levity polymorphism is out. | > | > So I vote against splitting # into two: plain will do just fine. | | Levity polymorphism will not work without generating two copies of | 'map', but plain polymorphism over 'Unlifted' is useful (as Dan has | also pointed out.) In any case, I've extracted this out into a | separate subproposal "Proposal 1.1". | https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes#Proposal1.1:Polym | orphismoveranewUnliftedkind | | (reordering here.) | | > (4) Fourth, you don't mention a related suggestion, namely to allow | > newtype T = MkT Int# | > with T getting kind #. I see no difficulty here. We do have (T ~R | Int#). It's just a useful way of wrapping a newtype around an unlifted | type. | | This is now "Proposal 2". | | > (3) Third, the stuff about Force and suspend. Provided you do no more | than write library code that uses the above new features I'm fine. But | there seems to be lots of stuff that dances around the hope that (Force | a) is represented the same way as 'a'. I don't' know how to make this | fly. Is there a coercion in FC? If so then (a ~R Force a). And that | seems very doubtful since we must do some evaluation. | | I agree that we can't introduce a coercion between 'Force a' and | 'a', for the reason you mentioned. (there's also a second reason which | is that 'a ~R Force a' is not well-typed; 'a' and 'Force a' have | different kinds.) | | I've imagined that we might be able to just continue representing | Force explicitly in Core, and somehow "compile it away" at STG time, | but I am definitely fuzzy about how this is supposed to work. Perhaps | Force should not actually be a data type, and we should have | 'force# :: a -> Force a' and 'unforce# :: Force a -> a' (the latter | of which compiles to a no-op.) | | Cheers, | Edward

Simon Peyton Jones

9:55 p.m.

| I agree that we can't introduce a coercion between 'Force a' and | 'a', for the reason you mentioned. (there's also a second reason which | is that 'a ~R Force a' is not well-typed; 'a' and 'Force a' have | different kinds.) | | I've imagined that we might be able to just continue representing | Force explicitly in Core, and somehow "compile it away" at STG time, | but I am definitely fuzzy about how this is supposed to work. Perhaps | Force should not actually be a data type, and we should have | 'force# :: a -> Force a' and 'unforce# :: Force a -> a' (the latter | of which compiles to a no-op.) I'm still doubtful. What is the problem you are trying to solve here? How does Force help us? Note that a singleton unboxed tuple (# e #) has the effect of suspending; e.g. f x = (# x+1 #) return immediately, returning a pointer to a thunk for (x+1). I'm not sure if that is relevant. Simon

Edward Z. Yang

10:08 p.m.

Excerpts from Simon Peyton Jones's message of 2015-09-07 14:55:09 -0700:

...

I'm still doubtful. What is the problem you are trying to solve here? How does Force help us?

The problem 'Force' is trying to solve is the fact that Haskell currently has many existing lifted data types, and they all have ~essentially identical unlifted versions. But for a user to write the lifted and unlifted version, they have to copy paste their code or use 'Force'.

...

Note that a singleton unboxed tuple (# e #) has the effect of suspending; e.g. f x = (# x+1 #) return immediately, returning a pointer to a thunk for (x+1). I'm not sure if that is relevant.

I don't think so? Unboxed tuples take a computation with kind * and represent it in kind #. But 'suspend' takes a computation in kind # and represents in kind *. Edward

Simon Peyton Jones

8 Sep 8 Sep

7:40 a.m.

| The problem 'Force' is trying to solve is the fact that Haskell | currently has many existing lifted data types, and they all have | ~essentially identical unlifted versions. But for a user to write the | lifted and unlifted version, they have to copy paste their code or use | 'Force'. But (Force [a]) will only be head-strict. You still have to make an essentially-identical version if you want a strict list. Ditto all components of a data structure. Is the gain (of head-strictness) really worth it? Incidentally, on the Unlifted-vs-# discussion, I'm not against making the distinction. I can see advantages in carving out a strict subset of Haskell, which this would help to do. Simon

Takenobu Tani

29 Dec 29 Dec

9:54 a.m.

Hi devs, Apologies if I'm missing context. Is it more better that "Unlifted" kind is other name, such as "BoxedUnlifted", "Bul", "%",... ? I'm studying Unlifted data types [1]. It's good proposals to optimize code by users. But I worry that newcomers may be confusing about its kind name. In my understanding [2], "unlifted" types may include "unboxed, unlifted" type. But "unlifted" kind does not include "unboxed, unlifted" types. [1]: https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes [2]: http://takenobu-hs.github.io/downloads/haskell_lazy_evaluation.pdf#page=190 (sorry, large) I wish you a happy new year. Regards, Takenobu 2015-09-04 17:03 GMT+09:00 Edward Z. Yang :

...

Hello friends,

After many discussions and beers at ICFP, I've written up my current best understanding of the unlifted data types proposal:

https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes

Many thanks to Richard, Iavor, Ryan, Simon, Duncan, George, Paul, Edward Kmett, and any others who I may have forgotten for crystallizing this proposal.

Cheers, Edward _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

3477

Age (days ago)

3593

Last active (days ago)

List overview

Download

67 comments

12 participants

participants (12)

Carter Schonwald
Dan Doel
Edward Kmett
Edward Z. Yang
Eric Seidel
Jan Stolarek
Richard Eisenberg
Roman Cheplyaka
Ryan Yates
Simon Marlow
Simon Peyton Jones
Takenobu Tani