
Forwarded from Andrew Martin below. I think we want more than just Maybe
(more than one null), but the nesting I described is certainly more
convenience than necessity.
---------- Forwarded message ---------
From: Andrew Martin
Relatedly, I was thinking the other day that after finishing implementing https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0203-po..., I should really look at seeing if it's possible to add this maybe-of-a-lifted value trick straight to GHC. I think that with:
data RuntimpRep = BoxedRep Levity | MaybeBoxedRep Levity | IntRep | ...
data BuiltinMaybe :: forall (v :: Levity). TYPE v -> TYPE ('MaybeBoxedRep v)
This doesn't have the nesting issues because the kind system prevents nesting. But anyway, back to the original question. I would recommend not using Maybe.Unsafe and using unpacked-maybe instead. The latter is definitely safe, and it only costs an extra machine word of space in each data constructor it gets used in, and it doesn't introduce more indirections.
On Tue, Oct 13, 2020 at 5:47 PM David Feuer
Null pointers are widely known to be a lousy language feature in general, but there are certain situations where they're *really* useful for compact representation. For example, we define
newtype TMVar a = TMVar (TVar (Maybe a))
We don't, however, actually use the fact that (Maybe a) is lifted. So we could represent this much more efficiently using something like
newtype TMVar a = TMVar (TVar a)
where Nothing is represented by a distinguished "null" pointer.
While it's possible to implement this sort of thing in user code (with lots of fuss and care), it's not very nice at all. What I'd really like to be able to do is represent certain kinds of sums like this natively.
Now that we're getting BoxedRep, I think we can probably make it happen. The trick is to add a special Levity constructor representing sums of particular shapes. Specifically, we can represent a type like this if it is a possibly-nested sum which, when flattened into a single sum, consists of some number of nullary tuples and at most one Lifted or Unlifted type. Then we can have (inline) primops to convert between the BoxedRep and the sum-of-sums representations.
Anyone have thoughts on details for what the Levity constructor arguments might look like?
-- -Andrew Thaddeus Martin

From a implementors perspective my main questions would be: * How big is the benefit in practice? How many use cases are there? * How bad are the costs? (Runtime overhead, rts complexity, ...) The details of how this would be exposed to a user would are important. But if the costs are too high for the drawbacks then it becomes a moot point. David Feuer schrieb am 14.10.2020 um 22:21:
Forwarded from Andrew Martin below. I think we want more than just Maybe (more than one null), but the nesting I described is certainly more convenience than necessity.
---------- Forwarded message --------- From: *Andrew Martin*
mailto:andrew.thaddeus@gmail.com> Date: Wed, Oct 14, 2020, 4:14 PM Subject: Re: Restricted sums in BoxedRep To: David Feuer mailto:david.feuer@gmail.com> You'll have to forward this to the ghc-devs list to share it with others since I'm not currently subscribed to it, but I've had this same thought before. It is discussed at https://github.com/andrewthad/impure-containers/issues/12. Here's the relevant excerpt:
Relatedly, I was thinking the other day that after finishing implementing https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0203-po..., I should really look at seeing if it's possible to add this maybe-of-a-lifted value trick straight to GHC. I think that with:
|data RuntimpRep = BoxedRep Levity | MaybeBoxedRep Levity | IntRep | ... data BuiltinMaybe :: forall (v :: Levity). TYPE v -> TYPE ('MaybeBoxedRep v) |
This doesn't have the nesting issues because the kind system prevents nesting. But anyway, back to the original question. I would recommend not using |Maybe.Unsafe| and using |unpacked-maybe| instead. The latter is definitely safe, and it only costs an extra machine word of space in each data constructor it gets used in, and it doesn't introduce more indirections.
On Tue, Oct 13, 2020 at 5:47 PM David Feuer
mailto:david.feuer@gmail.com> wrote: Null pointers are widely known to be a lousy language feature in general, but there are certain situations where they're *really* useful for compact representation. For example, we define
newtype TMVar a = TMVar (TVar (Maybe a))
We don't, however, actually use the fact that (Maybe a) is lifted. So we could represent this much more efficiently using something like
newtype TMVar a = TMVar (TVar a)
where Nothing is represented by a distinguished "null" pointer.
While it's possible to implement this sort of thing in user code (with lots of fuss and care), it's not very nice at all. What I'd really like to be able to do is represent certain kinds of sums like this natively.
Now that we're getting BoxedRep, I think we can probably make it happen. The trick is to add a special Levity constructor representing sums of particular shapes. Specifically, we can represent a type like this if it is a possibly-nested sum which, when flattened into a single sum, consists of some number of nullary tuples and at most one Lifted or Unlifted type. Then we can have (inline) primops to convert between the BoxedRep and the sum-of-sums representations.
Anyone have thoughts on details for what the Levity constructor arguments might look like?
-- -Andrew Thaddeus Martin
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

A related idea that came up recently and is perhaps simpler ties into this
via the lens of having unboxed Mvars/tvars (even if it’s restricted to just
things we can embed in a word64#)
This came up in
https://gitlab.haskell.org/ghc/ghc/-/issues/18798#note_307410, where viktor
had millions of independent mvars holding what’s essentially a strict unit
()!
The motivation in this later scenario is that in high concurrency settings,
the less trivial stuff the gc needs to trace under updates, the better ghc
scales.
This may not be a use case david has in mind, but certainly seems related.
Phrased more succinctly: gc perf dominates large heap / many core
computation in Haskell via sensitivity to allocation volume / mutation
volume (to ensure generational hypothesis stays valid), and providing tools
to incrementally reduce the pressure with local changes would be good.
So I’d propose / suggest that a baby step towards what david asks would be
for us to work out some manner of unboxed tvar/mvar ref machinery that
supports unboxed values.
On Thu, Oct 15, 2020 at 5:32 AM Andreas Klebinger
From a implementors perspective my main questions would be:
* How big is the benefit in practice? How many use cases are there? * How bad are the costs? (Runtime overhead, rts complexity, ...)
The details of how this would be exposed to a user would are important. But if the costs are too high for the drawbacks then it becomes a moot point.
David Feuer schrieb am 14.10.2020 um 22:21:
Forwarded from Andrew Martin below. I think we want more than just Maybe (more than one null), but the nesting I described is certainly more convenience than necessity.
---------- Forwarded message --------- From: Andrew Martin
Date: Wed, Oct 14, 2020, 4:14 PM Subject: Re: Restricted sums in BoxedRep To: David Feuer You'll have to forward this to the ghc-devs list to share it with others since I'm not currently subscribed to it, but I've had this same thought before. It is discussed at https://github.com/andrewthad/impure-containers/issues/12. Here's the relevant excerpt:
Relatedly, I was thinking the other day that after finishing implementing https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0203-po..., I should really look at seeing if it's possible to add this maybe-of-a-lifted value trick straight to GHC. I think that with:
data RuntimpRep = BoxedRep Levity | MaybeBoxedRep Levity | IntRep | ...
data BuiltinMaybe :: forall (v :: Levity). TYPE v -> TYPE ('MaybeBoxedRep v)
This doesn't have the nesting issues because the kind system prevents nesting. But anyway, back to the original question. I would recommend not using Maybe.Unsafe and using unpacked-maybe instead. The latter is definitely safe, and it only costs an extra machine word of space in each data constructor it gets used in, and it doesn't introduce more indirections.
On Tue, Oct 13, 2020 at 5:47 PM David Feuer
wrote: Null pointers are widely known to be a lousy language feature in general, but there are certain situations where they're *really* useful for compact representation. For example, we define
newtype TMVar a = TMVar (TVar (Maybe a))
We don't, however, actually use the fact that (Maybe a) is lifted. So we could represent this much more efficiently using something like
newtype TMVar a = TMVar (TVar a)
where Nothing is represented by a distinguished "null" pointer.
While it's possible to implement this sort of thing in user code (with lots of fuss and care), it's not very nice at all. What I'd really like to be able to do is represent certain kinds of sums like this natively.
Now that we're getting BoxedRep, I think we can probably make it happen. The trick is to add a special Levity constructor representing sums of particular shapes. Specifically, we can represent a type like this if it is a possibly-nested sum which, when flattened into a single sum, consists of some number of nullary tuples and at most one Lifted or Unlifted type. Then we can have (inline) primops to convert between the BoxedRep and the sum-of-sums representations.
Anyone have thoughts on details for what the Levity constructor arguments might look like?
-- -Andrew Thaddeus Martin
_______________________________________________ ghc-devs mailing listghc-devs@haskell.orghttp://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Putting unboxed things in TVar, MVar, etc., is part of Andrew Martin's
accepted BoxedRep proposal.
On Thu, Oct 15, 2020, 11:44 AM Carter Schonwald
A related idea that came up recently and is perhaps simpler ties into this via the lens of having unboxed Mvars/tvars (even if it’s restricted to just things we can embed in a word64#)
This came up in https://gitlab.haskell.org/ghc/ghc/-/issues/18798#note_307410, where viktor had millions of independent mvars holding what’s essentially a strict unit ()!
The motivation in this later scenario is that in high concurrency settings, the less trivial stuff the gc needs to trace under updates, the better ghc scales.
This may not be a use case david has in mind, but certainly seems related.
Phrased more succinctly: gc perf dominates large heap / many core computation in Haskell via sensitivity to allocation volume / mutation volume (to ensure generational hypothesis stays valid), and providing tools to incrementally reduce the pressure with local changes would be good.
So I’d propose / suggest that a baby step towards what david asks would be for us to work out some manner of unboxed tvar/mvar ref machinery that supports unboxed values.
On Thu, Oct 15, 2020 at 5:32 AM Andreas Klebinger < klebinger.andreas@gmx.at> wrote:
From a implementors perspective my main questions would be:
* How big is the benefit in practice? How many use cases are there? * How bad are the costs? (Runtime overhead, rts complexity, ...)
The details of how this would be exposed to a user would are important. But if the costs are too high for the drawbacks then it becomes a moot point.
David Feuer schrieb am 14.10.2020 um 22:21:
Forwarded from Andrew Martin below. I think we want more than just Maybe (more than one null), but the nesting I described is certainly more convenience than necessity.
---------- Forwarded message --------- From: Andrew Martin
Date: Wed, Oct 14, 2020, 4:14 PM Subject: Re: Restricted sums in BoxedRep To: David Feuer You'll have to forward this to the ghc-devs list to share it with others since I'm not currently subscribed to it, but I've had this same thought before. It is discussed at https://github.com/andrewthad/impure-containers/issues/12. Here's the relevant excerpt:
Relatedly, I was thinking the other day that after finishing implementing https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0203-po..., I should really look at seeing if it's possible to add this maybe-of-a-lifted value trick straight to GHC. I think that with:
data RuntimpRep = BoxedRep Levity | MaybeBoxedRep Levity | IntRep | ...
data BuiltinMaybe :: forall (v :: Levity). TYPE v -> TYPE ('MaybeBoxedRep v)
This doesn't have the nesting issues because the kind system prevents nesting. But anyway, back to the original question. I would recommend not using Maybe.Unsafe and using unpacked-maybe instead. The latter is definitely safe, and it only costs an extra machine word of space in each data constructor it gets used in, and it doesn't introduce more indirections.
On Tue, Oct 13, 2020 at 5:47 PM David Feuer
wrote: Null pointers are widely known to be a lousy language feature in general, but there are certain situations where they're *really* useful for compact representation. For example, we define
newtype TMVar a = TMVar (TVar (Maybe a))
We don't, however, actually use the fact that (Maybe a) is lifted. So we could represent this much more efficiently using something like
newtype TMVar a = TMVar (TVar a)
where Nothing is represented by a distinguished "null" pointer.
While it's possible to implement this sort of thing in user code (with lots of fuss and care), it's not very nice at all. What I'd really like to be able to do is represent certain kinds of sums like this natively.
Now that we're getting BoxedRep, I think we can probably make it happen. The trick is to add a special Levity constructor representing sums of particular shapes. Specifically, we can represent a type like this if it is a possibly-nested sum which, when flattened into a single sum, consists of some number of nullary tuples and at most one Lifted or Unlifted type. Then we can have (inline) primops to convert between the BoxedRep and the sum-of-sums representations.
Anyone have thoughts on details for what the Levity constructor arguments might look like?
-- -Andrew Thaddeus Martin
_______________________________________________ ghc-devs mailing listghc-devs@haskell.orghttp://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Sorry, unlifted, not unboxed...
On Thu, Oct 15, 2020, 11:57 AM David Feuer
Putting unboxed things in TVar, MVar, etc., is part of Andrew Martin's accepted BoxedRep proposal.
On Thu, Oct 15, 2020, 11:44 AM Carter Schonwald < carter.schonwald@gmail.com> wrote:
A related idea that came up recently and is perhaps simpler ties into this via the lens of having unboxed Mvars/tvars (even if it’s restricted to just things we can embed in a word64#)
This came up in https://gitlab.haskell.org/ghc/ghc/-/issues/18798#note_307410, where viktor had millions of independent mvars holding what’s essentially a strict unit ()!
The motivation in this later scenario is that in high concurrency settings, the less trivial stuff the gc needs to trace under updates, the better ghc scales.
This may not be a use case david has in mind, but certainly seems related.
Phrased more succinctly: gc perf dominates large heap / many core computation in Haskell via sensitivity to allocation volume / mutation volume (to ensure generational hypothesis stays valid), and providing tools to incrementally reduce the pressure with local changes would be good.
So I’d propose / suggest that a baby step towards what david asks would be for us to work out some manner of unboxed tvar/mvar ref machinery that supports unboxed values.
On Thu, Oct 15, 2020 at 5:32 AM Andreas Klebinger < klebinger.andreas@gmx.at> wrote:
From a implementors perspective my main questions would be:
* How big is the benefit in practice? How many use cases are there? * How bad are the costs? (Runtime overhead, rts complexity, ...)
The details of how this would be exposed to a user would are important. But if the costs are too high for the drawbacks then it becomes a moot point.
David Feuer schrieb am 14.10.2020 um 22:21:
Forwarded from Andrew Martin below. I think we want more than just Maybe (more than one null), but the nesting I described is certainly more convenience than necessity.
---------- Forwarded message --------- From: Andrew Martin
Date: Wed, Oct 14, 2020, 4:14 PM Subject: Re: Restricted sums in BoxedRep To: David Feuer You'll have to forward this to the ghc-devs list to share it with others since I'm not currently subscribed to it, but I've had this same thought before. It is discussed at https://github.com/andrewthad/impure-containers/issues/12. Here's the relevant excerpt:
Relatedly, I was thinking the other day that after finishing implementing https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0203-po..., I should really look at seeing if it's possible to add this maybe-of-a-lifted value trick straight to GHC. I think that with:
data RuntimpRep = BoxedRep Levity | MaybeBoxedRep Levity | IntRep | ...
data BuiltinMaybe :: forall (v :: Levity). TYPE v -> TYPE ('MaybeBoxedRep v)
This doesn't have the nesting issues because the kind system prevents nesting. But anyway, back to the original question. I would recommend not using Maybe.Unsafe and using unpacked-maybe instead. The latter is definitely safe, and it only costs an extra machine word of space in each data constructor it gets used in, and it doesn't introduce more indirections.
On Tue, Oct 13, 2020 at 5:47 PM David Feuer
wrote: Null pointers are widely known to be a lousy language feature in general, but there are certain situations where they're *really* useful for compact representation. For example, we define
newtype TMVar a = TMVar (TVar (Maybe a))
We don't, however, actually use the fact that (Maybe a) is lifted. So we could represent this much more efficiently using something like
newtype TMVar a = TMVar (TVar a)
where Nothing is represented by a distinguished "null" pointer.
While it's possible to implement this sort of thing in user code (with lots of fuss and care), it's not very nice at all. What I'd really like to be able to do is represent certain kinds of sums like this natively.
Now that we're getting BoxedRep, I think we can probably make it happen. The trick is to add a special Levity constructor representing sums of particular shapes. Specifically, we can represent a type like this if it is a possibly-nested sum which, when flattened into a single sum, consists of some number of nullary tuples and at most one Lifted or Unlifted type. Then we can have (inline) primops to convert between the BoxedRep and the sum-of-sums representations.
Anyone have thoughts on details for what the Levity constructor arguments might look like?
-- -Andrew Thaddeus Martin
_______________________________________________ ghc-devs mailing listghc-devs@haskell.orghttp://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Indeed, I mean things that aren’t pointery, and could be represented by a
tvar paired with a mutable byte array or mvar with mutable byte array, but
which we’d want considered as a single heap object from the rts/gc
perspective.
On Thu, Oct 15, 2020 at 11:58 AM David Feuer
Sorry, unlifted, not unboxed...
On Thu, Oct 15, 2020, 11:57 AM David Feuer
wrote: Putting unboxed things in TVar, MVar, etc., is part of Andrew Martin's accepted BoxedRep proposal.
On Thu, Oct 15, 2020, 11:44 AM Carter Schonwald < carter.schonwald@gmail.com> wrote:
A related idea that came up recently and is perhaps simpler ties into this via the lens of having unboxed Mvars/tvars (even if it’s restricted to just things we can embed in a word64#)
This came up in https://gitlab.haskell.org/ghc/ghc/-/issues/18798#note_307410, where viktor had millions of independent mvars holding what’s essentially a strict unit ()!
The motivation in this later scenario is that in high concurrency settings, the less trivial stuff the gc needs to trace under updates, the better ghc scales.
This may not be a use case david has in mind, but certainly seems related.
Phrased more succinctly: gc perf dominates large heap / many core computation in Haskell via sensitivity to allocation volume / mutation volume (to ensure generational hypothesis stays valid), and providing tools to incrementally reduce the pressure with local changes would be good.
So I’d propose / suggest that a baby step towards what david asks would be for us to work out some manner of unboxed tvar/mvar ref machinery that supports unboxed values.
On Thu, Oct 15, 2020 at 5:32 AM Andreas Klebinger < klebinger.andreas@gmx.at> wrote:
From a implementors perspective my main questions would be:
* How big is the benefit in practice? How many use cases are there? * How bad are the costs? (Runtime overhead, rts complexity, ...)
The details of how this would be exposed to a user would are important. But if the costs are too high for the drawbacks then it becomes a moot point.
David Feuer schrieb am 14.10.2020 um 22:21:
Forwarded from Andrew Martin below. I think we want more than just Maybe (more than one null), but the nesting I described is certainly more convenience than necessity.
---------- Forwarded message --------- From: Andrew Martin
Date: Wed, Oct 14, 2020, 4:14 PM Subject: Re: Restricted sums in BoxedRep To: David Feuer You'll have to forward this to the ghc-devs list to share it with others since I'm not currently subscribed to it, but I've had this same thought before. It is discussed at https://github.com/andrewthad/impure-containers/issues/12. Here's the relevant excerpt:
Relatedly, I was thinking the other day that after finishing implementing https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0203-po..., I should really look at seeing if it's possible to add this maybe-of-a-lifted value trick straight to GHC. I think that with:
data RuntimpRep = BoxedRep Levity | MaybeBoxedRep Levity | IntRep | ...
data BuiltinMaybe :: forall (v :: Levity). TYPE v -> TYPE ('MaybeBoxedRep v)
This doesn't have the nesting issues because the kind system prevents nesting. But anyway, back to the original question. I would recommend not using Maybe.Unsafe and using unpacked-maybe instead. The latter is definitely safe, and it only costs an extra machine word of space in each data constructor it gets used in, and it doesn't introduce more indirections.
On Tue, Oct 13, 2020 at 5:47 PM David Feuer
wrote: Null pointers are widely known to be a lousy language feature in general, but there are certain situations where they're *really* useful for compact representation. For example, we define
newtype TMVar a = TMVar (TVar (Maybe a))
We don't, however, actually use the fact that (Maybe a) is lifted. So we could represent this much more efficiently using something like
newtype TMVar a = TMVar (TVar a)
where Nothing is represented by a distinguished "null" pointer.
While it's possible to implement this sort of thing in user code (with lots of fuss and care), it's not very nice at all. What I'd really like to be able to do is represent certain kinds of sums like this natively.
Now that we're getting BoxedRep, I think we can probably make it happen. The trick is to add a special Levity constructor representing sums of particular shapes. Specifically, we can represent a type like this if it is a possibly-nested sum which, when flattened into a single sum, consists of some number of nullary tuples and at most one Lifted or Unlifted type. Then we can have (inline) primops to convert between the BoxedRep and the sum-of-sums representations.
Anyone have thoughts on details for what the Levity constructor arguments might look like?
-- -Andrew Thaddeus Martin
_______________________________________________ ghc-devs mailing listghc-devs@haskell.orghttp://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Yes, that's something quite different. We'd need a whole different heap
object type for such MVars and TVars. My approach covers the case where the
unboxed thing can only take on a few values, for some value of "a few"
which, depending on implementation, may or may not be very small. If the
nulls point to actual heap objects in pre-allocated pinned memory (say),
then up to 64 or so might be reasonable. If they point to "invalid" address
space, then the numbers could go up a good bit.
On Thu, Oct 15, 2020, 12:50 PM Carter Schonwald
Indeed, I mean things that aren’t pointery, and could be represented by a tvar paired with a mutable byte array or mvar with mutable byte array, but which we’d want considered as a single heap object from the rts/gc perspective.
On Thu, Oct 15, 2020 at 11:58 AM David Feuer
wrote: Sorry, unlifted, not unboxed...
On Thu, Oct 15, 2020, 11:57 AM David Feuer
wrote: Putting unboxed things in TVar, MVar, etc., is part of Andrew Martin's accepted BoxedRep proposal.
On Thu, Oct 15, 2020, 11:44 AM Carter Schonwald < carter.schonwald@gmail.com> wrote:
A related idea that came up recently and is perhaps simpler ties into this via the lens of having unboxed Mvars/tvars (even if it’s restricted to just things we can embed in a word64#)
This came up in https://gitlab.haskell.org/ghc/ghc/-/issues/18798#note_307410, where viktor had millions of independent mvars holding what’s essentially a strict unit ()!
The motivation in this later scenario is that in high concurrency settings, the less trivial stuff the gc needs to trace under updates, the better ghc scales.
This may not be a use case david has in mind, but certainly seems related.
Phrased more succinctly: gc perf dominates large heap / many core computation in Haskell via sensitivity to allocation volume / mutation volume (to ensure generational hypothesis stays valid), and providing tools to incrementally reduce the pressure with local changes would be good.
So I’d propose / suggest that a baby step towards what david asks would be for us to work out some manner of unboxed tvar/mvar ref machinery that supports unboxed values.
On Thu, Oct 15, 2020 at 5:32 AM Andreas Klebinger < klebinger.andreas@gmx.at> wrote:
From a implementors perspective my main questions would be:
* How big is the benefit in practice? How many use cases are there? * How bad are the costs? (Runtime overhead, rts complexity, ...)
The details of how this would be exposed to a user would are important. But if the costs are too high for the drawbacks then it becomes a moot point.
David Feuer schrieb am 14.10.2020 um 22:21:
Forwarded from Andrew Martin below. I think we want more than just Maybe (more than one null), but the nesting I described is certainly more convenience than necessity.
---------- Forwarded message --------- From: Andrew Martin
Date: Wed, Oct 14, 2020, 4:14 PM Subject: Re: Restricted sums in BoxedRep To: David Feuer You'll have to forward this to the ghc-devs list to share it with others since I'm not currently subscribed to it, but I've had this same thought before. It is discussed at https://github.com/andrewthad/impure-containers/issues/12. Here's the relevant excerpt:
Relatedly, I was thinking the other day that after finishing implementing https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0203-po..., I should really look at seeing if it's possible to add this maybe-of-a-lifted value trick straight to GHC. I think that with:
data RuntimpRep = BoxedRep Levity | MaybeBoxedRep Levity | IntRep | ...
data BuiltinMaybe :: forall (v :: Levity). TYPE v -> TYPE ('MaybeBoxedRep v)
This doesn't have the nesting issues because the kind system prevents nesting. But anyway, back to the original question. I would recommend not using Maybe.Unsafe and using unpacked-maybe instead. The latter is definitely safe, and it only costs an extra machine word of space in each data constructor it gets used in, and it doesn't introduce more indirections.
On Tue, Oct 13, 2020 at 5:47 PM David Feuer
wrote: Null pointers are widely known to be a lousy language feature in general, but there are certain situations where they're *really* useful for compact representation. For example, we define
newtype TMVar a = TMVar (TVar (Maybe a))
We don't, however, actually use the fact that (Maybe a) is lifted. So we could represent this much more efficiently using something like
newtype TMVar a = TMVar (TVar a)
where Nothing is represented by a distinguished "null" pointer.
While it's possible to implement this sort of thing in user code (with lots of fuss and care), it's not very nice at all. What I'd really like to be able to do is represent certain kinds of sums like this natively.
Now that we're getting BoxedRep, I think we can probably make it happen. The trick is to add a special Levity constructor representing sums of particular shapes. Specifically, we can represent a type like this if it is a possibly-nested sum which, when flattened into a single sum, consists of some number of nullary tuples and at most one Lifted or Unlifted type. Then we can have (inline) primops to convert between the BoxedRep and the sum-of-sums representations.
Anyone have thoughts on details for what the Levity constructor arguments might look like?
-- -Andrew Thaddeus Martin
_______________________________________________ ghc-devs mailing listghc-devs@haskell.orghttp://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

I haven't been following this discussion too closely, but in my research
work I found that some of the benefits that I wanted in this direction were
already there with pointer tagging.
On Thu, Oct 15, 2020 at 12:59 PM David Feuer
Yes, that's something quite different. We'd need a whole different heap object type for such MVars and TVars. My approach covers the case where the unboxed thing can only take on a few values, for some value of "a few" which, depending on implementation, may or may not be very small. If the nulls point to actual heap objects in pre-allocated pinned memory (say), then up to 64 or so might be reasonable. If they point to "invalid" address space, then the numbers could go up a good bit.
On Thu, Oct 15, 2020, 12:50 PM Carter Schonwald < carter.schonwald@gmail.com> wrote:
Indeed, I mean things that aren’t pointery, and could be represented by a tvar paired with a mutable byte array or mvar with mutable byte array, but which we’d want considered as a single heap object from the rts/gc perspective.
On Thu, Oct 15, 2020 at 11:58 AM David Feuer
wrote: Sorry, unlifted, not unboxed...
On Thu, Oct 15, 2020, 11:57 AM David Feuer
wrote: Putting unboxed things in TVar, MVar, etc., is part of Andrew Martin's accepted BoxedRep proposal.
On Thu, Oct 15, 2020, 11:44 AM Carter Schonwald < carter.schonwald@gmail.com> wrote:
A related idea that came up recently and is perhaps simpler ties into this via the lens of having unboxed Mvars/tvars (even if it’s restricted to just things we can embed in a word64#)
This came up in https://gitlab.haskell.org/ghc/ghc/-/issues/18798#note_307410, where viktor had millions of independent mvars holding what’s essentially a strict unit ()!
The motivation in this later scenario is that in high concurrency settings, the less trivial stuff the gc needs to trace under updates, the better ghc scales.
This may not be a use case david has in mind, but certainly seems related.
Phrased more succinctly: gc perf dominates large heap / many core computation in Haskell via sensitivity to allocation volume / mutation volume (to ensure generational hypothesis stays valid), and providing tools to incrementally reduce the pressure with local changes would be good.
So I’d propose / suggest that a baby step towards what david asks would be for us to work out some manner of unboxed tvar/mvar ref machinery that supports unboxed values.
On Thu, Oct 15, 2020 at 5:32 AM Andreas Klebinger < klebinger.andreas@gmx.at> wrote:
From a implementors perspective my main questions would be:
* How big is the benefit in practice? How many use cases are there? * How bad are the costs? (Runtime overhead, rts complexity, ...)
The details of how this would be exposed to a user would are important. But if the costs are too high for the drawbacks then it becomes a moot point.
David Feuer schrieb am 14.10.2020 um 22:21:
Forwarded from Andrew Martin below. I think we want more than just Maybe (more than one null), but the nesting I described is certainly more convenience than necessity.
---------- Forwarded message --------- From: Andrew Martin
Date: Wed, Oct 14, 2020, 4:14 PM Subject: Re: Restricted sums in BoxedRep To: David Feuer You'll have to forward this to the ghc-devs list to share it with others since I'm not currently subscribed to it, but I've had this same thought before. It is discussed at https://github.com/andrewthad/impure-containers/issues/12. Here's the relevant excerpt:
> Relatedly, I was thinking the other day that after finishing > implementing > https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0203-po..., > I should really look at seeing if it's possible to add this > maybe-of-a-lifted value trick straight to GHC. I think that with: > > data RuntimpRep > = BoxedRep Levity > | MaybeBoxedRep Levity > | IntRep > | ... > > data BuiltinMaybe :: forall (v :: Levity). TYPE v -> TYPE ('MaybeBoxedRep v) > > This doesn't have the nesting issues because the kind system > prevents nesting. But anyway, back to the original question. I would > recommend not using Maybe.Unsafe and using unpacked-maybe instead. > The latter is definitely safe, and it only costs an extra machine word of > space in each data constructor it gets used in, and it doesn't introduce > more indirections. >
On Tue, Oct 13, 2020 at 5:47 PM David Feuer
wrote: > Null pointers are widely known to be a lousy language feature in > general, but there are certain situations where they're *really* useful for > compact representation. For example, we define > > newtype TMVar a = TMVar (TVar (Maybe a)) > > We don't, however, actually use the fact that (Maybe a) is lifted. > So we could represent this much more efficiently using something like > > newtype TMVar a = TMVar (TVar a) > > where Nothing is represented by a distinguished "null" pointer. > > While it's possible to implement this sort of thing in user code > (with lots of fuss and care), it's not very nice at all. What I'd really > like to be able to do is represent certain kinds of sums like this natively. > > Now that we're getting BoxedRep, I think we can probably make it > happen. The trick is to add a special Levity constructor representing sums > of particular shapes. Specifically, we can represent a type like this if it > is a possibly-nested sum which, when flattened into a single sum, consists > of some number of nullary tuples and at most one Lifted or Unlifted type. > Then we can have (inline) primops to convert between the BoxedRep and the > sum-of-sums representations. > > Anyone have thoughts on details for what the Levity constructor > arguments might look like? >
-- -Andrew Thaddeus Martin
_______________________________________________ ghc-devs mailing listghc-devs@haskell.orghttp://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

This might be lost in the noise for MVar and TVar, but for arrays, there
are tremendous advantages to cutting out the extra heap objects.
On Thu, Oct 15, 2020, 1:10 PM Ryan Yates
I haven't been following this discussion too closely, but in my research work I found that some of the benefits that I wanted in this direction were already there with pointer tagging.
On Thu, Oct 15, 2020 at 12:59 PM David Feuer
wrote: Yes, that's something quite different. We'd need a whole different heap object type for such MVars and TVars. My approach covers the case where the unboxed thing can only take on a few values, for some value of "a few" which, depending on implementation, may or may not be very small. If the nulls point to actual heap objects in pre-allocated pinned memory (say), then up to 64 or so might be reasonable. If they point to "invalid" address space, then the numbers could go up a good bit.
On Thu, Oct 15, 2020, 12:50 PM Carter Schonwald < carter.schonwald@gmail.com> wrote:
Indeed, I mean things that aren’t pointery, and could be represented by a tvar paired with a mutable byte array or mvar with mutable byte array, but which we’d want considered as a single heap object from the rts/gc perspective.
On Thu, Oct 15, 2020 at 11:58 AM David Feuer
wrote: Sorry, unlifted, not unboxed...
On Thu, Oct 15, 2020, 11:57 AM David Feuer
wrote: Putting unboxed things in TVar, MVar, etc., is part of Andrew Martin's accepted BoxedRep proposal.
On Thu, Oct 15, 2020, 11:44 AM Carter Schonwald < carter.schonwald@gmail.com> wrote:
A related idea that came up recently and is perhaps simpler ties into this via the lens of having unboxed Mvars/tvars (even if it’s restricted to just things we can embed in a word64#)
This came up in https://gitlab.haskell.org/ghc/ghc/-/issues/18798#note_307410, where viktor had millions of independent mvars holding what’s essentially a strict unit ()!
The motivation in this later scenario is that in high concurrency settings, the less trivial stuff the gc needs to trace under updates, the better ghc scales.
This may not be a use case david has in mind, but certainly seems related.
Phrased more succinctly: gc perf dominates large heap / many core computation in Haskell via sensitivity to allocation volume / mutation volume (to ensure generational hypothesis stays valid), and providing tools to incrementally reduce the pressure with local changes would be good.
So I’d propose / suggest that a baby step towards what david asks would be for us to work out some manner of unboxed tvar/mvar ref machinery that supports unboxed values.
On Thu, Oct 15, 2020 at 5:32 AM Andreas Klebinger < klebinger.andreas@gmx.at> wrote:
> From a implementors perspective my main questions would be: > > * How big is the benefit in practice? How many use cases are there? > * How bad are the costs? (Runtime overhead, rts complexity, ...) > > The details of how this would be exposed to a user would are > important. > But if the costs are too high for the drawbacks then it becomes a > moot point. > > > David Feuer schrieb am 14.10.2020 um 22:21: > > Forwarded from Andrew Martin below. I think we want more than just > Maybe (more than one null), but the nesting I described is certainly more > convenience than necessity. > > ---------- Forwarded message --------- > From: Andrew Martin
> Date: Wed, Oct 14, 2020, 4:14 PM > Subject: Re: Restricted sums in BoxedRep > To: David Feuer > > > You'll have to forward this to the ghc-devs list to share it with > others since I'm not currently subscribed to it, but I've had this same > thought before. It is discussed at > https://github.com/andrewthad/impure-containers/issues/12. Here's > the relevant excerpt: > >> Relatedly, I was thinking the other day that after finishing >> implementing >> https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0203-po..., >> I should really look at seeing if it's possible to add this >> maybe-of-a-lifted value trick straight to GHC. I think that with: >> >> data RuntimpRep >> = BoxedRep Levity >> | MaybeBoxedRep Levity >> | IntRep >> | ... >> >> data BuiltinMaybe :: forall (v :: Levity). TYPE v -> TYPE ('MaybeBoxedRep v) >> >> This doesn't have the nesting issues because the kind system >> prevents nesting. But anyway, back to the original question. I would >> recommend not using Maybe.Unsafe and using unpacked-maybe instead. >> The latter is definitely safe, and it only costs an extra machine word of >> space in each data constructor it gets used in, and it doesn't introduce >> more indirections. >> > > On Tue, Oct 13, 2020 at 5:47 PM David Feuer > wrote: > >> Null pointers are widely known to be a lousy language feature in >> general, but there are certain situations where they're *really* useful for >> compact representation. For example, we define >> >> newtype TMVar a = TMVar (TVar (Maybe a)) >> >> We don't, however, actually use the fact that (Maybe a) is lifted. >> So we could represent this much more efficiently using something like >> >> newtype TMVar a = TMVar (TVar a) >> >> where Nothing is represented by a distinguished "null" pointer. >> >> While it's possible to implement this sort of thing in user code >> (with lots of fuss and care), it's not very nice at all. What I'd really >> like to be able to do is represent certain kinds of sums like this natively. >> >> Now that we're getting BoxedRep, I think we can probably make it >> happen. The trick is to add a special Levity constructor representing sums >> of particular shapes. Specifically, we can represent a type like this if it >> is a possibly-nested sum which, when flattened into a single sum, consists >> of some number of nullary tuples and at most one Lifted or Unlifted type. >> Then we can have (inline) primops to convert between the BoxedRep and the >> sum-of-sums representations. >> >> Anyone have thoughts on details for what the Levity constructor >> arguments might look like? >> > > > -- > -Andrew Thaddeus Martin > > > _______________________________________________ > ghc-devs mailing listghc-devs@haskell.orghttp://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > > _______________________________________________ > ghc-devs mailing list > ghc-devs@haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Ah yes. In my research work I extended Mutable Constructor fields to allow
a mutable array field to help with that problem. This allowed me to have a
Nil constructor in a sum type and a Node constructor with normal fields as
well as an array of mutable fields. Pointer tagging worked as expected so
a case match on a dereference of a field would not need to follow the
pointer and no intermediate objects were between Node objects.
On Thu, Oct 15, 2020 at 4:08 PM David Feuer
This might be lost in the noise for MVar and TVar, but for arrays, there are tremendous advantages to cutting out the extra heap objects.
On Thu, Oct 15, 2020, 1:10 PM Ryan Yates
wrote: I haven't been following this discussion too closely, but in my research work I found that some of the benefits that I wanted in this direction were already there with pointer tagging.
On Thu, Oct 15, 2020 at 12:59 PM David Feuer
wrote: Yes, that's something quite different. We'd need a whole different heap object type for such MVars and TVars. My approach covers the case where the unboxed thing can only take on a few values, for some value of "a few" which, depending on implementation, may or may not be very small. If the nulls point to actual heap objects in pre-allocated pinned memory (say), then up to 64 or so might be reasonable. If they point to "invalid" address space, then the numbers could go up a good bit.
On Thu, Oct 15, 2020, 12:50 PM Carter Schonwald < carter.schonwald@gmail.com> wrote:
Indeed, I mean things that aren’t pointery, and could be represented by a tvar paired with a mutable byte array or mvar with mutable byte array, but which we’d want considered as a single heap object from the rts/gc perspective.
On Thu, Oct 15, 2020 at 11:58 AM David Feuer
wrote: Sorry, unlifted, not unboxed...
On Thu, Oct 15, 2020, 11:57 AM David Feuer
wrote: Putting unboxed things in TVar, MVar, etc., is part of Andrew Martin's accepted BoxedRep proposal.
On Thu, Oct 15, 2020, 11:44 AM Carter Schonwald < carter.schonwald@gmail.com> wrote:
> A related idea that came up recently and is perhaps simpler ties > into this via the lens of having unboxed Mvars/tvars (even if it’s > restricted to just things we can embed in a word64#) > > This came up in > https://gitlab.haskell.org/ghc/ghc/-/issues/18798#note_307410, > where viktor had millions of independent mvars holding what’s essentially a > strict unit ()! > > The motivation in this later scenario is that in high concurrency > settings, the less trivial stuff the gc needs to trace under updates, the > better ghc scales. > > This may not be a use case david has in mind, but certainly seems > related. > > Phrased more succinctly: gc perf dominates large heap / many core > computation in Haskell via sensitivity to allocation volume / mutation > volume (to ensure generational hypothesis stays valid), and providing tools > to incrementally reduce the pressure with local changes would be good. > > So I’d propose / suggest that a baby step towards what david asks > would be for us to work out some manner of unboxed tvar/mvar ref machinery > that supports unboxed values. > > > > On Thu, Oct 15, 2020 at 5:32 AM Andreas Klebinger < > klebinger.andreas@gmx.at> wrote: > >> From a implementors perspective my main questions would be: >> >> * How big is the benefit in practice? How many use cases are there? >> * How bad are the costs? (Runtime overhead, rts complexity, ...) >> >> The details of how this would be exposed to a user would are >> important. >> But if the costs are too high for the drawbacks then it becomes a >> moot point. >> >> >> David Feuer schrieb am 14.10.2020 um 22:21: >> >> Forwarded from Andrew Martin below. I think we want more than just >> Maybe (more than one null), but the nesting I described is certainly more >> convenience than necessity. >> >> ---------- Forwarded message --------- >> From: Andrew Martin
>> Date: Wed, Oct 14, 2020, 4:14 PM >> Subject: Re: Restricted sums in BoxedRep >> To: David Feuer >> >> >> You'll have to forward this to the ghc-devs list to share it with >> others since I'm not currently subscribed to it, but I've had this same >> thought before. It is discussed at >> https://github.com/andrewthad/impure-containers/issues/12. Here's >> the relevant excerpt: >> >>> Relatedly, I was thinking the other day that after finishing >>> implementing >>> https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0203-po..., >>> I should really look at seeing if it's possible to add this >>> maybe-of-a-lifted value trick straight to GHC. I think that with: >>> >>> data RuntimpRep >>> = BoxedRep Levity >>> | MaybeBoxedRep Levity >>> | IntRep >>> | ... >>> >>> data BuiltinMaybe :: forall (v :: Levity). TYPE v -> TYPE ('MaybeBoxedRep v) >>> >>> This doesn't have the nesting issues because the kind system >>> prevents nesting. But anyway, back to the original question. I would >>> recommend not using Maybe.Unsafe and using unpacked-maybe >>> instead. The latter is definitely safe, and it only costs an extra machine >>> word of space in each data constructor it gets used in, and it doesn't >>> introduce more indirections. >>> >> >> On Tue, Oct 13, 2020 at 5:47 PM David Feuer >> wrote: >> >>> Null pointers are widely known to be a lousy language feature in >>> general, but there are certain situations where they're *really* useful for >>> compact representation. For example, we define >>> >>> newtype TMVar a = TMVar (TVar (Maybe a)) >>> >>> We don't, however, actually use the fact that (Maybe a) is lifted. >>> So we could represent this much more efficiently using something like >>> >>> newtype TMVar a = TMVar (TVar a) >>> >>> where Nothing is represented by a distinguished "null" pointer. >>> >>> While it's possible to implement this sort of thing in user code >>> (with lots of fuss and care), it's not very nice at all. What I'd really >>> like to be able to do is represent certain kinds of sums like this natively. >>> >>> Now that we're getting BoxedRep, I think we can probably make it >>> happen. The trick is to add a special Levity constructor representing sums >>> of particular shapes. Specifically, we can represent a type like this if it >>> is a possibly-nested sum which, when flattened into a single sum, consists >>> of some number of nullary tuples and at most one Lifted or Unlifted type. >>> Then we can have (inline) primops to convert between the BoxedRep and the >>> sum-of-sums representations. >>> >>> Anyone have thoughts on details for what the Levity constructor >>> arguments might look like? >>> >> >> >> -- >> -Andrew Thaddeus Martin >> >> >> _______________________________________________ >> ghc-devs mailing listghc-devs@haskell.orghttp://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> >> >> _______________________________________________ >> ghc-devs mailing list >> ghc-devs@haskell.org >> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >> > _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Ryan: would an unboxed value Mvar, Eg to write StrictMvar (), avoid
creating extra gc pressure / traversal a?
On Thu, Oct 15, 2020 at 4:23 PM Ryan Yates
Ah yes. In my research work I extended Mutable Constructor fields to allow a mutable array field to help with that problem. This allowed me to have a Nil constructor in a sum type and a Node constructor with normal fields as well as an array of mutable fields. Pointer tagging worked as expected so a case match on a dereference of a field would not need to follow the pointer and no intermediate objects were between Node objects.
On Thu, Oct 15, 2020 at 4:08 PM David Feuer
wrote: This might be lost in the noise for MVar and TVar, but for arrays, there are tremendous advantages to cutting out the extra heap objects.
On Thu, Oct 15, 2020, 1:10 PM Ryan Yates
wrote: I haven't been following this discussion too closely, but in my research work I found that some of the benefits that I wanted in this direction were already there with pointer tagging.
On Thu, Oct 15, 2020 at 12:59 PM David Feuer
wrote: Yes, that's something quite different. We'd need a whole different heap object type for such MVars and TVars. My approach covers the case where the unboxed thing can only take on a few values, for some value of "a few" which, depending on implementation, may or may not be very small. If the nulls point to actual heap objects in pre-allocated pinned memory (say), then up to 64 or so might be reasonable. If they point to "invalid" address space, then the numbers could go up a good bit.
On Thu, Oct 15, 2020, 12:50 PM Carter Schonwald < carter.schonwald@gmail.com> wrote:
Indeed, I mean things that aren’t pointery, and could be represented by a tvar paired with a mutable byte array or mvar with mutable byte array, but which we’d want considered as a single heap object from the rts/gc perspective.
On Thu, Oct 15, 2020 at 11:58 AM David Feuer
wrote: Sorry, unlifted, not unboxed...
On Thu, Oct 15, 2020, 11:57 AM David Feuer
wrote: > Putting unboxed things in TVar, MVar, etc., is part of Andrew > Martin's accepted BoxedRep proposal. > > On Thu, Oct 15, 2020, 11:44 AM Carter Schonwald < > carter.schonwald@gmail.com> wrote: > >> A related idea that came up recently and is perhaps simpler ties >> into this via the lens of having unboxed Mvars/tvars (even if it’s >> restricted to just things we can embed in a word64#) >> >> This came up in >> https://gitlab.haskell.org/ghc/ghc/-/issues/18798#note_307410, >> where viktor had millions of independent mvars holding what’s essentially a >> strict unit ()! >> >> The motivation in this later scenario is that in high concurrency >> settings, the less trivial stuff the gc needs to trace under updates, the >> better ghc scales. >> >> This may not be a use case david has in mind, but certainly seems >> related. >> >> Phrased more succinctly: gc perf dominates large heap / many core >> computation in Haskell via sensitivity to allocation volume / mutation >> volume (to ensure generational hypothesis stays valid), and providing tools >> to incrementally reduce the pressure with local changes would be good. >> >> So I’d propose / suggest that a baby step towards what david asks >> would be for us to work out some manner of unboxed tvar/mvar ref machinery >> that supports unboxed values. >> >> >> >> On Thu, Oct 15, 2020 at 5:32 AM Andreas Klebinger < >> klebinger.andreas@gmx.at> wrote: >> >>> From a implementors perspective my main questions would be: >>> >>> * How big is the benefit in practice? How many use cases are there? >>> * How bad are the costs? (Runtime overhead, rts complexity, ...) >>> >>> The details of how this would be exposed to a user would are >>> important. >>> But if the costs are too high for the drawbacks then it becomes a >>> moot point. >>> >>> >>> David Feuer schrieb am 14.10.2020 um 22:21: >>> >>> Forwarded from Andrew Martin below. I think we want more than just >>> Maybe (more than one null), but the nesting I described is certainly more >>> convenience than necessity. >>> >>> ---------- Forwarded message --------- >>> From: Andrew Martin
>>> Date: Wed, Oct 14, 2020, 4:14 PM >>> Subject: Re: Restricted sums in BoxedRep >>> To: David Feuer >>> >>> >>> You'll have to forward this to the ghc-devs list to share it with >>> others since I'm not currently subscribed to it, but I've had this same >>> thought before. It is discussed at >>> https://github.com/andrewthad/impure-containers/issues/12. Here's >>> the relevant excerpt: >>> >>>> Relatedly, I was thinking the other day that after finishing >>>> implementing >>>> https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0203-po..., >>>> I should really look at seeing if it's possible to add this >>>> maybe-of-a-lifted value trick straight to GHC. I think that with: >>>> >>>> data RuntimpRep >>>> = BoxedRep Levity >>>> | MaybeBoxedRep Levity >>>> | IntRep >>>> | ... >>>> >>>> data BuiltinMaybe :: forall (v :: Levity). TYPE v -> TYPE ('MaybeBoxedRep v) >>>> >>>> This doesn't have the nesting issues because the kind system >>>> prevents nesting. But anyway, back to the original question. I would >>>> recommend not using Maybe.Unsafe and using unpacked-maybe >>>> instead. The latter is definitely safe, and it only costs an extra machine >>>> word of space in each data constructor it gets used in, and it doesn't >>>> introduce more indirections. >>>> >>> >>> On Tue, Oct 13, 2020 at 5:47 PM David Feuer >>> wrote: >>> >>>> Null pointers are widely known to be a lousy language feature in >>>> general, but there are certain situations where they're *really* useful for >>>> compact representation. For example, we define >>>> >>>> newtype TMVar a = TMVar (TVar (Maybe a)) >>>> >>>> We don't, however, actually use the fact that (Maybe a) is >>>> lifted. So we could represent this much more efficiently using something >>>> like >>>> >>>> newtype TMVar a = TMVar (TVar a) >>>> >>>> where Nothing is represented by a distinguished "null" pointer. >>>> >>>> While it's possible to implement this sort of thing in user code >>>> (with lots of fuss and care), it's not very nice at all. What I'd really >>>> like to be able to do is represent certain kinds of sums like this natively. >>>> >>>> Now that we're getting BoxedRep, I think we can probably make it >>>> happen. The trick is to add a special Levity constructor representing sums >>>> of particular shapes. Specifically, we can represent a type like this if it >>>> is a possibly-nested sum which, when flattened into a single sum, consists >>>> of some number of nullary tuples and at most one Lifted or Unlifted type. >>>> Then we can have (inline) primops to convert between the BoxedRep and the >>>> sum-of-sums representations. >>>> >>>> Anyone have thoughts on details for what the Levity constructor >>>> arguments might look like? >>>> >>> >>> >>> -- >>> -Andrew Thaddeus Martin >>> >>> >>> _______________________________________________ >>> ghc-devs mailing listghc-devs@haskell.orghttp://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>> >>> >>> _______________________________________________ >>> ghc-devs mailing list >>> ghc-devs@haskell.org >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>> >> _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

I'm not sure what would lead you to have abundant enough unboxed MVar's to
make a difference with GC. I'm happy to be wrong, but I think most
situations where you want an MVar you want the indirection. If you don't
want the indirection you probably don't want some feature of MVar and
should use something else.
On Thu, Oct 15, 2020 at 4:27 PM Carter Schonwald
Ryan: would an unboxed value Mvar, Eg to write StrictMvar (), avoid creating extra gc pressure / traversal a?
On Thu, Oct 15, 2020 at 4:23 PM Ryan Yates
wrote: Ah yes. In my research work I extended Mutable Constructor fields to allow a mutable array field to help with that problem. This allowed me to have a Nil constructor in a sum type and a Node constructor with normal fields as well as an array of mutable fields. Pointer tagging worked as expected so a case match on a dereference of a field would not need to follow the pointer and no intermediate objects were between Node objects.
On Thu, Oct 15, 2020 at 4:08 PM David Feuer
wrote: This might be lost in the noise for MVar and TVar, but for arrays, there are tremendous advantages to cutting out the extra heap objects.
On Thu, Oct 15, 2020, 1:10 PM Ryan Yates
wrote: I haven't been following this discussion too closely, but in my research work I found that some of the benefits that I wanted in this direction were already there with pointer tagging.
On Thu, Oct 15, 2020 at 12:59 PM David Feuer
wrote: Yes, that's something quite different. We'd need a whole different heap object type for such MVars and TVars. My approach covers the case where the unboxed thing can only take on a few values, for some value of "a few" which, depending on implementation, may or may not be very small. If the nulls point to actual heap objects in pre-allocated pinned memory (say), then up to 64 or so might be reasonable. If they point to "invalid" address space, then the numbers could go up a good bit.
On Thu, Oct 15, 2020, 12:50 PM Carter Schonwald < carter.schonwald@gmail.com> wrote:
Indeed, I mean things that aren’t pointery, and could be represented by a tvar paired with a mutable byte array or mvar with mutable byte array, but which we’d want considered as a single heap object from the rts/gc perspective.
On Thu, Oct 15, 2020 at 11:58 AM David Feuer
wrote: > Sorry, unlifted, not unboxed... > > On Thu, Oct 15, 2020, 11:57 AM David Feuer
> wrote: > >> Putting unboxed things in TVar, MVar, etc., is part of Andrew >> Martin's accepted BoxedRep proposal. >> >> On Thu, Oct 15, 2020, 11:44 AM Carter Schonwald < >> carter.schonwald@gmail.com> wrote: >> >>> A related idea that came up recently and is perhaps simpler ties >>> into this via the lens of having unboxed Mvars/tvars (even if it’s >>> restricted to just things we can embed in a word64#) >>> >>> This came up in >>> https://gitlab.haskell.org/ghc/ghc/-/issues/18798#note_307410, >>> where viktor had millions of independent mvars holding what’s essentially a >>> strict unit ()! >>> >>> The motivation in this later scenario is that in high concurrency >>> settings, the less trivial stuff the gc needs to trace under updates, the >>> better ghc scales. >>> >>> This may not be a use case david has in mind, but certainly seems >>> related. >>> >>> Phrased more succinctly: gc perf dominates large heap / many core >>> computation in Haskell via sensitivity to allocation volume / mutation >>> volume (to ensure generational hypothesis stays valid), and providing tools >>> to incrementally reduce the pressure with local changes would be good. >>> >>> So I’d propose / suggest that a baby step towards what david asks >>> would be for us to work out some manner of unboxed tvar/mvar ref machinery >>> that supports unboxed values. >>> >>> >>> >>> On Thu, Oct 15, 2020 at 5:32 AM Andreas Klebinger < >>> klebinger.andreas@gmx.at> wrote: >>> >>>> From a implementors perspective my main questions would be: >>>> >>>> * How big is the benefit in practice? How many use cases are >>>> there? >>>> * How bad are the costs? (Runtime overhead, rts complexity, ...) >>>> >>>> The details of how this would be exposed to a user would are >>>> important. >>>> But if the costs are too high for the drawbacks then it becomes a >>>> moot point. >>>> >>>> >>>> David Feuer schrieb am 14.10.2020 um 22:21: >>>> >>>> Forwarded from Andrew Martin below. I think we want more than >>>> just Maybe (more than one null), but the nesting I described is certainly >>>> more convenience than necessity. >>>> >>>> ---------- Forwarded message --------- >>>> From: Andrew Martin >>>> Date: Wed, Oct 14, 2020, 4:14 PM >>>> Subject: Re: Restricted sums in BoxedRep >>>> To: David Feuer >>>> >>>> >>>> You'll have to forward this to the ghc-devs list to share it with >>>> others since I'm not currently subscribed to it, but I've had this same >>>> thought before. It is discussed at >>>> https://github.com/andrewthad/impure-containers/issues/12. >>>> Here's the relevant excerpt: >>>> >>>>> Relatedly, I was thinking the other day that after finishing >>>>> implementing >>>>> https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0203-po..., >>>>> I should really look at seeing if it's possible to add this >>>>> maybe-of-a-lifted value trick straight to GHC. I think that with: >>>>> >>>>> data RuntimpRep >>>>> = BoxedRep Levity >>>>> | MaybeBoxedRep Levity >>>>> | IntRep >>>>> | ... >>>>> >>>>> data BuiltinMaybe :: forall (v :: Levity). TYPE v -> TYPE ('MaybeBoxedRep v) >>>>> >>>>> This doesn't have the nesting issues because the kind system >>>>> prevents nesting. But anyway, back to the original question. I would >>>>> recommend not using Maybe.Unsafe and using unpacked-maybe >>>>> instead. The latter is definitely safe, and it only costs an extra machine >>>>> word of space in each data constructor it gets used in, and it doesn't >>>>> introduce more indirections. >>>>> >>>> >>>> On Tue, Oct 13, 2020 at 5:47 PM David Feuer < >>>> david.feuer@gmail.com> wrote: >>>> >>>>> Null pointers are widely known to be a lousy language feature in >>>>> general, but there are certain situations where they're *really* useful for >>>>> compact representation. For example, we define >>>>> >>>>> newtype TMVar a = TMVar (TVar (Maybe a)) >>>>> >>>>> We don't, however, actually use the fact that (Maybe a) is >>>>> lifted. So we could represent this much more efficiently using something >>>>> like >>>>> >>>>> newtype TMVar a = TMVar (TVar a) >>>>> >>>>> where Nothing is represented by a distinguished "null" pointer. >>>>> >>>>> While it's possible to implement this sort of thing in user code >>>>> (with lots of fuss and care), it's not very nice at all. What I'd really >>>>> like to be able to do is represent certain kinds of sums like this natively. >>>>> >>>>> Now that we're getting BoxedRep, I think we can probably make it >>>>> happen. The trick is to add a special Levity constructor representing sums >>>>> of particular shapes. Specifically, we can represent a type like this if it >>>>> is a possibly-nested sum which, when flattened into a single sum, consists >>>>> of some number of nullary tuples and at most one Lifted or Unlifted type. >>>>> Then we can have (inline) primops to convert between the BoxedRep and the >>>>> sum-of-sums representations. >>>>> >>>>> Anyone have thoughts on details for what the Levity constructor >>>>> arguments might look like? >>>>> >>>> >>>> >>>> -- >>>> -Andrew Thaddeus Martin >>>> >>>> >>>> _______________________________________________ >>>> ghc-devs mailing listghc-devs@haskell.orghttp://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>> >>>> >>>> _______________________________________________ >>>> ghc-devs mailing list >>>> ghc-devs@haskell.org >>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>> >>> _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
participants (4)
-
Andreas Klebinger
-
Carter Schonwald
-
David Feuer
-
Ryan Yates