[GHC] #14196: Replace ArrayArray# with either UnliftedArray# or Array#

#14196: Replace ArrayArray# with either UnliftedArray# or Array# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature | Status: new request | Priority: normal | Milestone: Component: Compiler | Version: 8.2.1 Keywords: | Operating System: Unknown/Multiple LevityPolymorphism | Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- Just to be clear, there currently is nothing named `UnliftedArray#`. I would like to propose that such a thing possibly be added, with the long- term goal of deprecating and removing `ArrayArray#`. The interface for it would look like this: {{{#!hs data UnliftedArray# (a :: TYPE 'UnliftedRep) data MutableUnliftedArray# s (a :: TYPE 'UnliftedRep) indexUnliftedArray# :: forall (a :: TYPE 'UnliftedRep). UnliftedArray# a -> Int# -> a writeUnliftedArray# :: forall (a :: TYPE 'UnliftedRep). MutableUnliftedArray# s a -> Int# -> a -> State# s -> State# s readUnliftedArray# :: forall (a :: TYPE 'UnliftedRep). MutableUnliftedArray# s a -> Int# -> State# s -> (# State# s, a #) unsafeFreezeUnliftedArray# :: forall (a :: TYPE 'UnliftedRep). MutableUnliftedArray# s a -> State# s -> (#State# s, UnliftedArray# a#) newUnliftedArray# :: forall (a :: TYPE 'UnliftedRep). Int# -> a -> State# s -> (# State# s, MutableUnliftedArray# s a #) }}} You would also have a few other things like `sameMutableUnliftedArray#`, `sizeofMutableArray#`, `unsafeThawUnliftedArray#`, `copyUnliftedArray#`, etc. The main difficulty that I see in doing this is that I'm not sure if you can tell a primop that it takes a polymorphic argument whose kind is something other than `TYPE LiftedRep`. The bodies of all of the functions I listed above could simply be copied from the `ArrayArray#` functions. There are a few alternatives I've heard discussed. One is to make `Array#` accepted both boxed or unboxed types. There is a brief discussion of this in the UnliftedDataTypes proposal [#point0 (0)]. Indeed, it appears that some have managed to smuggle unlifted data into `Array#` already, with the caveats one would expect [#point1 (1)]. This interface would look like: {{{#!hs data Array# (a :: TYPE k) data MutableArray# s (a :: TYPE k) indexArray# :: Array# a -> Int -> a -- same as before indexUnliftedArray# :: forall (a :: TYPE UnliftedRep). Array# a -> Int -> a }}} So instead of having `Array#` and `UnliftedArray#` as separate data types, we could have `Array#` handle both cases, but we would need to make a duplicate of every function that operates on `Array#`. This follows all of the appropriate rules for when levity polymorphism is allowed, and it should be backwards-compatible with the previous non-levity-polymorphic definition of `Array#`. I'm not sure how many things in the GHC runtime assume that the values inside an array are lifted, so this might cause a bunch of problems. If it is possible, this approach would be kind of neat because then `SmallArray#` could be similarly adapted. Anyway, I just wanted to get this out there. I would be happy to try implementing the first proposal (involving `UnliftedArray#`) at some point in the next six months. Any feedback on the idea would be appreciated. Also, any comments on how appropriate this is for someone new to contributing to GHC would be appreciated as well. Thanks. * [=#point0 (0)] https://ghc.haskell.org/trac/ghc/wiki/UnliftedDataTypes#Parametricity * [=#point0 (1)] https://www.reddit.com/r/haskell/comments/6v9rmg/an_unified_array_interface/... -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14196 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14196: Replace ArrayArray# with either UnliftedArray# or Array# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.1 Resolution: | Keywords: | LevityPolymorphism Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): What is the problem you are trying to solve here? I'm lost. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14196#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14196: Replace ArrayArray# with either UnliftedArray# or Array# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.1 Resolution: | Keywords: | LevityPolymorphism Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by andrewthad): It looks like I forgot to provide the motivation for this. There are two problems with the current `ArrayArray#` interface: (1) Duplicated code and (2) lack of expressiveness, which pushes `unsafeCoerce#` onto end users. I'll go into both of these. Several of the primops for `ArrayArray#` have two variants. For example: {{{#!hs indexByteArrayArray# :: ArrayArray# -> Int# -> ByteArray# indexArrayArrayArray# :: ArrayArray# -> Int# -> ArrayArray# }}} We see the same thing for the read and write operations on `MutableArrayArray#` except that now we've got four variants: {{{#!hs readByteArrayArray# :: MutableArrayArray# s -> Int# -> State# s -> (#State# s, ByteArray##) readMutableByteArrayArray# :: MutableArrayArray# s -> Int# -> State# s -> (#State# s, MutableByteArray# s#) readArrayArrayArray# :: MutableArrayArray# s -> Int# -> State# s -> (#State# s, ArrayArray##) readMutableArrayArrayArray# :: MutableArrayArray# s -> Int# -> State# s -> (#State# s, MutableArrayArray# s#) }}} Under the hood, all four of these have the exact same implementation as we can see in [https://github.com/ghc/ghc/blob/8ae263ceb3566a7c82336400b09cb8f381217405/com...]: {{{#!hs emitPrimOp _ [res] ReadArrayArrayOp_ByteArray [obj,ix] = doReadPtrArrayOp res obj ix emitPrimOp _ [res] ReadArrayArrayOp_MutableByteArray [obj,ix] = doReadPtrArrayOp res obj ix emitPrimOp _ [res] ReadArrayArrayOp_ArrayArray [obj,ix] = doReadPtrArrayOp res obj ix emitPrimOp _ [res] ReadArrayArrayOp_MutableArrayArray [obj,ix] = doReadPtrArrayOp res obj ix }}} I consider this duplication a minor problem. It's not very costly, and it's easy to see what's going on. The real problem is that, despite the all the duplication, this interface still only captures a fraction of what `ArrayArray#` can really offer. The end user must explicitly use `unsafeCoerce#` to do a bunch of things (storing `MVar#`, `Array#`, etc.), and even when they want a `ByteArray#` (which the interface does safely allow), they must implicitly perform an unsafe coercion on every access (read/write/index) because the type system never really tells us what's in an `ArrayArray#`. In the `primitive` package, there is a lot of `unsafeCoerce#` that is required to make it work: [http://hackage.haskell.org/package/primitive-0.6.2.0/docs/src/Data- Primitive-UnliftedArray.html#PrimUnlifted]. Basically, the current interface interface doesn't take advantage of the type system. What we have are: * implicit unsafe coercion on every access * copying functions (`copyArrayArray#`, etc.) that don't ensure that the elements in both arrays are of the same type. * explicit `unsafeCoerce#` required whenever you want to: store `Array# a` inside of `ArrayArray#`, store `MutableByteArray# s` inside of `ArrayArray#`, store `MutVar# s a` inside of `ArrayArray#`, etc. For the most part, the advantages to having `UnliftedArray# a` are similar to the advantages of having `Array#` having the type variable `a`. `Array#` doesn't suffer from any of the aforementioned problems. Imagine if the interface for `Array#` looked like this: {{{#!hs data Array# data MutableArray# s indexArray# :: Array# -> Int -> a readMutableArray# :: MutableArray# s -> Int -> a -> State# s -> State# s }}} This would be terrible, and I'm glad it wasn't done that way. `ArrayArray#` was done that way because prior to GHC 8.0, type variables could only have lifted runtime representations. But now that we've had this for a while, I think it's time to look at cleaning up this interface. Plus, in the future, it might be worth having something like `SmallArray#` but for unlifted data. I wouldn't want to see the same bulky-but- inexpressive interface show up again. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14196#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14196: Replace ArrayArray# with either UnliftedArray# or Array# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.1 Resolution: | Keywords: | LevityPolymorphism Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): OK now I get it. I had to look at [https://downloads.haskell.org/~ghc/7.4.2/docs/html/libraries/ghc- prim-0.2.0.0/GHC-Prim.html#v:indexArrayArrayArray-35- the documentation for ArrayArray#]. If I may say it like this: The current primitive data type `ArrayArray#` is a heap-allocated array of pointers to unlifted objects. Any kind of unlifted objects would be fine, provide they are represented by a pointer. Specifically, we can put both `Array#` and `ByteArray#` (and I suppose another `ArrayArray#`) inside an `ArrayArray#`. But doing so is jolly awkwerd because `ArrayArray#` is not a parameterised type. Why isn't it parameterised? Becuase previously we had no way to quantify over unlifted types. But now we do. So we can make `ArrayArray#` into a paremterised type, namely your new {{{ data UnliftedArray# (a :: TYPE 'UnliftedRep) }}} Your idea here is that `UniftedArray#` is an array of pointers to ''unlifted'' values. Cool, I like it. Levity polymorphism is great. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14196#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14196: Replace ArrayArray# with either UnliftedArray# or Array# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.1 Resolution: | Keywords: | LevityPolymorphism Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by akio): * cc: akio (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14196#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14196: Replace ArrayArray# with either UnliftedArray# or Array# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.1 Resolution: | Keywords: | LevityPolymorphism Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): This would probably make for a good [[https://github.com/ghc-proposals /ghc-proposals|proposal]] since it will affect a relatively large number of users. That being said, given that `UnliftedArray#` is a strict generalization of `ArrayArray#` I expect it won't be *that* hard to expose the existing `ArrayArray#` interface built on top of `UnliftedArray#`. The only annoying issue is that users often import `GHC.Prim` directly, which has no Haskell module associated with it. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14196#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14196: Replace ArrayArray# with either UnliftedArray# or Array# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.1 Resolution: | Keywords: | LevityPolymorphism Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by andrewthad): If `UnliftedNewtypes`, described in https://github.com/ghc-proposals/ghc- proposals/pull/98, is accepted, it will be trivial to reimplement `ArrayArray#`, and its related functions: {{{ newtype ArrayArray# = ArrayArray# (UnliftedArray# ByteArray#) }}} It's not important to use `ByteArray#` here. You could use any unlifted type since you're just going to unsafe coerce it when you pull in out or put it in. I guess these operators would need to go in `GHC.Exts` or something like that, which would probably still break some people's code since we all love to import `GHC.Prim` so much, but at least it provides a good backwards-compatibility option. I'll wait to see if `UnliftedNewtypes` is accepted. If so, I'll write this up as a proposal. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14196#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14196: Replace ArrayArray# with either UnliftedArray# or Array# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.1 Resolution: | Keywords: | LevityPolymorphism Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): For what it's worth, I wouldn't be opposed to breaking users of `GHC.Prim`; this module really is an internal implementation detail of the compiler and I think we would be unduly tying our hands by guaranteeing compatibility. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14196#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14196: Replace ArrayArray# with either UnliftedArray# or Array# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.1 Resolution: | Keywords: | LevityPolymorphism Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by andrewthad): Agreed. I think that requiring people to import `GHC.Exts` instead of `GHC.Prim` is acceptable. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14196#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14196: Replace ArrayArray# with either UnliftedArray# or Array# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.1 Resolution: | Keywords: | LevityPolymorphism Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by andrewthad): `UnliftedNewtypes` was accepted. I'll begin working on a proposal for `UnliftedArray#`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14196#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC