[GHC] #11143: Feature request: Add index/read/write primops with byte offset for ByteArray#

#11143: Feature request: Add index/read/write primops with byte offset for ByteArray#
-------------------------------------+-------------------------------------
Reporter: vagarenko                  | Owner:
Type: feature request                | Status: new
Priority: normal                     | Milestone: 8.2.1
Component: Compiler                  | Version: 7.10.2
Keywords:                            | Operating System: Unknown/Multiple
Architecture: Unknown/Multiple       | Type of failure: None/Unknown
Test Case:                           | Blocked By:
Blocking:                            | Related Tickets:
Differential Rev(s):                 | Wiki Page:
-------------------------------------+-------------------------------------
Currently, primops for indexing `ByteArray#` and reading/writing `MutableByteArray#` have the following form:

{{{#!hs
indexTYPEArray# :: ByteArray# -> Int# -> TYPE#
readTYPEArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, TYPE# #)
writeTYPEArray# :: MutableByteArray# s -> Int# -> TYPE# -> State# s -> State# s
}}}

where the second argument of type `Int#` is an offset measured in terms of the size of `TYPE#`.

This is inconvenient if I want to store values of different types inside a `ByteArray#`: I have to read values of type `Int8` and then glue them together with some bitwise operations.

I suggest adding a number of primops, similar to the existing ones, which would accept an offset in bytes from the start of the `ByteArray#`:

{{{#!hs
-- | Read 8-bit integer; offset in bytes.
indexByteInt8Array# :: ByteArray# -> Int# -> Int#
-- | Read 16-bit integer; offset in bytes.
indexByteInt16Array# :: ByteArray# -> Int# -> Int#
-- | Read 32-bit integer; offset in bytes.
indexByteInt32Array# :: ByteArray# -> Int# -> Int#

-- | Read 8-bit integer; offset in bytes.
readInt8Array# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Int# #)
-- | Read 16-bit integer; offset in bytes.
readInt16Array# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Int# #)
-- | Read 32-bit integer; offset in bytes.
readInt32Array# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Int# #)
}}}

and so on...
-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11143 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
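
The workaround described in the ticket (reading single bytes and gluing them together with bitwise operations) can be sketched as follows. This is only an illustration under stated assumptions: it uses `Ptr` and `Foreign.Storable` from `base` as a stand-in for `ByteArray#`, because `peekByteOff` and `pokeByteOff` already take byte offsets there. The names `readWord32LE` and `demoGlue` are hypothetical, and little-endian byte order is an assumption of the example.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Word (Word8, Word32)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekByteOff, pokeByteOff)

-- | Hypothetical helper: assemble a little-endian Word32 from four
-- single-byte reads starting at the given byte offset.
readWord32LE :: Ptr Word8 -> Int -> IO Word32
readWord32LE p off = do
  b0 <- peekByteOff p off       :: IO Word8
  b1 <- peekByteOff p (off + 1) :: IO Word8
  b2 <- peekByteOff p (off + 2) :: IO Word8
  b3 <- peekByteOff p (off + 3) :: IO Word8
  pure $  fromIntegral b0
      .|. (fromIntegral b1 `shiftL` 8)
      .|. (fromIntegral b2 `shiftL` 16)
      .|. (fromIntegral b3 `shiftL` 24)

-- | Store a one-byte tag at offset 0 and four payload bytes at offsets
-- 1..4, then read the payload back as a single Word32.
demoGlue :: IO Word32
demoGlue = allocaBytes 8 $ \p -> do
  pokeByteOff p 0 (0x01 :: Word8)
  mapM_ (\(i, b) -> pokeByteOff p (1 + i) (b :: Word8))
        (zip [0 ..] [0x78, 0x56, 0x34, 0x12])
  readWord32LE p 1  -- the payload starts at an odd byte offset
```

Because every access is a single byte, this version does not depend on the alignment of the payload, at the cost of four loads and some shifting per value. A byte-offset primop would let the same read be expressed as one operation.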

Changes (by thomie):

 * failure: None/Unknown => Runtime performance bug

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11143#comment:1

Comment (by Mathnerd314):

Can we instead have primops which take both an offset measured in bytes and an offset measured in terms of the type?

{{{#!hs
indexTYPEArray# :: ByteArray# -> Int# {-byte offset-} -> Int# {-type offset-} -> TYPE#
readTYPEArray# :: MutableByteArray# s -> Int# {-byte offset-} -> Int# {-type offset-} -> State# s -> (# State# s, TYPE# #)
writeTYPEArray# :: MutableByteArray# s -> Int# {-byte offset-} -> Int# {-type offset-} -> TYPE# -> State# s -> State# s

indexTYPEOffAddr# :: Addr# -> Int# {-byte offset-} -> Int# {-type offset-} -> TYPE#
readTYPEOffAddr# :: Addr# -> Int# {-byte offset-} -> Int# {-type offset-} -> State# s -> (# State# s, TYPE# #)
writeTYPEOffAddr# :: Addr# -> Int# {-byte offset-} -> Int# {-type offset-} -> TYPE# -> State# s -> State# s
}}}

All of these go through the `mkBasicIndexed{Read,Write}` functions, which take both a byte offset and a type offset, so it seems reasonable to expose that.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11143#comment:2

Changes (by bgamari):

 * milestone: 8.2.1 =>

Comment:

It seems unlikely that this will happen for 8.2.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11143#comment:3

Changes (by bgamari):

 * keywords: => newcomers

Comment:

If anyone is interested in picking this up do ping me. It should be a relatively straightforward patch.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11143#comment:4

Changes (by sjakobi):

 * cc: sjakobi (added)

Comment:

Replying to [comment:2 Mathnerd314]:
> Can we instead have primops which take both an offset measured in bytes and an offset measured in terms of the type?
> {{{#!hs
> indexTYPEArray# :: ByteArray# -> Int# {-byte offset-} -> Int# {-type offset-} -> TYPE#
> readTYPEArray# :: MutableByteArray# s -> Int# {-byte offset-} -> Int# {-type offset-} -> State# s -> (# State# s, TYPE# #)
> writeTYPEArray# :: MutableByteArray# s -> Int# {-byte offset-} -> Int# {-type offset-} -> TYPE# -> State# s -> State# s
>
> indexTYPEOffAddr# :: Addr# -> Int# {-byte offset-} -> Int# {-type offset-} -> TYPE#
> readTYPEOffAddr# :: Addr# -> Int# {-byte offset-} -> Int# {-type offset-} -> State# s -> (# State# s, TYPE# #)
> writeTYPEOffAddr# :: Addr# -> Int# {-byte offset-} -> Int# {-type offset-} -> TYPE# -> State# s -> State# s
> }}}

I like these types.

> All of these go through the `mkBasicIndexed{Read,Write}` functions, which take both a byte offset and a type offset, so it seems reasonable to expose that.

I currently don't see how this can be done. These functions require a byte offset with type `ByteOff` (`Int`) but we only have a `CmmExpr`. It seems to me that the new primops will require quite a bit of new plumbing down to `CmmRegOff`. Am I missing something?

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11143#comment:5

Changes (by sjakobi):

 * cc: sjakobi (removed)
 * owner: (none) => sjakobi
 * differential: => Phab:D4433

Comment:

I have implemented a first primop which was much less code than I had expected. Please review the attached Phab.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11143#comment:6

Comment (by vagarenko):

> Can we instead have primops which take both an offset measured in bytes and an offset measured in terms of the type?
> {{{#!hs
> indexTYPEArray# :: ByteArray# -> Int# {-byte offset-} -> Int# {-type offset-} -> TYPE#
> readTYPEArray# :: MutableByteArray# s -> Int# {-byte offset-} -> Int# {-type offset-} -> State# s -> (# State# s, TYPE# #)
> writeTYPEArray# :: MutableByteArray# s -> Int# {-byte offset-} -> Int# {-type offset-} -> TYPE# -> State# s -> State# s
>
> indexTYPEOffAddr# :: Addr# -> Int# {-byte offset-} -> Int# {-type offset-} -> TYPE#
> readTYPEOffAddr# :: Addr# -> Int# {-byte offset-} -> Int# {-type offset-} -> State# s -> (# State# s, TYPE# #)
> writeTYPEOffAddr# :: Addr# -> Int# {-byte offset-} -> Int# {-type offset-} -> TYPE# -> State# s -> State# s
> }}}
>
> All of these go through the `mkBasicIndexed{Read,Write}` functions, which take both a byte offset and a type offset, so it seems reasonable to expose that.

I'm confused. Why do you want this? Isn't `byte offset` here just `type offset * sizeof type`?

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11143#comment:7

Comment (by sjakobi):

Replying to [comment:7 vagarenko]:
> I'm confused. Why do you want this? Isn't `byte offset` here just `type offset * sizeof type`?

While I'm not the requester, I think the types that [comment:2 Mathnerd314 proposes] have two advantages:

 * They relieve the user from doing the byte offset computation.
 * They are different from the preexisting types for the `indexTYPEArray` functions, thereby making it impossible to confuse the two.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11143#comment:8

Comment (by vagarenko):

Replying to [comment:8 sjakobi]:
> Replying to [comment:7 vagarenko]:
> > I'm confused. Why do you want this? Isn't `byte offset` here just `type offset * sizeof type`?
>
> While I'm not the requester, I think the types that [comment:2 Mathnerd314 proposes] have two advantages:
>
>  * They relieve the user from doing the byte offset computation.

Then what does that `Int# {-byte offset-}` parameter mean?

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11143#comment:9

Comment (by sjakobi):

Replying to [comment:9 vagarenko]:
> Replying to [comment:8 sjakobi]:
> >  * They relieve the user from doing the byte offset computation.
>
> Then what does that `Int# {-byte offset-}` parameter mean?

What I meant is that the user doesn't need to compute the total offset. While a user may use the API you proposed like

{{{#!hs
indexByteInt16Array# ba (byte_offset + type_offset * int16_size)
}}}

the other API offers

{{{#!hs
indexByteInt16Array# ba byte_offset type_offset
}}}

which IMO is simply more concise and convenient. I hope this answers your question.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11143#comment:10
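
The relationship between the two call styles is just the address computation `byte_offset + type_offset * sizeof(element)`. A minimal sketch of that computation, again using `Ptr` and `Foreign.Storable` from `base` as a stand-in for the proposed primops; `peekByteElemOff` and `demoTwoOffsets` are hypothetical names introduced only for this illustration:

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Data.Int (Int16, Int32)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (Storable, peekByteOff, pokeByteOff, sizeOf)

-- | Hypothetical two-offset read: the effective byte offset is
-- byteOff + elemOff * (size of the element type).
peekByteElemOff :: forall a b. Storable a => Ptr b -> Int -> Int -> IO a
peekByteElemOff p byteOff elemOff =
  peekByteOff p (byteOff + elemOff * sizeOf (undefined :: a))

-- | A 4-byte header followed by three Int16 values at byte offsets
-- 4, 6 and 8. Reads the element with index 2 relative to byte offset 4.
demoTwoOffsets :: IO Int16
demoTwoOffsets = allocaBytes 16 $ \(p :: Ptr Word8) -> do
  pokeByteOff p 0 (3 :: Int32)
  mapM_ (\(i, v) -> pokeByteOff p (4 + i * 2) (v :: Int16))
        (zip [0 ..] [10, 20, 30])
  peekByteElemOff p 4 2
```

Here `peekByteElemOff p 4 2` reads from byte offset `4 + 2 * 2 = 8`, which is the third `Int16` after the header. The single-offset style would require the caller to write that sum out by hand.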

Comment (by vagarenko):

Replying to [comment:10 sjakobi]:
> Replying to [comment:9 vagarenko]:
> > Replying to [comment:8 sjakobi]:
> > >  * They relieve the user from doing the byte offset computation.
> >
> > Then what does that `Int# {-byte offset-}` parameter mean?
>
> What I meant is that the user doesn't need to compute the total offset. While a user may use the API you proposed like
> {{{#!hs
> indexByteInt16Array# ba (byte_offset + type_offset * int16_size)
> }}}

I still don't understand. What is the total offset? What are `byte_offset` and `type_offset`?

By `byte_offset` you mean the number of bytes from the start of the array `ba` to the sought value of type `Int#`, correct? Then what is `type_offset`? The number of `Int16` elements before the sought element? But my motivation for this ticket was to be able to store values of different types in a `ByteArray#`.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11143#comment:11

Comment (by sjakobi):

Replying to [comment:11 vagarenko]:
> I don't understand. What is the total offset? What are `byte_offset` and `type_offset`?
>
> By `byte_offset` you mean the number of bytes from the start of the array `ba` to the sought value of type `Int#`, correct?
>
> Then what is `type_offset`? The number of `Int16` elements before the sought element? But my motivation for this ticket was to be able to store values of different types in a `ByteArray#`.

Yes, sorry for my lack of clarity and thanks for putting it in a nutshell!

What I assume is [comment:2 Mathnerd314]'s motivation is that even when a `ByteArray#` contains several types, it may contain a sequence of values of the same type. For example you may want to serialize a vector of `Int32`s by first writing the length of the vector as a `Word64` followed by the values.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11143#comment:12
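
The serialization example above (a `Word64` length followed by `Int32` values) can be sketched with byte offsets as follows. This is an illustration only, written against `Ptr` and `Foreign.Storable` from `base` instead of `ByteArray#`; `headerBytes`, `writeVec`, `readVec` and `roundTrip` are hypothetical names, and the values are stored in the host byte order.

```haskell
import Data.Int (Int32)
import Data.Word (Word8, Word64)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekByteOff, pokeByteOff)

-- | Size in bytes of the Word64 length prefix.
headerBytes :: Int
headerBytes = 8

-- | Write the length as a Word64 at byte 0, then each Int32 at
-- byte offset headerBytes + index * 4.
writeVec :: Ptr Word8 -> [Int32] -> IO ()
writeVec p xs = do
  pokeByteOff p 0 (fromIntegral (length xs) :: Word64)
  mapM_ (\(i, x) -> pokeByteOff p (headerBytes + i * 4) x) (zip [0 ..] xs)

-- | Read the length prefix, then that many Int32 values.
readVec :: Ptr Word8 -> IO [Int32]
readVec p = do
  n <- peekByteOff p 0 :: IO Word64
  mapM (\i -> peekByteOff p (headerBytes + i * 4)) [0 .. fromIntegral n - 1]

-- | Serialize into a temporary buffer and read the values back.
roundTrip :: [Int32] -> IO [Int32]
roundTrip xs =
  allocaBytes (headerBytes + 4 * length xs) $ \p -> writeVec p xs >> readVec p
```

With the two-offset style discussed in the thread, the payload accesses would take `headerBytes` as the byte offset and the element index as the type offset. All offsets in this layout are multiples of the element size, so no unaligned access occurs.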

Comment (by sjakobi):

bgamari commented on [Phab:D4433]:
> If I'm not mistaken this also needs to take care to avoid unaligned loads and stores on architectures that do not support such things.

I could use some guidance on how to do that. Is there existing machinery for working around unaligned operations that I could use?

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11143#comment:13

Changes (by bgamari):
* cc: Jaffacake (added)
Comment:
I don't believe so; this is one of the reasons the primops look like they
do: they allow us to avoid dealing with alignment headaches. Frankly, I'm
not even sure what sort of alignment guarantees our current C-- load and
store nodes expect. Jaffacake, could you comment on this?
We do have a list of architectures for which we need to worry about
alignment (essentially everything but amd64; see `PprC.cLoad`). To figure
out how to lower these operations I would likely just use a C compiler.
For instance, compile a test program like,
{{{#!c
#include

Changes (by sjakobi):

 * related: => #4442

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11143#comment:15

Comment (by jberryman):

I would also appreciate this. My use case is I'd like, for performance and coding simplicity, to be able to do potentially unaligned reads of Word64 when we're compiled for x86_64 arch.

Library writers need to think about alignment when using the `Addr` interface to pinned ByteArrays, so I don't think it's a big deal to expose this (though docs could use improvement, see https://ghc.haskell.org/trac/ghc/ticket/14731)

That said I would love it if the compiler didn't push this complexity down to library writers (both here and in the Addr load/store primops), and handled twiddling on architectures that don't support unaligned reads. I would like my code to be compatible but don't particularly care if there is a performance hit (99% of people are using x86). The alternative is I just have to write my own slow code and put it behind a CPP pragma.

But it's not clear to me if vagarenko actually wants to do any unaligned reads (and I don't want to hijack this request). Perhaps they just want to do aligned reads on their heterogeneous structure. In that case you can write your own implementation easily by dividing the byte offset provided by the width of the payload you're requesting.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11143#comment:16
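
The aligned-only approach mentioned at the end of the comment above (dividing the byte offset by the width of the payload) might look like the following sketch. It uses `peekElemOff` from `Foreign.Storable` as a stand-in for the existing element-indexed primops; `peekAlignedByteOff` and `demoAligned` are hypothetical names, and the code assumes that the base pointer itself is suitably aligned for the element type.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Data.Word (Word8, Word64)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr, castPtr)
import Foreign.Storable (Storable, peekElemOff, pokeByteOff, sizeOf)

-- | Hypothetical helper: serve a byte-offset read through the
-- element-indexed operation. Returns Nothing when the byte offset is
-- not a multiple of the element size, since such a read cannot be
-- expressed as an element index.
peekAlignedByteOff :: forall a. Storable a => Ptr Word8 -> Int -> IO (Maybe a)
peekAlignedByteOff p byteOff
  | r == 0    = Just <$> peekElemOff (castPtr p) q
  | otherwise = pure Nothing
  where
    (q, r) = byteOff `quotRem` sizeOf (undefined :: a)

-- | Store a Word64 at byte offset 16, then attempt one aligned and one
-- misaligned read.
demoAligned :: IO (Maybe Word64, Maybe Word64)
demoAligned = allocaBytes 24 $ \p -> do
  pokeByteOff p 16 (42 :: Word64)
  ok  <- peekAlignedByteOff p 16
  bad <- peekAlignedByteOff p 3
  pure (ok, bad)
```

This covers heterogeneous layouts whose fields are naturally aligned. It does not cover the unaligned `Word64` reads that the comment's own use case needs; those would still require either a byte-wise fallback or the primops requested in this ticket.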

Changes (by monoidal):

 * keywords: newcomers => newcomer

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11143#comment:17