
Hi all,

I've just noticed that all `WordX` (and `IntX`) data types are actually implemented as wrappers around `Word#` (and `Int#`). This probably doesn't matter much if the value is stored on the heap (due to pointer indirection and heap alignment), but it also means that:

```
data Foo = Foo {-# UNPACK #-} !Word8 {-# UNPACK #-} !Int8
```

will actually take *a lot* of space: on 64-bit we need 8 bytes for the header, 8 bytes for the `Word8`, and 8 bytes for the `Int8`.

Is there any reason for this? The only one I can see is that it avoids having to add things like `Word8#` primitives to the compiler. (The codegen would also need to emit zero-extending moves when loading from memory, like `movzb{l,q}`.)

If we had things like `Word8#`, we could also consider changing `Bool` to simply wrap it (with the obvious encoding), which would let us both UNPACK a `Bool` *and* save space within the struct. (Alternatively, one could imagine a `Bool#` that is just a byte.)

I couldn't find any previous discussion about this, so any pointers would be welcome. :)

Thanks,
Michal

PS. I had a look at this after reading about the recent implementation of the struct field reordering optimization in rustc: http://camlorn.net/posts/April%202017/rust-struct-field-reordering.html
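For concreteness, here is a minimal compilable sketch of the situation. `MyWord8`/`MyInt8` mirror the actual `W8#`/`I8#` definitions in GHC.Word and GHC.Int; the byte counts in the comments are for 64-bit:

```
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int#, Word#)

-- Mirrors 'data Word8 = W8# Word#' and 'data Int8 = I8# Int#'
-- from GHC.Word/GHC.Int: each wraps a full machine-word primitive.
data MyWord8 = MyW8# Word#
data MyInt8  = MyI8# Int#

-- Even with both fields unpacked, the payload is two full words:
--   header (8 bytes) + Word# (8 bytes) + Int# (8 bytes) = 24 bytes
-- on 64-bit, for just 2 bytes of actual information.
data Foo = Foo {-# UNPACK #-} !MyWord8 {-# UNPACK #-} !MyInt8

-- With a (currently nonexistent) Word8# primitive, one could likewise
-- imagine 'data Bool = B# Word8#' with a 0/1 encoding, making Bool
-- both unpackable and byte-sized.
```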

On June 11, 2017 8:03:10 AM EDT, Michal Terepeta wrote:
Hi all,
I've just noticed that all `WordX` (and `IntX`) data types are actually implemented as wrappers around `Word#` (and `Int#`). This probably doesn't matter much if the value is stored on the heap (due to pointer indirection and heap alignment), but it also means that:

```
data Foo = Foo {-# UNPACK #-} !Word8 {-# UNPACK #-} !Int8
```

will actually take *a lot* of space: on 64-bit we need 8 bytes for the header, 8 bytes for the `Word8`, and 8 bytes for the `Int8`.

Is there any reason for this? The only one I can see is that it avoids having to add things like `Word8#` primitives to the compiler. (The codegen would also need to emit zero-extending moves when loading from memory, like `movzb{l,q}`.)
This is certainly one consideration. Another is that you would also need to teach the garbage collector to understand closures with sub-word-size fields. Currently we can encode whether each field of a closure is a pointer or not with a simple bitmap. If we naively allowed smaller fields, we would need to increase the granularity of this representation to encode bytes.

Of course, one way to work around this would be to impose an invariant that guarantees that pointers are always word-aligned. Then we would probably want to shuffle sub-word-sized fields, allowing two Word16s to inhabit a single word.

As you mention, this would no doubt require a bit of engineering. In particular, while x86 has robust support for sub-word-size operations, I don't believe all the platforms we support do. In these cases we would need to perform, for instance, aligned word-sized loads and stores and mask as appropriate. I may be wrong, however. Another consideration is that the bytecode interpreter would need to learn to understand these closures.

Regardless, Simon Marlow began some work in this direction a few years ago. There is a mostly complete patch in D38. All it needs is rebasing, fixing of the bytecode interpreter, and then perhaps the introduction of Word8# and friends.

I think it would be great if we could make our heap representation a bit more space-conscious. Perhaps you could open a ticket so we can collect these tidbits?

Another somewhat related issue that would be good to think about in parallel is the word-size dependence of Word. See #11953.

Cheers,

- Ben
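For illustration, here is a minimal sketch in Haskell (using Data.Bits, rather than the Cmm/assembly the codegen would actually emit) of the word-sized load-and-mask scheme described above; `loadByte` and `storeByte` are hypothetical names:

```
import Data.Bits (complement, shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word64, Word8)

-- Read byte i (0..7) of an aligned 64-bit word: a full-word load,
-- a shift, and a truncation (which does the masking).
loadByte :: Word64 -> Int -> Word8
loadByte w i = fromIntegral (w `shiftR` (8 * i))

-- Write byte i of an aligned 64-bit word: clear the target byte,
-- then merge the new value in. A real store would be a word-sized
-- read-modify-write of the closure's payload.
storeByte :: Word64 -> Int -> Word8 -> Word64
storeByte w i b =
  (w .&. complement (0xff `shiftL` (8 * i)))
    .|. (fromIntegral b `shiftL` (8 * i))
```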

Hi,

On Sunday, 11.06.2017 at 10:44 -0400, Ben Gamari wrote:
This is certainly one consideration. Another is that you would also need to teach the garbage collector to understand closures with sub-word-size fields. Currently we can encode whether each field of a closure is a pointer or not with a simple bitmap. If we naively allowed smaller fields, we would need to increase the granularity of this representation to encode bytes.

Of course, one way to work around this would be to impose an invariant that guarantees that pointers are always word-aligned. Then we would probably want to shuffle sub-word-sized fields, allowing two Word16s to inhabit a single word.
that is not an issue; we already sort fields into pointers first and non-pointers later. So all pointers are at the beginning and nicely aligned, and all the non-pointer data can follow in whatever weird format. The GC only needs to know how many words in total are used by the non-pointer data.

Greetings,
Joachim

--
Joachim “nomeata” Breitner
mail@joachim-breitner.de • https://www.joachim-breitner.de/
XMPP: nomeata@joachim-breitner.de • OpenPGP-Key: 0xF0FBF51F
Debian Developer: nomeata@debian.org
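To make the layout concrete, here is a small illustration (the type is hypothetical, and the layout comment is a sketch of what is described above, not output dumped from GHC):

```
-- A hypothetical type mixing pointer and non-pointer fields:
data Mixed = Mixed [Int] {-# UNPACK #-} !Double String

-- GHC lays the closure out pointers-first, roughly:
--
--   header | ptr([Int]) | ptr(String) | Double#
--
-- The GC traces the pointer prefix and skips the non-pointer suffix
-- wholesale; the info table only records how many words each part
-- occupies.
```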

Joachim Breitner writes:
Hi,
On Sunday, 11.06.2017 at 10:44 -0400, Ben Gamari wrote:
This is certainly one consideration. Another is that you would also need to teach the garbage collector to understand closures with sub-word-size fields. Currently we can encode whether each field of a closure is a pointer or not with a simple bitmap. If we naively allowed smaller fields, we would need to increase the granularity of this representation to encode bytes.

Of course, one way to work around this would be to impose an invariant that guarantees that pointers are always word-aligned. Then we would probably want to shuffle sub-word-sized fields, allowing two Word16s to inhabit a single word.
that is not an issue; we already sort fields into pointers first and non-pointers later. So all pointers are at the beginning and nicely aligned, and all the non-pointer data can follow in whatever weird format. The GC only needs to know how many words in total are used by the non-pointer data.
Ahh, great point. I stand corrected.

Cheers,

- Ben

Thanks a lot for the replies & links!

I'll try to finish Simon's diff (and probably ask silly questions if I get stuck ;)

Cheers,
Michal

Just for the record, I've opened:
https://ghc.haskell.org/trac/ghc/ticket/13825
to track this.
Cheers,
Michal
On Mon, Jun 12, 2017 at 8:45 PM Michal Terepeta wrote:
Thanks a lot for the replies & links!
I'll try to finish Simon's diff (and probably ask silly questions if I get stuck ;)
Cheers, Michal

On 11 June 2017 at 22:44, Joachim Breitner wrote:
Hi,
On Sunday, 11.06.2017 at 10:44 -0400, Ben Gamari wrote:
This is certainly one consideration. Another is that you would also need to teach the garbage collector to understand closures with sub-word-size fields. Currently we can encode whether each field of a closure is a pointer or not with a simple bitmap. If we naively allowed smaller fields, we would need to increase the granularity of this representation to encode bytes.

Of course, one way to work around this would be to impose an invariant that guarantees that pointers are always word-aligned. Then we would probably want to shuffle sub-word-sized fields, allowing two Word16s to inhabit a single word.
that is not an issue; we already sort fields into pointers first and non-pointers later. So all pointers are at the beginning and nicely aligned, and all the non-pointer data can follow in whatever weird format. The GC only needs to know how many words in total are used by the non-pointer data.
But the compiler has no support for sub-word-sized fields yet. I made a partial patch to support it a while ago: https://phabricator.haskell.org/D38

Cheers,
Simon
participants (4):

- Ben Gamari
- Joachim Breitner
- Michal Terepeta
- Simon Marlow