
#9577: String literals are wasting space
-------------------------------------+-------------------------------------
        Reporter:  xnyhps            |                   Owner:  xnyhps
            Type:  bug               |                  Status:  new
        Priority:  low               |               Milestone:
       Component:  Compiler (NCG)    |                 Version:  7.8.2
      Resolution:                    |                Keywords:
Operating System:  Unknown/Multiple  |            Architecture:  Unknown/Multiple
 Type of failure:  Runtime           |              Difficulty:  Unknown
  performance bug                    |              Blocked By:
       Test Case:                    |         Related Tickets:
        Blocking:                    |  Differential Revisions:
-------------------------------------+-------------------------------------

Comment (by xnyhps):

Replying to [comment:4 dfeuer]:
> Replying to [comment:3 xnyhps]:
> > The main argument in favor of alignment seems to be: code often `memcpy`s string literals into buffers. By doing that with aligned addresses (apparently) SSE instructions can be used. This is irrelevant for GHC, because the strings are only parsed into `[Char]`s, never copied.
>
> Will that always be the case if a string literal represents something like `Text` or `ByteString`? If so, will that continue to hold in the future? Might a future optimization fuse `putStr` with the conversion to do a copy? It may be that these concerns are baseless, but it might make sense to consider what alternative optimizations yours could preclude.

For the record, this is the rewrite rule used by ByteString:

 * https://github.com/haskell/bytestring/blob/master/Data/ByteString/Internal.h..., calling https://github.com/haskell/bytestring/blob/master/Data/ByteString/Internal.h....

This just wraps the `Addr#` directly, no copying here. However, [https://github.com/haskell/bytestring/blob/master/Data/ByteString/Internal.h... append] does call `memcpy` twice. I don't think GHC has the kind of optimizations that can turn a `memcpy` call into SIMD instructions directly, but maybe `memcpy` is more efficient when called with aligned buffers. I'll try to test this.

And these are the rewrite rules for text:

 * https://github.com/bos/text/blob/e33c89be4256fdd1c31f39d8a2a63e58e23b0182/Da... calling https://github.com/bos/text/blob/e33c89be4256fdd1c31f39d8a2a63e58e23b0182/Da...

A loop similar to `unpackCString#`, so alignment won't matter much.

It is true that we might find optimizations later that benefit from aligned strings. But unaligning them now doesn't preclude that. Literals only exist within a single module, so any optimization has control over both the literal and the code that uses it. (`Strings` can be exported, but `Addr#`s can't.) Even if someone would try to mix object files generated by different versions of GHC it wouldn't be a problem.

> You mention that there are a lot of string literals in the Prelude. I would bet that the vast majority of those are error messages. Might it be possible to specifically target ''exceptional'' strings that should never be anywhere speed-critical, and pack them all together? Putting them all together, ideally starting or ending on a page boundary, would (hopefully) mean that they wouldn't even need to be swapped in unless an error occurred.

I'm not familiar enough with assembly or executable file formats to say whether this is possible, but I'll keep it in mind.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9577#comment:6
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler