
#5218: Add unpackCStringLen# to create Strings from string literals
-------------------------------------+-------------------------------------
        Reporter:  tibbe             |                Owner:  thoughtpolice
            Type:  feature request   |               Status:  patch
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  7.0.3
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #5877 #10064      |  Differential Rev(s):  Phab:D2443
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by bgamari):

> If I remember correctly, a `ByteArray#` would have an extra header and a
> length field, which in turn brings a two-word overhead, one word more
> compared to the `(# int#, addr# #)` solution.

Right, this would incur another word of overhead. However, on the majority of machines this is 8 bytes, which is quite significant. Looking at GHC itself, just over a third of all string literals are 8 characters or fewer (6000 out of 17500). For these literals adding another word would increase the fractional overhead from >50% to >67%. I have spoken with GHC users targeting mobile platforms who already suffer from our code size; it's hard to justify such an increase without a very good reason.

> But I would argue this overhead can bring a nice solution to GHC's
> long-lasting literal problem; for example, the vector and text packages
> can provide some TH to directly save a `ByteArray#` literal using
> hexadecimal notation, which saves a lot of extra copying at runtime.

What is stopping these libraries from providing this mechanism currently using `Addr#` and primitive strings directly? In general primitive strings are, as the name would suggest, primitive. I'm not sure forcing a heap object representation here is necessary or prudent.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/5218#comment:59
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler