
#5218: Add unpackCStringLen# to create Strings from string literals
-------------------------------------+-------------------------------------
        Reporter:  tibbe             |                Owner:  thoughtpolice
            Type:  feature request   |               Status:  patch
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  7.0.3
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:  Unknown/Multiple
 Type of failure:  Runtime           |            Test Case:
  performance bug                    |
      Blocked By:                    |             Blocking:
 Related Tickets:  #5877 #10064      |  Differential Rev(s):  Phab:D2443
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by bgamari):

Replying to [comment:60 winter]:
> > What is stopping these libraries from providing this mechanism currently using `Addr#` and primitive strings directly?
>
> The problem is that there is no way to cast an `Addr#` into a `ByteArray#` without copying, while unboxed vectors (not storable ones) and text both want a `ByteArray#`.
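
For illustration, the copy in question can be sketched roughly as follows. This is only a minimal sketch using primops exported from `GHC.Exts`; the `Bytes` wrapper and the names `copyLiteral`, `sizeOfBytes`, `bytesToString` and `hello` are hypothetical and not taken from text, vector, or the patch under review. The length is passed explicitly because an `Addr#` literal does not carry one.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts

-- Hypothetical boxed wrapper; ByteArray# is unlifted, so it needs a
-- lifted box before it can be bound at top level or returned lazily.
data Bytes = Bytes ByteArray#

-- Copy n bytes from a static address (e.g. a string literal) into a
-- freshly allocated heap byte array. This is the copy that cannot be
-- avoided when all we have is an Addr#.
copyLiteral :: Addr# -> Int -> Bytes
copyLiteral addr (I# n) = runRW# (\s0 ->
  case newByteArray# n s0 of
    (# s1, mba #) ->
      case copyAddrToByteArray# addr mba 0# n s1 of
        s2 -> case unsafeFreezeByteArray# mba s2 of
          (# _, ba #) -> Bytes ba)

-- Size of the array in bytes.
sizeOfBytes :: Bytes -> Int
sizeOfBytes (Bytes ba) = I# (sizeofByteArray# ba)

-- Read the array back one byte at a time as Latin-1 characters.
bytesToString :: Bytes -> String
bytesToString (Bytes ba) =
  [ C# (indexCharArray# ba i) | I# i <- [0 .. I# (sizeofByteArray# ba) - 1] ]

-- The literal lives in static data; the 5 is supplied by hand.
hello :: Bytes
hello = copyLiteral "hello"# 5
```

Every use of a literal this way pays for an allocation and a copy at runtime, which is the cost being discussed here.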

Fair enough, but why not just poke a hole in the `ByteArray#` abstraction in that case? Namely, provide an `unsafeMkByteArray# :: Addr# -> Int -> ByteArray#`.

> > In general primitive strings are, as the name would suggest, primitive. I'm not sure forcing a heap object representation here is necessary or prudent.
>
> I disagree. If we give string literals a proper compact representation, not only can we save unnecessary copying at runtime, we can also save code size in other ways. Consider: if string literals are now `ByteArray#`s, we can use rules to simplify a UTF8 text type, like `forall a. fromString (GHC.unpackCString# a) = UTF8 a`, which means we can directly use the constructor here instead of several calls. The same applies to unboxed vectors using unboxed string literals and hexadecimal notation.

I agree that these are all great simplifications, but I don't see why changing the representation of primitive strings is necessary to get there.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/5218#comment:61
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler