Re: [GHC] #5218: Add unpackCStringLen# to create Strings from string literals

#5218: Add unpackCStringLen# to create Strings from string literals
-------------------------------------+-------------------------------------
        Reporter:  tibbe             |                Owner:  thoughtpolice
            Type:  feature request   |               Status:  patch
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  7.0.3
      Resolution:                    |             Keywords:  strings
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Runtime           |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #5877 #10064      |  Differential Rev(s):  Phab:D2443
  #11312, #9719                      |            Wiki Page:
-------------------------------------+-------------------------------------

Comment (by simonpj):

I'm struggling to grok this ticket, especially: '''what is the problem we are trying to solve?'''. I'm also concerned about making things too complicated. ''jscholl in comment:74 sounds right on target to me''.

Here's my thinking, written out. Let's see if we agree at least about the "Goals" and "Core" part.

== Goals ==

I believe that one goal is

 * '''The ability to put a block of binary data in the program code, without heavy encoding.'''

Is that a goal? Can we focus solely on that for a while?

== Core ==

To meet that goal, in Core, we need

 * A primitive data type `B#` whose values are simply blobs of binary data.
 * Some operations over this type; e.g. `lenB :: B# -> Int`, or `unpackString :: B# -> [Char]` or whatever.
 * Literal values (in Core) for `B#` values.

`B#` plays the role of the `(# Int#, Addr# #)` representation mentioned above (comment:38 ff), but without being so concrete.

I'm only using "`B#`" as a placeholder; we need a proper name for it! So what is it, precisely?

 * `B#` could be a completely new primitive type.
 * Or `B#` could be `ByteArray#`. That would have the major advantage of not adding a new type, and for sure we'd need to be able to turn it into a `ByteArray#`. So I like that, and it's what jscholl suggests in comment:74.
 * But `B#` can't be `Addr#` (which is a memory address)!
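
To make the "Core" list above concrete, here is a minimal sketch that assumes the second option, i.e. that `B#` is just `ByteArray#`. The wrapper `Blob` and the helpers `blobFromList`, `buildBlob` and `fill` are hypothetical names invented for illustration; `blobFromList` only stands in for the Core literal, which does not exist yet. Reading each byte as one Latin-1 character in `unpackString` is likewise an assumption, since the ticket does not fix an encoding.

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
module Main where

import Data.Char (chr)
import GHC.Exts
  ( ByteArray#, Int (I#), Int#, MutableByteArray#, State#
  , indexWord8Array#, newByteArray#, sizeofByteArray#
  , unsafeFreezeByteArray#, writeWord8Array#, (+#) )
import GHC.ST (ST (..), runST)
import GHC.Word (Word8 (W8#))

-- Hypothetical boxed wrapper around the placeholder B#,
-- assumed here to be ByteArray#.
data Blob = Blob ByteArray#

-- Stand-in for a Core literal: copy a list of bytes into a fresh array.
blobFromList :: [Word8] -> Blob
blobFromList ws = case length ws of
  I# n -> runST (ST (buildBlob n ws))

-- Allocate n bytes, fill them, then freeze the array.
buildBlob :: Int# -> [Word8] -> State# s -> (# State# s, Blob #)
buildBlob n ws s0 =
  case newByteArray# n s0 of
    (# s1, mba #) ->
      case unsafeFreezeByteArray# mba (fill mba 0# ws s1) of
        (# s2, ba #) -> (# s2, Blob ba #)

-- Write the bytes one by one, starting at the given index.
fill :: MutableByteArray# s -> Int# -> [Word8] -> State# s -> State# s
fill _ _ [] s = s
fill mba i (W8# w : rest) s =
  fill mba (i +# 1#) rest (writeWord8Array# mba i w s)

-- The two example operations named in the list above.
lenB :: Blob -> Int
lenB (Blob ba) = I# (sizeofByteArray# ba)

-- Assumption: each byte is decoded as one Latin-1 code point.
unpackString :: Blob -> [Char]
unpackString b@(Blob ba) = go 0
  where
    n = lenB b
    go i@(I# i#)
      | i >= n = []
      | otherwise =
          chr (fromIntegral (W8# (indexWord8Array# ba i#))) : go (i + 1)

main :: IO ()
main = do
  let blob = blobFromList [0x47, 0x48, 0x43]
  print (lenB blob)
  putStrLn (unpackString blob)
```

The length lives inside the array itself, so nothing like the `(# Int#, Addr# #)` pair has to be passed around; that is the sense in which this representation is "less concrete".
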

Also look at #11312, which is highly relevant because it has the same conclusion. In #11312, I call this new type `String#`, but that's too character-oriented. I think we should focus on binary data. But adopting `B#` would fix the ghastly problems in #11312.

== Haskell ==

If we had this new primitive type, we'd soon want literals for it in Haskell source code.

 * I suppose we could have a new literal syntax (about whose details I am intensely relaxed). After all, the literals of a language should be expressible I suppose.
 * But we could say you could only get it via a TH quasiquote e.g. `[bytes| fec923ac |]`? Is that so terrible?

Note that everything in comment:84 belongs in this section. By the time we get to Core all this typeclass stuff has gone away.

== Other goals ==

I don't have clarity on how `bytestring` would want to convert a `ByteArray#` to a `ByteString`. That ought to be a constant time operation.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/5218#comment:86
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler