
#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature | Status: new request | Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- It would be neat if we had a primitive for calling memchr on a ByteArray#. Something like this: {{{ memchr :: ByteArray# -> Word# -> Int# -> Int# -> Int# }}} The integer arguments are a starting index and length. I have a hunch that copyByteArray# and friends are just wrappers for memcpy that somehow prohibit the array from being moved by the GC while they are running. I would be happy to try to implement memchr# if it seems like an alright idea. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): If the goal is really to just call the C `memchr` implementation then you might also simply use an unsafe foreign call. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by andrewthad): That does work. However, I prefer to avoid using the FFI where possible. My reasons (which I’ll admit are not entirely compelling) are: it’s syntactically unappealing, it prevents optimizations (barely relevant in this case though), and it doesn’t work with GHCJS. I noticed from looking at the source for Data.Primitive.ByteArray that copyByteArray# and friends weren’t always provided by GHC.Prim. https://hackage.haskell.org/package/primitive-0.6.3.0/docs/src/Data- Primitive-ByteArray.html#MutableByteArray For super old GHCs, they were just an FFI call to memcpy. But these were eventually brought into GHC.Prim. I don’t know why this decision was made, but I’m glad that it was. I would like more ByteArray operations that correspond to manually super optimized standard c library functions to be made available as primitives. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): Right, indeed we have been gradually moving simple operations like this into primops and this is certainly something we can consider doing in the case of `memchr`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by andrewthad): Gotcha. Out of curiosity, who is the "we" that you refer to? Lately, I've been trying to improve my understanding of who has the authority to make certain decisions concerning important components of the GHC haskell ecosystem. According to the core libraries maintainership page (1), `ghc- prim` is maintained by GHC HQ, but I cannot figure out who GHC HQ is. Is there a page somewhere that lists who comprises this group? (1) https://wiki.haskell.org/Library_submissions#The_Core_Libraries -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by andrewthad): Sorry, just reread my comment after submitting it. I realized that the first two sentences may be read with a snarky tone to them that I didn't intend. Also, I'm glad that `memchr` could be considered for being moved into `ghc-prim`, and I'm happy to try implementing it if it's approved. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by andrewthad): Nevermind my earlier question. I just found the answer on the very page I had linked to:
The maintainer "GHC HQ" means Simon Marlow, Simon Peyton Jones, Herbert Valerio Riedel and Ben Gamari. Daniel Fischer has taken responsibility for numeric stuff.
-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): That being said, your question is a very reasonable one and one that doesn't have a particularly clear meaning. "GHC HQ" is a rather vague entity which has historically meant some subset of the Simons, hvr, me/Austin/Ian, and perhaps some unnamed others. I generally try to avoid using the phrase for that reason but there are still plenty of references to be found on the wiki. Regarding `ghc-prim`, it's really an implementation detail of GHC and therefore Simon PJ has the ultimate say. In general if there's uncertainty on his part we just send the proposal to the proposal process. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by hvr): Fwiw, I can see the appeal to abstract over the basic low-level ISO C99 memory primitives (`memcmp`, `memchr`, `memcpy`, `memset` etc, some of which may have a LLVM primitive op counterpart), which makes it easier for projects like ghcjs, eta, or the upcoming ghc/wasm if `ghc-prim` already provides the primitives in a single central place, than having to patch N packages which FFI call themselves. It seems to me, that `memchr(3)` may be the only one left from the ISO C99 set of `mem*` family of functions operating on byte arrays we haven't yet wrapped. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by andrewthad): @bgamari Thanks for the clarification. It is much appreciated. @hvr I think that `memcmp` is also missing. Also, thanks for pointing out eta and wasm, which I'd forgotten about. I agree that this would be helpful for those as well. Also, that's a cool blog post. Thanks for linking. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by andrewthad): @hvr You're right, `memcmp` has already been implemented as `compareByteArrays#`. It just hasn't been released yet. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by andrewthad): I'm giving this a shot right now. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4472 Wiki Page: | -------------------------------------+------------------------------------- Changes (by alexbiehl): * differential: => Phab:D4472 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4472 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonmar): I'm not all that convinced that we need `memchr#` to be a primitive. The reason we brought in `memcmp` and `memcpy` were for performance in the `unordered-containers` package, as I recall, and they were a pretty important performance win in that case. Do we have any similar motivation for `memchr`? Where does it stop? These primitives are pretty complicated to implement in the compiler. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4472 Wiki Page: | -------------------------------------+------------------------------------- Comment (by andrewthad): It's good to hear what the actual motivation was for bringing `memcpy` in. I cannot provide a similarly compelling performance-oriented reason for `memchr`. I think that in https://ghc.haskell.org/trac/ghc/ticket/14882#comment:8, hvr makes the best argument I could offer: that all of the other C99 memory primitives are already wrapped by `ghc-prim` and this one would complete the set. I disagree that `memchr` is difficult to implement a wrapper for in the compiler. In the differential I linked, the actual implementation is about 20 lines code. Some of the others are certainly more complicated though. The fact `memcpy` takes two arrays means the GHC needs several variants of it dealing with various combinations of `ByteArray#`, `MutableByteArray#`, and `Addr#`, (GHC has 5 variants of `copyByteArray#`) but `memchr` only takes a single array, so even the most complete wrapping of it possible would only need three implementations. I've only done one so far since it was the only one I needed.
Where does it stop?
I think that it would stop where the C99 memory primitives stop. So, nothing like `memset_s`, `rawmemchr`, etc. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4472 Wiki Page: | -------------------------------------+------------------------------------- Comment (by andrewthad): Also, I just realized that I never linked to the differential on phab. Here it is: https://phabricator.haskell.org/D4472 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4472 Wiki Page: | -------------------------------------+------------------------------------- Comment (by bgamari): So I'm a bit unsure of how to proceed. On one hand, I can see the argument for consistency and having a uniform interface to these primitives across Haskell implementations (e.g. GHC, GHCJS, an eventual WebAssembly implementation, Eta) is nice. On the other, this uniform interface could likely be as-easily provided in a library. Given the relatively small size of the patch and the fact that we seem to agree that this will be the last such primop (since this is indeed the only C99 memory primitive that we lack), I'm not particularly opposed to merging it. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#14882: memchr# -------------------------------------+------------------------------------- Reporter: andrewthad | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4472 Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): Is there a benefit to baking it in as a primop rather than making it an FFI function in `base` or even `ghc-prim`? If not, I don't see the point. Personally, I suspect that there is not ''yet'' a benefit, but that there might be one in the future. In particular, if we ever have sufficient support for vector registers, we might get a small speed boost by implementing `memchr#` ourselves and inlining it. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14882#comment:17 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC