Statically linking a small piece of C into every GHC-generated binary

Hi,

I'm trying to add support for the POPCNT instruction, which exists on some modern CPUs (e.g. Nehalem). The idea is to add a popCnt# primop which would generate a POPCNT instruction when compiling with -msse4.2. If the user didn't specify -msse4.2, the primop should fall back to some other implementation of population count. A good fallback, in terms of both speed and memory usage, is this lookup-table based function:

    static char popcount_table_8[256] = {
        /*0*/ 0, /*1*/ 1, /*2*/ 1, /*3*/ 2, /*4*/ 1, /*5*/ 2, /*6*/ 2, /*7*/ 3,
        /*8*/ 1, /*9*/ 2, /*10*/ 2, /*11*/ 3, ...
    };

    /* Table-driven popcount, with 8-bit tables */
    /* 6 ops plus 4 casts and 4 lookups, 0 long immediates, 4 stages */
    inline uint32_t popcount(uint32_t x)
    {
        return popcount_table_8[(uint8_t)x] +
               popcount_table_8[(uint8_t)(x >> 8)] +
               popcount_table_8[(uint8_t)(x >> 16)] +
               popcount_table_8[(uint8_t)(x >> 24)];
    }

(GCC and LLVM use the same fallback method.)

It's important that the fallback is as good as it gets so that the user of the primop doesn't have to implement their own fallback (which is very complicated, as the user would have to detect whether -msse4.2 is used or not!). This precludes non-table based solutions (as they're slower).

I've implemented the primop but run into some difficulty: to use the above fallback I need the code to be statically linked into every binary. I'm not quite sure how to achieve that. GCC manages by having the above function definition in libc, which is always statically linked. I think LLVM uses a small statically linked compiler run-time library for the same purpose.

How would one go about having a small C library linked into every Haskell binary? If we go ahead and implement more of these modern instructions we're likely to need more fallbacks (so this isn't needed just for POPCNT).

Cheers,
Johan
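
A minimal, self-contained sketch of the table-driven fallback described above. Because the listing in the post elides most of the table, this version fills the 256 entries at run time instead of spelling them out. The names init_popcount_table and popcount_table are hypothetical and used only for illustration; a real fallback would more likely ship a constant table.

```c
#include <stdint.h>

/* Hypothetical sketch: the table is filled at run time only because the
   listing in the post elides most of its entries. */
static uint8_t popcount_table_8[256];

/* Fill entry i with the number of set bits in i.
   Entry i reuses entry i >> 1 and adds the lowest bit of i. */
static void init_popcount_table(void)
{
    popcount_table_8[0] = 0;
    for (int i = 1; i < 256; i++)
        popcount_table_8[i] = (uint8_t)(popcount_table_8[i >> 1] + (i & 1));
}

/* Table-driven popcount: one lookup per byte of the 32-bit word. */
static uint32_t popcount_table(uint32_t x)
{
    return (uint32_t)popcount_table_8[(uint8_t)x]
         + popcount_table_8[(uint8_t)(x >> 8)]
         + popcount_table_8[(uint8_t)(x >> 16)]
         + popcount_table_8[(uint8_t)(x >> 24)];
}
```

init_popcount_table() has to be called once before the first popcount_table() call. A constant, fully written-out table avoids that initialization step, which is presumably why the original uses one.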

On Tue, Jul 19, 2011 at 6:02 PM, Johan Tibell wrote:
> I've implemented the primop but run into some difficulty: to use the above fallback I need the code to be statically linked into every binary. I'm not quite sure how to achieve that.

If dynamic linking doesn't hurt performance (too much), could I stick this piece of C code in ghc-prim? Are we guaranteed to always link against ghc-prim?

> GCC manages by having the above function definition in libc, which is always statically linked.

I just realized that this isn't true. I wonder if GCC's __builtin_popcount suffers a performance degradation when libc is dynamically linked.

> I think LLVM uses a small statically linked compiler run-time library for the same purpose.

However, I believe this is still the case. I'll need to double-check.

Johan
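
As a rough illustration of what a small C file shipped in an always-linked library might contain, the sketch below selects an implementation at C compile time. It assumes a GCC-compatible compiler, which defines __POPCNT__ when the POPCNT instruction is enabled and provides __builtin_popcount. The names popcnt32 and popcnt32_fallback are hypothetical, and the fallback is a table-free bit-twiddling count swapped in for brevity, not the table-based one preferred in the thread. How a primop would actually reach such a function is not shown here.

```c
#include <stdint.h>

/* Portable, table-free fallback (standard parallel bit count).
   Swapped in here only to keep the sketch short. */
uint32_t popcnt32_fallback(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555u);                 /* 2-bit sums */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums */
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;                 /* 8-bit sums */
    return (x * 0x01010101u) >> 24;                   /* add the four bytes */
}

/* Hypothetical entry point: uses the compiler builtin when POPCNT is
   enabled for this translation unit, otherwise the portable fallback. */
uint32_t popcnt32(uint32_t x)
{
#if defined(__POPCNT__)
    return (uint32_t)__builtin_popcount(x);
#else
    return popcnt32_fallback(x);
#endif
}
```

Both paths return the same values, so callers do not need to know which one was compiled in.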

2011/7/19 Johan Tibell:
> On Tue, Jul 19, 2011 at 6:02 PM, Johan Tibell wrote:
>> I've implemented the primop but run into some difficulty: to use the above fallback I need the code to be statically linked into every binary. I'm not quite sure how to achieve that.
>
> If dynamic linking doesn't hurt performance (too much), could I stick this piece of C code in ghc-prim? Are we guaranteed to always link against ghc-prim?
>
>> GCC manages by having the above function definition in libc, which is always statically linked.
>
> I just realized that this isn't true. I wonder if GCC's __builtin_popcount suffers a performance degradation when libc is dynamically linked.

I assume you meant libgcc and not libc. I think the linking is a bit of a red herring: ideally they want to inline it, and that requires LTO and static linking, and GHC can't do LTO anyway. There should be little overhead (one extra jump, a cycle or two) in calling a dynamically linked function compared to a non-inlined statically linked one. One small advantage of static linking is that the functions won't be included if they're not used.

>> I think LLVM uses a small statically linked compiler run-time library for the same purpose.
>
> However, I believe this is still the case. I'll need to double-check.
>
> Johan
Cheers,
Niklas
Participants (2): Johan Tibell, Niklas Larsson