It would be nice to have something equivalent to GCC's __builtin_clz intrinsic that could generate a BSR instruction on x86.  Is there any way to get GHC to generate such an instruction presently?

If Data.Bits had bitScan methods, they would have straightforward default implementations, and they would also open up the possibility of generating efficient code for some instances.

Cheers,
  -Ryan

P.S. Here are example default implementations:

-- The number of leading zero bits:
bitScanReverse :: Bits a => a -> Int
bitScanReverse num = loop (size - 1)
  where
    size = bitSize num
    loop i | i < 0         = size
           | testBit num i = size - 1 - i
           | otherwise     = loop (i-1)

-- The number of trailing zero bits:
bitScanForward :: Bits a => a -> Int
bitScanForward num = loop 0 
  where
    size = bitSize num
    loop i | i == size     = size
           | testBit num i = i
           | otherwise     = loop (i+1)
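
To make the "default implementation" idea concrete, here is a rough sketch of how the two functions above might sit in a class as default methods. The class name BitScan is made up for illustration (the actual suggestion is to add the methods to Data.Bits itself), and Word8 is just an arbitrary instance that inherits both defaults. An instance with access to a suitable machine instruction could override either method instead.

```haskell
import Data.Bits (Bits, bitSize, testBit)
import Data.Word (Word8)

-- Hypothetical class, named BitScan for illustration only;
-- the proposal is to add such methods to Data.Bits itself.
-- Note: newer versions of base mark bitSize as deprecated,
-- so this may compile with a warning.
class Bits a => BitScan a where
  -- The number of leading zero bits.
  bitScanReverse :: a -> Int
  bitScanReverse num = loop (size - 1)
    where
      size = bitSize num
      loop i | i < 0         = size
             | testBit num i = size - 1 - i
             | otherwise     = loop (i - 1)

  -- The number of trailing zero bits.
  bitScanForward :: a -> Int
  bitScanForward num = loop 0
    where
      size = bitSize num
      loop i | i == size     = size
             | testBit num i = i
             | otherwise     = loop (i + 1)

-- An instance that simply inherits both defaults.
instance BitScan Word8

main :: IO ()
main = do
  print (bitScanReverse (1 :: Word8))  -- 7: only bit 0 is set
  print (bitScanForward (8 :: Word8))  -- 3: lowest set bit is bit 3
  print (bitScanReverse (0 :: Word8))  -- 8: no bits set, returns the size
```

With both defaults, an all-zero argument returns the bit size of the type, so callers do not need a separate zero check.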