
[Redirecting to GHC users.] Turns out that 'shift' is just too big to be inlined. (It's only called once, but you have exported it too.) You can see GHC's inlining decisions by saying -ddump-inlinings. To make GHC keener to inline, use an INLINE pragma, or increase the inlining size threshold e.g. -funfolding-threshold=12 Simon | -----Original Message----- | From: haskell-bounces@haskell.org [mailto:haskell-bounces@haskell.org] On Behalf Of | roconnor@theorem.ca | Sent: 09 October 2006 00:41 | To: haskell@haskell.org | Subject: [Haskell] Expecting more inlining for bit shifting | | Consider the following GHC code: | | module Main where | | import GHC.Word | import GHC.Base | import GHC.Prim | import Random | | a `shiftRLT` b | b >=# 32# = int2Word# 0# | | otherwise = a `uncheckedShiftRL#` b | | (W32# x#) `shift` (I# i#) | | i# >=# 0# = W32# (narrow32Word# (x# `shiftL#` i#)) | | otherwise = W32# (x# `shiftRLT` negateInt# i#) | | x `shiftR` i = x `shift` (-i) | | shift7 x = x `shiftR` 7 | shift6 (W32# x) = (W32# (x `uncheckedShiftRL#` 6#)) | | main = do | xs <- sequence (replicate 1000000 | (fmap (shift7 . fromIntegral) (randomIO::IO Int))) | print (sum xs) | | I have copied the definition of `shiftR` for Word32 into this file. | | Suppose we want to shift a series of numbers by 7 bits. One would expect | GHC's inliner to notice that (-7) is indeed not greater than 0, and | eliminate the branch in the definition of `shift`. Further one would | expect GHC to notice that 7 is indeed not gtreater than 32, and eliminate | the branch in shiftRLT. Thus one would expect the code generated by using | shift7 to be identical to that being generated by shfit6 (with 7 replaced | by 6). | | But this appears not to be the case. The code generated for shift7 (if I | can read the C code correctly) is: | Sp[-1] = (-0x7U); | Sp[-2] = R1.p[1]; | *Sp = (W_)&s2za_info; | Sp=Sp-2; | JMP_((W_)&Main_zdwshift_info); | | while the code generated for shift6 is the lovely: | | Hp=Hp+2; | if ((W_)Hp > (W_)HpLim) goto _c2Aa; | _s2xq = (R1.p[1]) >> 0x6U; | Hp[-1] = (W_)&GHCziWord_W32zh_con_info; | *Hp = _s2xq; | R1.p=Hp-1; | Sp=Sp+1; | JMP_(*Sp); | _c2Aa: | HpAlloc = 0x8U; | JMP_(stg_gc_enter_1); | | My question is, why the discrepency? | | -- | Russell O'Connor http://r6.ca/ | ``All talk about `theft,''' the general counsel of the American Graphophone | Company wrote, ``is the merest claptrap, for there exists no property in | ideas musical, literary or artistic, except as defined by statute.'' | _______________________________________________ | Haskell mailing list | Haskell@haskell.org | http://www.haskell.org/mailman/listinfo/haskell