
so I have this simple bit of code, which should be fast but seems to be being compiled to something very slow.
import Data.Word import Data.Bits
fhb :: Word -> Word fhb w = b1 .|. b2 where b2 = if 0xFFFF0000 .&. w /= 0 then 0x2 else 0 b1 = if 0xFF00FF00 .&. w /= 0 then 0x1 else 0
what it compiles to is something involving Integers, lots of coercions and other nasty stuff when it should consist of a couple of primitive operations. John -- John Meacham - ⑆repetae.net⑆john⑈

john:
so I have this simple bit of code, which should be fast but seems to be being compiled to something very slow.
import Data.Word import Data.Bits
fhb :: Word -> Word fhb w = b1 .|. b2 where b2 = if 0xFFFF0000 .&. w /= 0 then 0x2 else 0 b1 = if 0xFF00FF00 .&. w /= 0 then 0x1 else 0
what it compiles to is something involving Integers, lots of coercions and other nasty stuff when it should consist of a couple of primitive operations.
looks suspicous! Ideally I'd want something like this produced: fhb_ideal :: Word -> Word fhb_ideal (W# w) = W# ((int2Word# (case word2Int# (int2Word# 0xFF00FF00# `and#` w) of 0# -> 0#; _ -> 1#)) `or#` (int2Word# (case word2Int# (int2Word# 0xFFFF0000# `and#` w) of 0# -> 0#; _ -> 2#))) which generates the core: M.$wfhb_ideal = \ (ww_sP7 :: GHC.Prim.Word#) -> GHC.Prim.or# (GHC.Prim.int2Word# (case GHC.Prim.word2Int# (GHC.Prim.and# __word 4278255360 ww_sP7) of ds_dMF { __DEFAULT -> 1; 0 -> 0 })) (GHC.Prim.int2Word# (case GHC.Prim.word2Int# (GHC.Prim.and# __word 4294901760 ww_sP7) of ds_dMI { __DEFAULT -> 2; 0 -> 0 })) Whereas the test example: fhb_boxed :: Word -> Word fhb_boxed w = b1 .|. b2 where b2 = if 0xFFFF0000 .&. w /= 0 then 0x2 else 0 b1 = if 0xFF00FF00 .&. w /= 0 then 0x1 else 0 Turns into some nasty: M.lit = case GHC.Prim.addIntC# 2147418113 2147483647 of wild2_aN5 { (# r_aN7, c_aN8 #) -> case case c_aN8 of wild3_aN9 { __DEFAULT -> case GHC.Prim.int2Integer# 2147418113 of wild4_aNa { (# s_aNc, d_aNd #) -> case GHC.Prim.int2Integer# 2147483647 of wild5_aNe { (# s1_aNg, d1_aNh #) -> case GHC.Prim.plusInteger# s_aNc d_aNd s1_aNg d1_aNh of wild_aO8 { (# s2_aOa, d2_aOb #) -> GHC.Prim.integer2Word# s2_aOa d2_aOb } } }; 0 -> GHC.Prim.int2Word# r_aN7 } of ww_aNW { __DEFAULT -> GHC.Word.W# ww_aNW } } M.lit1 :: GHC.Word.Word [GlobalId] [Str: DmdType] M.lit1 = case GHC.Prim.addIntC# 2130771713 2147483647 of wild2_aN5 { (# r_aN7, c_aN8 #) -> case case c_aN8 of wild3_aN9 { __DEFAULT -> case GHC.Prim.int2Integer# 2130771713 of wild4_aNa { (# s_aNc, d_aNd #) -> case GHC.Prim.int2Integer# 2147483647 of wild5_aNe { (# s1_aNg, d1_aNh #) -> case GHC.Prim.plusInteger# s_aNc d_aNd s1_aNg d1_aNh of wild_aO8 { (# s2_aOa, d2_aOb #) -> GHC.Prim.integer2Word# s2_aOa d2_aOb } } }; 0 -> GHC.Prim.int2Word# r_aN7 } of ww_aNW { __DEFAULT -> GHC.Word.W# ww_aNW } } M.$wfhb_boxed :: GHC.Prim.Word# -> GHC.Prim.Word# [GlobalId] [Arity 1 Str: DmdType L] M.$wfhb_boxed = \ (ww_sPh :: GHC.Prim.Word#) -> case M.lit1 of wild_aOp { GHC.Word.W# x#_aOr -> case GHC.Prim.eqWord# (GHC.Prim.and# x#_aOr ww_sPh) __word 0 of wild2_aOk { GHC.Base.False -> case M.lit of wild1_XPt { GHC.Word.W# x#1_XPx -> case GHC.Prim.eqWord# (GHC.Prim.and# x#1_XPx ww_sPh) __word 0 of wild21_XOP { GHC.Base.False -> __word 3; GHC.Base.True -> __word 1 } }; GHC.Base.True -> case M.lit of wild1_XPt { GHC.Word.W# x#1_XPx -> case GHC.Prim.eqWord# (GHC.Prim.and# x#1_XPx ww_sPh) __word 0 of wild21_XOP { GHC.Base.False -> __word 2; GHC.Base.True -> __word 0 } } } } So not sure where those Integer thingies are creeping in. Here's a little test case, btw, with a QuickCheck property. -- Don

Donald Bruce Stewart wrote:
john:
so I have this simple bit of code, which should be fast but seems to be being compiled to something very slow.
import Data.Word import Data.Bits
fhb :: Word -> Word fhb w = b1 .|. b2 where b2 = if 0xFFFF0000 .&. w /= 0 then 0x2 else 0 b1 = if 0xFF00FF00 .&. w /= 0 then 0x1 else 0
M.lit = case GHC.Prim.addIntC# 2147418113 2147483647
If I understand the code correctly this happens in the desugaring pass already. The literal 0xFFFF0000 exceeds the range of Int so it's expressed with smaller numbers (see deSugar/DsUtils.lhs, mkIntegerExpr); in this case it becomes fromInteger (0x70000001 + 0x7FFFFFFF), which matches the above line exactly. Btw, the comment in rename/RnTypes about this ("Big integer literals are built [...]") is wrong, as it refers to mkIntegerLit instead of mkIntegerExpr. This suggests that as a workaround, using -0x00010000 and -0x00FF0100 should help (but the code will depend on the word size then). Bertram

Bertram, Don, John You are hot on the trail. Indeed you can probably find out what is going on, and how to fix it, as fast as we can, since this is all paged out of my brain. One plausible place to look would be Inst.shortCutIntLit; add a case for Word similar to that for Int and Integer. The only downside of this is that it works when it's clear *at the moment the literal is seen* that it's a Word. In principle, this knowledge might arise later in type inference. But in practice I think it'll do just what you want. It'd be possible to do something cleverer later, but I'm not sure it'd be necessary. I don't think it's an off-by-one thing John. If you guys can come up with a patch, I'll check it over. Simon | -----Original Message----- | From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users-bounces@haskell.org] On | Behalf Of Bertram Felgenhauer | Sent: 28 January 2007 15:41 | To: glasgow-haskell-users@haskell.org | Subject: Re: hrm... | | Donald Bruce Stewart wrote: | > john: | > > so I have this simple bit of code, which should be fast but seems to be | > > being compiled to something very slow. | > > | > > > import Data.Word | > > > import Data.Bits | > > > | > > > fhb :: Word -> Word | > > > fhb w = b1 .|. b2 where | > > > b2 = if 0xFFFF0000 .&. w /= 0 then 0x2 else 0 | > > > b1 = if 0xFF00FF00 .&. w /= 0 then 0x1 else 0 | | > M.lit = | > case GHC.Prim.addIntC# 2147418113 2147483647 | | If I understand the code correctly this happens in the desugaring pass | already. The literal 0xFFFF0000 exceeds the range of Int so it's expressed | with smaller numbers (see deSugar/DsUtils.lhs, mkIntegerExpr); in this | case it becomes fromInteger (0x70000001 + 0x7FFFFFFF), which matches | the above line exactly. | | Btw, the comment in rename/RnTypes about this | ("Big integer literals are built [...]") is wrong, as it refers to | mkIntegerLit instead of mkIntegerExpr. | | This suggests that as a workaround, using -0x00010000 and -0x00FF0100 | should help (but the code will depend on the word size then). | | Bertram | _______________________________________________ | Glasgow-haskell-users mailing list | Glasgow-haskell-users@haskell.org | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

On 1/27/07, John Meacham
so I have this simple bit of code, which should be fast but seems to be being compiled to something very slow.
import Data.Word import Data.Bits
fhb :: Word -> Word fhb w = b1 .|. b2 where b2 = if 0xFFFF0000 .&. w /= 0 then 0x2 else 0 b1 = if 0xFF00FF00 .&. w /= 0 then 0x1 else 0
what it compiles to is something involving Integers, lots of coercions and other nasty stuff when it should consist of a couple of primitive operations.
Output from an AMD64 box: $wfhb = \ (ww_sIw :: GHC.Prim.Word#) -> case GHC.Prim.eqWord# (GHC.Prim.and# __word 4278255360 ww_sIw) __word 0 of wild2_aHI { GHC.Base.False -> case GHC.Prim.eqWord# (GHC.Prim.and# __word 4294901760 ww_sIw) __word 0 of wild21_XHW { GHC.Base.False -> __word 3; GHC.Base.True -> __word 1 }; GHC.Base.True -> case GHC.Prim.eqWord# (GHC.Prim.and# __word 4294901760 ww_sIw) __word 0 of wild21_XHW { GHC.Base.False -> __word 2; GHC.Base.True -> __word 0 } } -- Cheers, Lemmih

On Sat, Jan 27, 2007 at 01:48:29AM +0100, Lemmih wrote:
On 1/27/07, John Meacham
wrote: so I have this simple bit of code, which should be fast but seems to be being compiled to something very slow.
import Data.Word import Data.Bits
fhb :: Word -> Word fhb w = b1 .|. b2 where b2 = if 0xFFFF0000 .&. w /= 0 then 0x2 else 0 b1 = if 0xFF00FF00 .&. w /= 0 then 0x1 else 0
what it compiles to is something involving Integers, lots of coercions and other nasty stuff when it should consist of a couple of primitive operations.
Output from an AMD64 box:
$wfhb = \ (ww_sIw :: GHC.Prim.Word#) -> case GHC.Prim.eqWord# (GHC.Prim.and# __word 4278255360 ww_sIw) __word 0 of wild2_aHI { GHC.Base.False -> case GHC.Prim.eqWord# (GHC.Prim.and# __word 4294901760 ww_sIw) __word 0 of wild21_XHW { GHC.Base.False -> __word 3; GHC.Base.True -> __word 1 }; GHC.Base.True -> case GHC.Prim.eqWord# (GHC.Prim.and# __word 4294901760 ww_sIw) __word 0 of wild21_XHW { GHC.Base.False -> __word 2; GHC.Base.True -> __word 0 } }
Yeah, but the 64 bit version of the algorithm also generates the bad code on x86-64. I think the issue is an off by one error somewhere, making ghc think that 0xffffffff is too big to fit in a Word, when it actually fits just right. John -- John Meacham - ⑆repetae.net⑆john⑈
participants (5)
-
Bertram Felgenhauer
-
dons@cse.unsw.edu.au
-
John Meacham
-
Lemmih
-
Simon Peyton-Jones