
#12095: GHC and LLVM don't agree on what to do with byteSwap16# -------------------------------------+------------------------------------- Reporter: thoughtpolice | Owner: Type: bug | Status: new Priority: high | Milestone: 8.0.2 Component: Compiler | Version: 8.0.1 (LLVM) | Keywords: codegen, llvm | Operating System: Unknown/Multiple Architecture: | Type of failure: Incorrect result Unknown/Multiple | at runtime Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- Consider this test case (taken from [https://github.com/well-typed/binary- serialise-cbor/issues/67 here] and lightly modified to work on big/little endian machines): {{{#!hs {-# LANGUAGE BangPatterns #-} {-# LANGUAGE MagicHash #-} {-# LANGUAGE CPP #-} module Main ( main -- :: IO () ) where #include "ghcconfig.h" import GHC.Prim import GHC.Word data T = T !Addr# t :: T #ifndef WORDS_BIGENDIAN t = T "\xcf\xb1"# #else t = T "\xb1\xcf"# #endif grabWord16 :: T -> Word64 grabWord16 (T addr#) = W64# (byteSwap16# (indexWord16OffAddr# addr# 0#)) trip :: Int trip = fromIntegral (grabWord16 t) main :: IO () main = print trip }}} With GHC 7.10.3 using the NCG, the results given are correct: {{{ $ ghc --version The Glorious Glasgow Haskell Compilation System, version 7.10.3 $ ghc -Wall -fforce-recomp -O2 Issue67.hs && ./Issue67 [1 of 1] Compiling Main ( Issue67.hs, Issue67.o ) Linking Issue67 ... 53169 }}} This also is the same on GHC 8.0.1 using the NCG, on both PowerPC and AMD64 as well. This answer is correct: `53169` is `0xCFB1` in hex, so the `byteSwap16#` primitive correctly works to decode the swapped-endian number. However, the story is not the same with GHC 7.10.3+LLVM 3.5, or GHC 8.0.1+LLVM 3.7: {{{ $ ghc --version The Glorious Glasgow Haskell Compilation System, version 7.10.3 $ llc --version | head -2 LLVM (http://llvm.org/): LLVM version 3.5.2 $ ghc -Wall -fforce-recomp -O2 Issue67.hs -fllvm && ./Issue67 [1 of 1] Compiling Main ( Issue67.hs, Issue67.o ) Linking Issue67 ... -12367 }}} Note: {{{#!hs -12367 == (fromIntegral (53169 :: Word16) :: Int16) }}} The relevant snippet looks like this at the CMM level (GHC 7.10.3): {{{ ==================== Output Cmm ==================== [section "data" { Main.main2_closure: const Main.main2_info; const 0; const 0; const 0; }, section "readonly" { c3rq_str: I8[] [207,177] }, section "readonly" { c3rr_str: I8[] [207,177] }, Main.main2_entry() // [R1] { info_tbl: [(c3ru, label: Main.main2_info rep:HeapRep static { Thunk }), (c3rD, label: block_c3rD_info rep:StackRep [])] stack_info: arg_space: 8 updfr_space: Just 8 } {offset c3ru: ... c3ro: I64[Sp - 16] = stg_bh_upd_frame_info; I64[Sp - 8] = _c3rn::I64; (_c3rw::I64) = call MO_BSwap W16(%MO_UU_Conv_W16_W64(I16[c3rr_str])); I64[Sp - 24] = c3rD; R4 = GHC.Types.[]_closure+1; R3 = _c3rw::I64; R2 = 0; Sp = Sp - 24; call GHC.Show.$wshowSignedInt_info(R4, R3, R2) returns to c3rD, args: 8, res: 8, upd: 24; ... }}} Pre-optimized LLVM basic block: {{{ c3rB: %ln3sc = ptrtoint i8* @stg_bh_upd_frame_info to i64 %ln3sb = load i64** %Sp_Var %ln3sd = getelementptr inbounds i64* %ln3sb, i32 -2 store i64 %ln3sc, i64* %ln3sd, !tbaa !1 %ln3sf = load i64* %lc3rA %ln3se = load i64** %Sp_Var %ln3sg = getelementptr inbounds i64* %ln3se, i32 -1 store i64 %ln3sf, i64* %ln3sg, !tbaa !1 %ln3sh = ptrtoint %c3rE_str_struct* @c3rE_str$def to i64 %ln3si = inttoptr i64 %ln3sh to i16* %ln3sj = load i16* %ln3si, !tbaa !5 %ln3sk = zext i16 %ln3sj to i64 %ln3sl = trunc i64 %ln3sk to i16 %ln3sm = call ccc i16 (i16)* @llvm.bswap.i16( i16 %ln3sl ) %ln3sn = sext i16 %ln3sm to i64 store i64 %ln3sn, i64* %lc3rJ %ln3sp = ptrtoint void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)* @c3rQ_info$def to i64 %ln3so = load i64** %Sp_Var %ln3sq = getelementptr inbounds i64* %ln3so, i32 -3 store i64 %ln3sp, i64* %ln3sq, !tbaa !1 %ln3sr = ptrtoint i8* @ghczmprim_GHCziTypes_ZMZN_closure to i64 %ln3ss = add i64 %ln3sr, 1 store i64 %ln3ss, i64* %R4_Var %ln3st = load i64* %lc3rJ store i64 %ln3st, i64* %R3_Var store i64 0, i64* %R2_Var %ln3su = load i64** %Sp_Var %ln3sv = getelementptr inbounds i64* %ln3su, i32 -3 %ln3sw = ptrtoint i64* %ln3sv to i64 %ln3sx = inttoptr i64 %ln3sw to i64* store i64* %ln3sx, i64** %Sp_Var %ln3sy = bitcast i8* @base_GHCziShow_zdwshowSignedInt_info to void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)* }}} Post-optimized block (`opt --enable-tbaa=true -O2 out-llvm-orig.ll -o out- llvm.bc`): {{{ c3rB: ; preds = %c3rU %ln3s8 = ptrtoint i8* %ln3s7 to i64 %ln3sd = getelementptr inbounds i64* %Sp_Arg, i64 -2 store i64 ptrtoint (i8* @stg_bh_upd_frame_info to i64), i64* %ln3sd, align 8, !tbaa !5 %ln3sg = getelementptr inbounds i64* %Sp_Arg, i64 -1 store i64 %ln3s8, i64* %ln3sg, align 8, !tbaa !5 store i64 ptrtoint (void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)* @"c3rQ_info$def" to i64), i64* %ln3rZ, align 8, !tbaa !5 tail call cc10 void bitcast (i8* @base_GHCziShow_zdwshowSignedInt_info to void (i64*, i64*, i64*, i64, i64, i64, i64, i64, i64, i64)*)(i64* %Base_Arg, i64* %ln3rZ, i64* %Hp_Arg, i64 %R1_Arg, i64 0, i64 -12367, i64 add (i64 ptrtoint (i8* @ghczmprim_GHCziTypes_ZMZN_closure to i64), i64 1), i64 undef, i64 undef, i64 %SpLim_Arg) #0 ret void }}} Folds it right into a constant! I haven't spent time diagnosing this much further, yet. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12095 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler