Re: [GHC] #4092: Floating point manipulation : ulp and coerce IEEE-754 Double# into Word64#

1 Jul 2015

      #4092: Floating point manipulation : ulp and coerce IEEE-754 Double# into Word64#
-------------------------------------+-------------------------------------
        Reporter:  malosh            |                   Owner:
            Type:  feature request   |                  Status:  new
        Priority:  normal            |               Milestone:  7.12.1
       Component:  Compiler          |                 Version:  6.12.2
      Resolution:                    |                Keywords:
Operating System:  Unknown/Multiple  |            Architecture:
 Type of failure:  None/Unknown      |  Unknown/Multiple
      Blocked By:                    |               Test Case:
 Related Tickets:                    |                Blocking:
                                     |  Differential Revisions:
-------------------------------------+-------------------------------------

Comment (by duncan):

 In the meantime, in the absence of primops, this is the best I can manage.
 I suggest we add this or similar to GHC.Float:

 {{{
 {-# INLINE castWord2Float #-}
 castWord2Float :: Word32 -> Float
 castWord2Float (W32# w#) = F# (castWord2Float# w#)

 {-# NOINLINE castWord2Float# #-}
 castWord2Float# :: Word# -> Float#
 castWord2Float# w# =
     case newByteArray# 4# realWorld# of
       (# s', mba# #) ->
         case writeWord32Array# mba# 0# w# s' of
           s'' ->
             case readFloatArray# mba# 0# s'' of
               (# _, f# #) -> f#

 {-# INLINE castWord2Double #-}
 castWord2Double :: Word64 -> Double
 castWord2Double (W64# w#) = D# (castWord2Double# w#)

 {-# NOINLINE castWord2Double# #-}
 castWord2Double# :: Word# -> Double#
 castWord2Double# w# =
     case newByteArray# 8# realWorld# of
       (# s', mba# #) ->
         case writeWord64Array# mba# 0# w# s' of
           s'' ->
             case readDoubleArray# mba# 0# s'' of
               (# _, f# #) -> f#
 }}}

 This is similar to the "cast STUArray" method, but avoids the extra call
 and closure allocation due to the `runSTRep`. For the "cast STUArray"
 method, see:

 http://hackage.haskell.org/package/reinterpret-cast-0.1.0/docs/src/Data-
 ReinterpretCast-Internal-ImplArray.html

 The `NOINLINE` means that the use of `realWorld#` should be ok, despite
 `newByteArray# 8# realWorld#` being a constant.

 It'll need a very similar impl for 32bit systems that need the Word64#
 type.

 Compare the CMM of the above:
 {{{
 castWord2Double#_entry() //  [R2]
          { info_tbl: [(c2Qn,
                        label: castWord2Double#_info
                        rep:HeapRep static { Fun {arity: 1 fun_type:
 ArgSpec 4} })]
            stack_info: arg_space: 8 updfr_space: Just 8
          }
      {offset
        c2Qn:
            Hp = Hp + 24;
            if (Hp > HpLim) goto c2Qr; else goto c2Qq;
        c2Qr:
            HpAlloc = 24;
            R2 = R2;
            R1 = castWord2Double#_closure;
            call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;
        c2Qq:
            I64[Hp - 16] = stg_ARR_WORDS_info;
            I64[Hp - 8] = 8;
            _s2Q9::P64 = Hp - 16;
            I64[_s2Q9::P64 + 16] = R2;
            D1 = F64[_s2Q9::P64 + 16];
            call (P64[Sp])(D1) args: 8, res: 0, upd: 8;
      }
  }
 }}}
 with the version that uses runST / runSTRep
 {{{
  sat_s2QX_entry() //  [R1]
          { info_tbl: [(c2Rd,
                        label: sat_s2QX_info
                        rep:HeapRep 1 nonptrs { Fun {arity: 1 fun_type:
 ArgSpec 3} })]
            stack_info: arg_space: 8 updfr_space: Just 8
          }
      {offset
        c2Rd:
            Hp = Hp + 40;
            if (Hp > HpLim) goto c2Rj; else goto c2Ri;
        c2Rj:
            HpAlloc = 40;
            R1 = R1;
            call (stg_gc_fun)(R1) args: 8, res: 0, upd: 8;
        c2Ri:
            _s2QN::I64 = I64[R1 + 7];
            I64[Hp - 32] = stg_ARR_WORDS_info;
            I64[Hp - 24] = 8;
            _s2QR::P64 = Hp - 32;
            I64[_s2QR::P64 + 16] = _s2QN::I64;
            _s2QV::F64 = F64[_s2QR::P64 + 16];
            I64[Hp - 8] = D#_con_info;
            F64[Hp] = _s2QV::F64;
            R1 = Hp - 7;
            call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
      }
  },
  castWord2Double#_entry() //  [R2]
          { info_tbl: [(c2Rk,
                        label: castWord2Double#_info
                        rep:HeapRep static { Fun {arity: 1 fun_type:
 ArgSpec 4} })]
            stack_info: arg_space: 8 updfr_space: Just 8
          }
      {offset
        c2Rk:
            Hp = Hp + 16;
            if (Hp > HpLim) goto c2Ro; else goto c2Rn;
        c2Ro:
            HpAlloc = 16;
            R2 = R2;
            R1 = castWord2Double#_closure;
            call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8;
        c2Rn:
            I64[Hp - 8] = sat_s2QX_info;
            I64[Hp] = R2;
            R2 = Hp - 7;
            call runSTRep_info(R2) args: 8, res: 0, upd: 8;
      }
  }
 }}}
 The runSTRep version involves allocating a `sat_s2QX` closure and calling
 `runSTRep` to call that closure.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/4092#comment:20
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

GHC

tags

participants (1)