Hi,

While trying to figure out why some of my code is very slow I have found that it is something related to division.
Digging a bit deeper I found an example which shows some unexpected magic and a lack of the expected one.
Before raising any tickets in trac I would like to consult with you regarding what I see. Maybe I am misunderstanding the way GHC is supposed to work.

-------------------
module Test where

import Data.Int
import GHC.Exts
import GHC.Prim

foo :: Int -> Int
foo a =
  let
    b = a `quot` 1111
    c = b `quot` 1113
    d = c `quot` 1117
  in d

bar :: Int -> Int
bar !a' =
  let
    !(I# a) = a'
    !(b) = quotInt# a 1111#
    !(c) = quotInt# b 1113#
    !(d) = quotInt# c 1117#
  in I# d
-------------------

Here 'foo' is a function written in a common way and 'bar' is essentially identical one, written in a low-level style.
* My understanding is that these functions are equivalent in terms of what they do. The only difference is in the code being generated.

Unexpected magic is in the Core dump:
-------------------
Test.$wfoo =
  \ (ww_sxw :: GHC.Prim.Int#) ->
    case ww_sxw of wild1_ax0 {
      __DEFAULT ->
        case GHC.Prim.quotInt# wild1_ax0 1111 of wild2_Xxc {
          __DEFAULT ->
            case GHC.Prim.quotInt# wild2_Xxc 1113 of wild3_Xxt {
              __DEFAULT -> GHC.Prim.quotInt# wild3_Xxt 1117;
              (-9223372036854775808) -> (-8257271295304186)
            };
          (-9223372036854775808) -> (-7418931981405)
        };
      (-9223372036854775808) -> (-6677706553)
    }

Test.bar =
  \ (a'_ah5 :: GHC.Types.Int) ->
    case a'_ah5 of _ { GHC.Types.I# ipv_ste ->
    GHC.Types.I#
      (GHC.Prim.quotInt#
         (GHC.Prim.quotInt# (GHC.Prim.quotInt# ipv_ste 1111) 1113) 1117)
    }
-------------------
Question 1: what is the meaning of those magic numbers -9223372036854775808, -6677706553, -7418931981405, -8257271295304186?
Question 2: under which circumstances those strange branches of execution will be used and what those results would mean?
Question 3: why is the Core for 'foo' so different to 'bar'?

The lack of expected magic is in the assembler code:
-------------------
    addq $16,%r12
    cmpq 144(%r13),%r12
    ja .Lcz1
    movl $1117,%ecx
    movl $1113,%r10d
    movl $1111,%r11d
    movq 7(%rbx),%rax
    cqto
    idivq %r11
    cqto
    idivq %r10
    cqto
    idivq %rcx
    movq $ghczmprim_GHCziTypes_Izh_con_info,-8(%r12)
    movq %rax,0(%r12)
    leaq -7(%r12),%rbx
    addq $8,%rbp
    jmp *0(%rbp)
-------------------
Question: can't it use cheap multiplication and shift instead of expensive division here? I know that such optimisation is implemented at least to some extent for C--. I suppose it also won't do anything smart for expressions like a*4 or a/4 for the same reason.


With kind regards,
Denys Rtveliashvili