
I'm trying to get a feel for the assembly output by GHC on my platform. Below is a module containing one function and the associated assembly. I've put in comments what I think is going on, but I'd appreciate it if anyone could give me some pointers. I'd really like to know three things: * Why does _Add_unsafeShiftR_info check if (%esi) is 3? * What's going on in _s86_info? * At the end of _s87_info, 8 is added to %ebp and then jumped to. Is that a jump to the I# constructor and, if so, how did it's address get to that offset from %ebp? Thanks in advance for any assistance! Justin --- cut here ---- # Compiled with # # ghc -c Add.hs -fext-core -keep-s-files -fasm -O2 -ddump-to-file -ddump-stg -ddump-cmm # # Platform: Windows XP. # Processor Pentium 4. # GHC Version: 6.8.2. # # > {-# OPTIONS_GHC -fglasgow-exts -fbang-patterns #-} # > module Add (unsafeShiftR) # # > where # # > import Data.Bits # > import GHC.Base (uncheckedShiftRL#, Int(..), word2Int#, int2Word#) # # > I# a `unsafeShiftR` I# b = I# (word2Int# (int2Word# a `uncheckedShiftRL#` b)) # .data .align 4 .globl _Add_unsafeShiftR_closure _Add_unsafeShiftR_closure: .long _Add_unsafeShiftR_info # Specifies entry point for unsafeShiftR? .text .align 4,0x90 .long 33 .long 34 _s87_info: addl $8,%edi cmpl 92(%ebx),%edi # Stack check? ja .Lc8n movl 4(%ebp),%eax # Get value to be shifted movl 3(%esi),%ecx # Get shift amount shrl %cl,%eax # Shift value movl $_base_GHCziBase_Izh_con_info,-4(%edi) # Get ready to call I#? movl %eax,(%edi) # Constructor value leal -3(%edi),%esi # Address to what in I#? addl $8,%ebp # Make room on stack for next caller? jmp *(%ebp) # Go to constructor? Tail call? .Lc8n: movl $8,112(%ebx) jmp *-8(%ebx) # Error routine if stack overflows? .text .align 4,0x90 .long 1 .long 34 _s86_info: movl 4(%ebp),%eax # Testing what here? movl 3(%esi),%ecx movl %ecx,4(%ebp) movl %eax,%esi movl $_s87_info,(%ebp) testl $3,%esi jne _s87_info jmp *(%esi) .text .align 4,0x90 .long 131084 .long 0 .long 15 .globl _Add_unsafeShiftR_info _Add_unsafeShiftR_info: movl (%ebp),%esi movl $_s86_info,(%ebp) testl $3,%esi # Looking at StgRetInfoTable here? jne _s86_info # Enter body of unsafeShiftR? jmp *(%esi) .data .align 4 __module_registered: .long 0 .text .align 4,0x90 .globl ___stginit_Add_ ___stginit_Add_: cmpl $0,__module_registered jne .Lc8E .Lc8F: movl $1,__module_registered addl $-4,%ebp movl $___stginit_base_GHCziBase_,(%ebp) addl $-4,%ebp movl $___stginit_base_Prelude_,(%ebp) addl $-4,%ebp movl $___stginit_base_DataziBits_,(%ebp) .Lc8E: addl $4,%ebp jmp *-4(%ebp) .text .align 4,0x90 .globl ___stginit_Add ___stginit_Add: jmp ___stginit_Add_ .ident "GHC 6.8.2"

On Tue, Mar 04, 2008 at 05:07:03PM -0800, Justin Bailey wrote:
I'm trying to get a feel for the assembly output by GHC on my platform. Below is a module containing one function and the associated assembly. I've put in comments what I think is going on, but I'd appreciate it if anyone could give me some pointers. I'd really like to know three things:
* Why does _Add_unsafeShiftR_info check if (%esi) is 3? * What's going on in _s86_info? * At the end of _s87_info, 8 is added to %ebp and then jumped to. Is that a jump to the I# constructor and, if so, how did it's address get to that offset from %ebp?
Thanks in advance for any assistance!
It would be more helpful if you didn't try to go from Haskell to assembly in one step - it's a lot easier to understand each big step of the GHC pipeline individually. Haskell | \-> Core (ghc -ddump-simpl Foo.hs > Foo.core; or -fext-core if you want something ugly but parsable; an unrestricted but simple expression-functional language) | \-> STG (ghc -ddump-stg ...) (Much more regular than Core; more like functional C) | \-> C-- (ghc -ddump-cmm) (just what it says: Simplified C for compiler writers. The universal assembly language for the 21st century) | \-> assembly Stefan

On 3/4/08, Justin Bailey
I'm trying to get a feel for the assembly output by GHC on my platform. Below is a module containing one function and the associated assembly. I've put in comments what I think is going on, but I'd appreciate it if anyone could give me some pointers. I'd really like to know three things:
* Why does _Add_unsafeShiftR_info check if (%esi) is 3? * What's going on in _s86_info? * At the end of _s87_info, 8 is added to %ebp and then jumped to. Is that a jump to the I# constructor and, if so, how did it's address get to that offset from %ebp?
I agree with Stefan's advice: read the STG code, especially to answer the second two questions (-ddump-stg), and read the STG paper (if you haven't already): Implementing lazy functional languages on stock hardware: the Spineless Tagless G-machine, SL Peyton Jones, Journal of Functional Programming 2(2), Apr 1992, pp127-202. Then if anything's still not clear, ask on the mailing list. Cheers, Tim -- Tim Chevalier * http://cs.pdx.edu/~tjc * Often in error, never in doubt "The future is not google-able." -- William Gibson

Justin Bailey wrote:
I'm trying to get a feel for the assembly output by GHC on my platform. Below is a module containing one function and the associated assembly. I've put in comments what I think is going on, but I'd appreciate it if anyone could give me some pointers. I'd really like to know three things:
* Why does _Add_unsafeShiftR_info check if (%esi) is 3? * What's going on in _s86_info? * At the end of _s87_info, 8 is added to %ebp and then jumped to. Is that a jump to the I# constructor and, if so, how did it's address get to that offset from %ebp?
One point that seems to irritate you is that ghc does pointer tagging; the lower 2 bits of a pointer are zero if it points to an unevaluated closure (a thunk); otherwise they encode a constructor. (I'm not sure what happens in the case when there are more than 3 constructors. But in your case there's only one, I#, with tag 01.) Register usage (as far as I can see): %ebp - STG stack pointer %edi - Heap pointer (in particular cmpl 92(%ebx),%edi is a heap check and will trigger a GC) %esi - at start of *_info, points to most recently evaluated value. %ebx - points to some RTS global variables. Rough outline of the code: _Add_unsafeShiftR_info: find first argument, and evaluate it if it's not yet fully evaluated. continue at _s86_info. (note that a pointer to _s86_info, a continuation, is stored on the control stack) _s86_info: store the first argument value on the stack, load the second one, and evaluate it if necessary. continue at _s87_info. _s87_info: load both arguments, and compute result. The result is fully evaluated, so a tagged pointer is returned. (the -3 is -4, plus 1 for the tag) clean up the stack (the space for the two arguments is no longer needed) and jump to the continuation stored by the caller. HTH, Bertram
participants (4)
-
Bertram Felgenhauer
-
Justin Bailey
-
Stefan O'Rear
-
Tim Chevalier