[GHC] #8048: Register spilling produces ineffecient/highly contending code

10 Jul 2013

      #8048: Register spilling produces ineffecient/highly contending code
-------------------------------------+-------------------------------------
       Reporter:  schyler            |             Owner:
           Type:  bug                |            Status:  new
       Priority:  normal             |         Milestone:  _|_
      Component:  Compiler           |           Version:  7.6.3
       Keywords:  register           |  Operating System:  Unknown/Multiple
  allocator spill                    |   Type of failure:  Runtime
   Architecture:  Unknown/Multiple   |  performance bug
     Difficulty:  Unknown            |         Test Case:
     Blocked By:                     |          Blocking:
Related Tickets:                     |
-------------------------------------+-------------------------------------
 The native codegen and llvm both produce ineffecient code for functions
 using structures with many strict fields or unboxed values.

 Consider the following program:
 {{{
 {-# LANGUAGE BangPatterns #-}
 module Spill where

 import GHC.Exts

 data S = S !Int !Int !Int !Int !Int !Int !Int !Int !Int

 spill :: S -> S -> S -> S
 spill (S !a !b !c !d !e !f !g !h !i) (S !j !k !l !m !n !o !p !q !r) (S !s
 !t !u !v !w !x !y !z _)
  = S (a + j + s) (b + c) (k + r) (a + b + c + d + e + f + g + h + i) (j +
 k + l + m + n + o + p + q + r) (s + t + u + v + w + x + y + z) (a + b + c)
 (j + k + l) (s + t + u)
 }}}

 Parts of the code produced for this (which is identical regardless of
 -funbox-strict-fields) looks like:
 {{{
 _cnc:
     addq $80,%r12
     cmpq 144(%r13),%r12
     ja _cni
     movq $Spill.S_con_info,-72(%r12)
     movq 32(%rbp),%rax
     movq %rax,-64(%r12)
     movq 24(%rbp),%rax
     movq %rax,-56(%r12)
     movq 16(%rbp),%rax
     movq %rax,-48(%r12)
     movq 8(%rbp),%rax
     movq %rax,-40(%r12)
     movq 40(%rbp),%rax
     movq %rax,-32(%r12)
     movq 48(%rbp),%rax
     movq %rax,-24(%r12)
     movq 56(%rbp),%rax
     movq %rax,-16(%r12)
     movq 64(%rbp),%rax
     movq %rax,-8(%r12)
     movq 7(%rbx),%rax
     movq %rax,0(%r12)
     leaq -71(%r12),%rbx
     addq $72,%rbp
     jmp *0(%rbp)
 }}}
 {{{
 _csv:
     movq 63(%rbx),%rax
     movq %rax,-56(%rbp)
     movq 55(%rbx),%rax
     movq %rax,-48(%rbp)
     movq 47(%rbx),%rax
     movq %rax,-40(%rbp)
     movq 39(%rbx),%rax
     movq %rax,-32(%rbp)
     movq 31(%rbx),%rax
     movq %rax,-24(%rbp)
     movq 23(%rbx),%rax
     movq %rax,-16(%rbp)
     movq 71(%rbx),%rax
     movq %rax,-8(%rbp)
     movq 15(%rbx),%rax
     movq %rax,0(%rbp)
 }}}

 And likewise for LLVM:
 {{{
 .LBB10_1: # %coZ
         movq    7(%rbx), %rcx
         movq    $Spill_S_con_info, 8(%rax)
         movq    8(%rbp), %rdx
         movq    %rdx, 16(%rax)
         movq    16(%rbp), %rdx
         movq    %rdx, 24(%rax)
         movq    24(%rbp), %rdx
         movq    %rdx, 32(%rax)
         movq    32(%rbp), %rdx
         movq    %rdx, 40(%rax)
         movq    40(%rbp), %rdx
         movq    %rdx, 48(%rax)
         movq    48(%rbp), %rdx
         movq    %rdx, 56(%rax)
         movq    56(%rbp), %rdx
         movq    %rdx, 64(%rax)
         movq    64(%rbp), %rdx
         movq    %rdx, 72(%rax)
         movq    %rcx, (%r12)
         movq    72(%rbp), %rax
         leaq    72(%rbp), %rbp
         leaq    -71(%r12), %rbx
         jmpq    *%rax  # TAILCALL
 }}}

 Quoting from #ghc "the [register allocator] core algo is '96 vintage".
 Improvements are needed;

 * Take into consideration pipelining and handle spills less dramatically,
 attempting to reduce register contention
 * Sink memory reads in order to reduce register pressure

-- 
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8048
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler