Re: [GHC] #12554: Testsuite exhibits large amount of framework failures

15 Oct 2016

      #12554: Testsuite exhibits large amount of framework failures
-------------------------------------+-------------------------------------
        Reporter:  Phyx-             |                Owner:
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Test Suite        |              Version:  8.0.1
      Resolution:                    |             Keywords:
Operating System:  Windows           |         Architecture:
 Type of failure:  Incorrect result  |  Unknown/Multiple
  at runtime                         |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by bgamari):

 Alright, I think I see what is going on here.

 But first, some background. The testsuite driver runs each test case in its
 own fresh directory to ensure a clean test environment. The code
 responsible for this can be found in `testlib.do_test()`, which looks
 something like the following,
 {{{#!python
 def do_test(name, way, func, args, files):
     opts = getTestOpts()
     shutil.rmtree(opts.testdir, ignore_errors=True)
     os.makedirs(opts.testdir)

     # Copy test files into testdir and run test
     ...
 }}}

 Note how we invoke `rmtree` with `ignore_errors=True`, presumably to
 account for the fact that `testdir` may not exist (e.g. if we are running
 in a fresh working directory). Unfortunately, this also hides any
 **real** errors that we may encounter while trying to delete `testdir`, in
 which case we'll attempt to create `testdir` when it already exists, and
 `os.makedirs` raises an error for an existing directory.
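
 To make the distinction concrete, here is a minimal sketch of a cleanup
 step that tolerates only a missing directory and lets every other error
 surface. It assumes Python 3 (where `FileNotFoundError` exists), and
 `fresh_testdir` is a hypothetical helper name, not something in `testlib`.

```python
import os
import shutil


def fresh_testdir(testdir):
    """Hypothetical helper: recreate testdir, tolerating only its absence."""
    try:
        shutil.rmtree(testdir)
    except FileNotFoundError:
        # Nothing to clean up, e.g. we are in a fresh working directory.
        pass
    # Any other OSError (such as "Access is denied") propagates to the
    # caller instead of being silently dropped.
    os.makedirs(testdir)
```

 With this shape, a failed deletion is reported at the `rmtree` call rather
 than showing up later as a confusing failure from `os.makedirs`.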

 This appears to be precisely what happens some fraction of the time on
 Windows. Namely, if we rework `do_test` as follows,
 {{{#!python
 def do_test(name, way, func, args, files):
     opts = getTestOpts()
     if os.path.exists(opts.testdir):
         print('cleaning %s' % opts.testdir)
         shutil.rmtree(opts.testdir, ignore_errors=False)
     os.makedirs(opts.testdir)

     # Copy test files into testdir and run test
     ...
 }}}
 We find the following,
 {{{
 =====> TH_nameSpace(ext-interp) 39 of 42 [0, 3, 12]
 cleaning ./th/TH_nameSpace.run
 *** framework failure for TH_nameSpace(ext-interp) [Error 5] Access is
 denied: './th/TH_nameSpace.run/TH_nameSpace.exe'
 }}}

 I suspect this is because Windows locks the executable images of running
 programs, so they cannot be deleted. The strange thing here is that the
 process has presumably already terminated. It seems like the test process
 must still be in some sort of zombie state where it has terminated and we
 have its exit code, but its resources (e.g. executable image) have not yet
 been released.

 As one would expect, adding a `time.sleep(0.1)` before the `rmtree` hides
 the issue. It's not clear to me what the proper solution is here.
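
 For illustration only, the fixed sleep could be replaced by a bounded
 retry around the deletion. This is a sketch under assumptions, not a
 proposed fix: it assumes Python 3, `rmtree_with_retry` and its parameters
 are hypothetical names, and it does nothing to explain why the executable
 image is still locked.

```python
import shutil
import time


def rmtree_with_retry(path, attempts=5, delay=0.1,
                      rmtree=shutil.rmtree, sleep=time.sleep):
    """Hypothetical helper: retry rmtree a bounded number of times.

    The rmtree and sleep parameters exist only so the behaviour can be
    exercised without touching the filesystem or actually waiting.
    """
    for attempt in range(attempts):
        try:
            rmtree(path)
            return
        except FileNotFoundError:
            # The directory is already gone; nothing left to do.
            return
        except OSError:
            # E.g. "Access is denied" while the image is still locked.
            if attempt == attempts - 1:
                raise  # Give up and report the real error.
            sleep(delay)
            delay *= 2  # Back off a little more before the next try.
```

 Unlike an unconditional `time.sleep(0.1)`, this only waits when the
 deletion actually fails, and it still raises the underlying error once
 the attempts are exhausted.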

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12554#comment:8
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler