
#12554: Testsuite exhibits large amount of framework failures
-------------------------------------+-------------------------------------
        Reporter:  Phyx-             |                Owner:
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Test Suite        |              Version:  8.0.1
      Resolution:                    |             Keywords:
Operating System:  Windows           |         Architecture:
 Type of failure:  Incorrect result  |  Unknown/Multiple
  at runtime                         |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by bgamari):

Alright, I think I see what is going on here. But first some background.

The testsuite driver runs each testcase in its own fresh directory to
ensure a clean test environment. The code responsible for this can be
found in `testlib.do_test()`. This looks something like the following,
{{{#!python
def do_test(name, way, func, args, files):
    opts = getTestOpts()
    shutil.rmtree(opts.testdir, ignore_errors=True)
    os.makedirs(opts.testdir)
    # Copy test files into testdir and run test
    ...
}}}

Note how we invoke `rmtree` with `ignore_errors=True`, presumably to
account for the fact that `testdir` may not exist (e.g. if we are running
in a fresh working directory). However, this unfortunately hides any
**real** errors that we may encounter while trying to delete `testdir`, in
which case we'll attempt to create `testdir` when it already exists.

This appears to be precisely what happens some fraction of the time on
Windows. Namely, if we rework `do_test` as follows,
{{{#!python
def do_test(name, way, func, args, files):
    opts = getTestOpts()
    if os.path.exists(opts.testdir):
        print('cleaning %s' % opts.testdir)
        shutil.rmtree(opts.testdir, ignore_errors=False)
    os.makedirs(opts.testdir)
    # Copy test files into testdir and run test
    ...
}}}

We find the following,
{{{
=====> TH_nameSpace(ext-interp) 39 of 42 [0, 3, 12]
cleaning ./th/TH_nameSpace.run
*** framework failure for TH_nameSpace(ext-interp) [Error 5] Access is denied: './th/TH_nameSpace.run/TH_nameSpace.exe'
}}}

I suspect this is due to the fact that executable images of running
programs are locked and therefore cannot be deleted. The strange thing
here is that the process has presumably already terminated. It seems like
the test process must still be in some sort of zombie state where it has
terminated and we have its exit code, but its resources (e.g. executable
image) have still not been released. As one would expect, adding a
`time.sleep(0.1)` before the `rmtree` hides the issue.

It's not clear to me what the proper solution is here.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12554#comment:8
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
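
The comment above leaves the proper fix open. One possible stopgap, sketched here under the assumption that the lock on the executable is released shortly after the process exits (as the `time.sleep(0.1)` observation suggests), is to retry the removal with a short backoff instead of ignoring errors. The helper name `rmtree_with_retry` and its parameters are hypothetical and not part of `testlib`.

```python
import os
import shutil
import time


def rmtree_with_retry(path, attempts=5, initial_delay=0.05):
    """Hypothetical helper: remove `path`, retrying on OSError with backoff."""
    delay = initial_delay
    for attempt in range(attempts):
        if not os.path.exists(path):
            # Nothing (left) to clean, e.g. a fresh working directory.
            return
        try:
            shutil.rmtree(path)
            return
        except OSError:
            # On Windows, "Access is denied" surfaces as an OSError subclass.
            if attempt == attempts - 1:
                raise  # Out of retries: report the real error.
            time.sleep(delay)
            delay *= 2  # Back off before the next attempt.
```

With something like this, `do_test` could call `rmtree_with_retry(opts.testdir)` before `os.makedirs(opts.testdir)`. A persistent failure would still surface as a framework failure rather than being hidden, but this only works around the delay and does not explain why the image stays locked.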