Is this test summary good or bad?

Folks,
I'm running Mac OSX 10.4.7 on Intel. This is the result of running the ghc-regress suite of tests using a freshly updated ghc 6.5 that was bootstrapped using a binary distribution.
I suspect the framework failures were cases where tests got hung and I had to Ctrl-C them to let testing continue. I'm not sure, though.
OVERALL SUMMARY for test run started at Sun Jul 9 19:02:59 BST 2006
1090 total tests, which gave rise to
4985 test cases, of which
13 caused framework failures
1003 were skipped
3460 expected passes
22 expected failures
0 unexpected passes
448 unexpected failures
Unexpected failures:
10queens(ghci) CPUTime001(ghci) Chan001(ghci) IOError002(ghci) MVar001(opt,ghci) QSem001(ghci) QSemN001(ghci) SampleVar001(ghci) TH_bracket1(normal) TH_bracket2(normal) TH_bracket3(normal) TH_class1(normal) TH_dupdecl(normal) TH_exn(normal) TH_fail(normal) TH_genEx(normal) TH_mkName(normal) TH_recover(normal) TH_reifyDecl1(normal) TH_reifyType1(normal) TH_reifyType2(normal) TH_repE1(normal) TH_repE2(normal) TH_repE3(normal) TH_repGuard(normal) TH_repGuardOutput(normal) TH_repPatSig(normal) TH_repPrim(normal) TH_repPrimOutput(normal) TH_spliceDecl1(normal) TH_spliceDecl2(normal) TH_spliceDecl3(normal) TH_spliceDecl4(normal) TH_spliceE1(normal) TH_spliceE3(normal) TH_spliceE4(normal) TH_spliceE5(normal) TH_spliceE5_prof(normal) TH_spliceExpr1(normal) TH_spliceInst(normal) TH_tuple1(normal) TH_where(normal) addr001(ghci) andre_monad(ghci) andy_cherry(ghci) arith001(ghci) arith002(ghci) arith003(ghci) arith004(ghci) arith005(ghci,threaded1) arr001(ghci) arr002(ghci) arr003(ghci) arr004(ghci) arr005(ghci) arr006(ghci) arr007(ghci,threaded2) arr008(normal,ghci) arr009(profasm,ghci) arr010(ghci) arr011(ghci) arr012(ghci) arr013(ghci) arr014(ghci) arr015(ghci) arr016(ghci,threaded2) arr017(ghci) arrowlet1(normal) arrowrun001(ghci) arrowrun002(ghci,threaded2) arrowrun003(ghci) arrowrun004(profasm,ghci) barton-mangler-bug(ghci) bits(ghci) cabal01(normal) cc012(profasm)
cg001(ghci) cg002(prof,ghci) cg003(ghci) cg004(ghci) cg005(ghci) cg006(ghci) cg007(ghci) cg008(ghci) cg009(ghci) cg010(ghci) cg011(ghci) cg012(ghci) cg013(ghci) cg014(ghci) cg015(prof,ghci) cg016(ghci) cg017(optasm) char001(ghci) char002(ghci) cholewo-eval(ghci) church(ghci) conc001(ghci) conc002(ghci) conc003(ghci) conc006(ghci) conc007(ghci) conc008(ghci) conc027(ghci) conc049(ghci) conc051(ghci) concprog001(ghci) currentDirectory001(ghci) cvh_unboxing(ghci) datatype(ghci) diffArray001(prof,profasm,ghci) directory001(ghci) doesDirectoryExist001(ghci) drvfail001(normal) drvfail002(normal) drvrun001(ghci) drvrun002(ghci) drvrun003(ghci) drvrun004(ghci) drvrun005(ghci) drvrun006(ghci) drvrun007(profasm,ghci) drvrun008(ghci) drvrun009(ghci) drvrun010(ghci) drvrun011(ghci) drvrun012(ghci) drvrun013(ghci) drvrun014(ghci) drvrun015(ghci) drvrun016(ghci) drvrun017(ghci) drvrun018(ghci) drvrun019(ghci) dsrun001(ghci) dsrun002(ghci) dsrun003(ghci) dsrun004(prof,ghci) dsrun005(prof,ghci) dsrun006(ghci) dsrun007(ghci,threaded2) dsrun008(ghci,threaded2) dsrun009(prof,ghci) dsrun010(ghci) dsrun011(ghci) dsrun012(ghci) dsrun013(ghci) dynamic001(ghci) dynamic002(ghci) echo001(ghci) enum01(ghci) enum02(ghci) enum03(ghci) exceptions001(ghci,threaded2) exceptions002(ghci) exitWith001(ghci) ext1(optasm,ghci) fast2haskell(ghci) ffi-deriv1(opt) fileexist01(normal,ghci) finalization001(ghci) forkprocess01(ghci,threaded2) freeNames(ghci) fun_insts(ghci) galois_raytrace(ghci) genUpTo(ghci) geq(ghci) getArgs001(ghci) getC(ghci) getEnv001(prof,ghci) getPermissions001(prof) ghci001(ghci,ghci) ghci002(ghci) ghci003(ghci) ghci004(ghci) ghci005(ghci) ghci006(ghci) ghci007(ghci) ghci008(ghci) ghci009(ghci) ghci011(ghci) ghci012(ghci) ghci013(ghci) ghci014(ghci) ghci015(ghci) ghciprog004(normal) gmapQ-assoc(ghci) gread(ghci) gshow(ghci) gzip(ghci) hClose001(optasm,ghci) hFileSize001(ghci) hTell001(ghci) hTell002(ghci) hash001(ghci) ioref001(ghci) jl_defaults(ghci) joao-circular(ghci) 
jq_readsPrec(ghci) jtod_circint(ghci) jules_xref(optasm,ghci) jules_xref2(optasm,ghci) labels(ghci) launchbury(ghci) lennart_range(ghci) lex(ghci) lexNum(ghci) life_space_leak(ghci) list001(profasm,ghci) maessen_hashtab (normal,opt,optasm,prof,profasm,ghci,threaded1,threaded2) memo001(ghci) memo002(ghci) mod100(normal) mod118(normal) mod30(normal) mod53(normal) mod55(normal) mod56(normal) nested-datatypes(ghci) net001(ghci,threaded2) net002(ghci) newtype(ghci) north_array(ghci) num001(ghci) num002(ghci) num003(ghci) num004(ghci) num005(ghci) num006(prof,ghci) num007(ghci) num008(optasm,ghci) packedstring001(ghci) paradise(ghci) performGC001(ghci) process001(ghci) process002(ghci) prog001(ghci) prog002(ghci) prog003(ghci) prog005(ghci) prog006(ghci) queryfdoption01(profasm) rand001(ghci) ratio001(ghci) read001(ghci) read015(opt) readLitChar(ghci) record_upd(ghci) regex001(ghci) reify(ghci) rittri(ghci) rn.prog006(normal) rn017(normal) rn022(prof) rn026(normal) rn028(opt) rn031(normal) rn035(profasm) rnfail043(normal) rw(normal) sanders_array(ghci) seward-space-leak(ghci) show001(ghci) signals001(optasm,ghci) signals002(ghci) simpl005(profasm) simpl009(prof) simpl011(normal,opt,optasm,prof,profasm) simpl014(profasm) stableptr001(ghci) stableptr003(optasm,ghci) stableptr004(profasm,ghci) stableptr005(prof,ghci) strict_anns(ghci) strings(ghci) system001(ghci) tc002(normal) tc056(normal,opt,optasm,prof,profasm) tc076(opt,optasm) tc092(normal,opt,optasm) tc097(normal,opt,optasm) tc102(normal,opt,optasm,prof,profasm) tc104(prof) tc106(prof) tc109(opt) tc134(normal,opt,optasm,prof,profasm) tc135(normal) tc136(normal,opt,optasm,prof,profasm) tc137(opt) tc141(normal,opt,optasm,prof,profasm) tc150(normal) tcfail002(normal) tcfail004(normal) tcfail005(normal) tcfail010(normal) tcfail013(normal) tcfail014(normal) tcfail018(normal) tcfail040(normal) tcfail043(normal) tcfail046(normal) tcfail061(normal) tcfail068(normal) tcfail071(normal) tcfail072(normal) tcrun001(ghci) 
tcrun002(opt,ghci) tcrun003(ghci) tcrun004(ghci,threaded2) tcrun005(ghci) tcrun006(ghci,threaded1) tcrun007(ghci) tcrun008(ghci) tcrun009(ghci) tcrun010(ghci) tcrun011(ghci) tcrun012(ghci) tcrun013(ghci) tcrun014(ghci) tcrun015(ghci,threaded2) tcrun016(ghci) tcrun017(ghci) tcrun018(ghci) tcrun019(ghci) tcrun020(ghci) tcrun021(ghci) tcrun022(ghci) tcrun023(ghci) tcrun024(ghci) tcrun025(ghci) tcrun027(ghci) tcrun028(normal,ghci) tcrun029(ghci) tcrun030(ghci) tcrun031(ghci) tcrun032(ghci) tcrun033(ghci) tcrun034(ghci) tcrun035(normal,opt,optasm,prof,profasm,ghci,threaded1,threaded2) tcrun036(ghci) tcrun037(ghci) testeq2(ghci) text001(ghci) thurston-modular-arith(ghci) time002(ghci,threaded1) time003(ghci,threaded1) time004(ghci) trace001(ghci) tree(ghci) tup001(ghci) twin(ghci) typecheck.testeq1(ghci) unicode001(ghci,ghci) unicode002(ghci) uri001(ghci) utf8_002(normal) utf8_003(normal) utf8_004(normal) utf8_005(normal) weak001(ghci) where(ghci) xmlish(ghci) -- http://wagerlabs.com/

joelr1:
Folks,
I'm running Mac OSX 10.4.7 on Intel. This is the result of running the ghc-regress suite of tests using a freshly updated ghc 6.5 that was bootstrapped using a binary distribution.
I suspect the framework failures were cases where tests got hung and I had to Ctrl-C them to let testing continue. I'm not sure, though.
OVERALL SUMMARY for test run started at Sun Jul 9 19:02:59 BST 2006 1090 total tests, which gave rise to 4985 test cases, of which 13 caused framework failures 1003 were skipped
3460 expected passes 22 expected failures 0 unexpected passes 448 unexpected failures
Generally ok, except ghc is broken for some reason. Using the wrong stage for the test? You'll need to use the stage2 ghc. -- Don

dons:
joelr1:
Folks,
I'm running Mac OSX 10.4.7 on Intel. This is the result of running the ghc-regress suite of tests using a freshly updated ghc 6.5 that was bootstrapped using a binary distribution.
I suspect the framework failures were cases where tests got hung and I had to Ctrl-C them to let testing continue. I'm not sure, though.
OVERALL SUMMARY for test run started at Sun Jul 9 19:02:59 BST 2006 1090 total tests, which gave rise to 4985 test cases, of which 13 caused framework failures 1003 were skipped
3460 expected passes 22 expected failures 0 unexpected passes 448 unexpected failures
Generally ok, except ghc is broken for some reason.
Oops :) I meant that ghci is broken for some reason. ghc itself looked ok :)
Using the wrong stage for the test? You'll need to use the stage2 ghc.
-- Don

This is using stage2. Does it look any better?
OVERALL SUMMARY for test run started at Mon Jul 10 15:11:22 BST 2006
952 total tests, which gave rise to
4583 test cases, of which
11 caused framework failures
1099 were skipped
3185 expected passes
24 expected failures
0 unexpected passes
217 unexpected failures
Unexpected failures:
QSemN001(ghci) TH_class1(normal) TH_reifyType2(normal) TH_repGuard(normal) TH_spliceE5(normal) TH_where(normal) andre_monad(optasm,prof,profasm) andy_cherry(normal,prof,threaded1,threaded2) arith002(opt) arith003(threaded2) arith010(normal,prof) arith017(prof,profasm) arr001(prof) arr003(optasm) arr004(profasm) arr005(threaded1) barton-mangler-bug(normal) cabal01(normal) cc012(normal) cg009(optasm) cg010(profasm) cholewo-eval(prof) conc007(ghci) conc009(ghci) conc013(profasm,ghci) conc021(optasm) conc022(optasm) conc027(normal) conc030(threaded1) conc034(opt) conc037(threaded2) conc039(threaded1) conc045(ghci) conc049(threaded1) driver011(normal) driver012(normal) driver015(normal) driver034(normal) driver042(normal) drvfail001(normal) drvfail002(normal) drvfail005(normal) drvfail006(normal) drvfail007(normal) drvfail008(normal) drvfail009(normal) drvfail010(normal) drvrun009(threaded2) drvrun017(threaded2) dsrun004(normal) dsrun013(threaded2) exitWith001(optasm,prof) fileexist01(normal) forkprocess01(normal,threaded2) fun_insts(opt,optasm) galois_raytrace(normal) getC(optasm) getPermissions001(opt) ghci008(ghci) hPutBuf002(optasm) hash001(prof,profasm) ioeGetErrorString001(threaded1) isEOF001(prof,profasm) jl_defaults(threaded2) jtod_circint(normal,profasm) jules_xref(optasm) jules_xref2(ghci) launchbury(ghci) lennart_range(normal,prof,threaded1) lex(opt,profasm) life_space_leak(normal) maessen_hashtab (normal,opt,optasm,prof,profasm,ghci,threaded1,threaded2) mod53(normal) mod55(normal) mod56(normal) north_array(normal,prof,profasm) num003(prof) num004(normal,opt,profasm) num008(profasm,threaded1) num010(prof) process003(normal)
read001(optasm) read010(profasm) record_upd(optasm,threaded1) reify(opt) rename.prog001(normal,profasm) rename.prog005(normal) rittri(opt,prof) rn.prog006(normal) rn003(prof) rn005(profasm) rn006(opt) rn009(optasm) rn012(prof,profasm) rn013(normal,profasm) rn017(prof) rn020(opt,optasm,profasm) rn022(opt) rn023(opt) rn024(opt) rn028(profasm) rn031(normal,profasm) rn032(prof) rn033(opt) rn035(optasm) rn037(prof,profasm) rn041(opt,profasm) rn044(prof) rn046(optasm) rn048(prof) rn050(opt) rw(normal) sanders_array(optasm) signals002(ghci) simpl002(normal,opt) simpl003(optasm) simpl005(normal,optasm) simpl009(normal) simpl010(opt,optasm) simpl011(normal,opt,optasm,prof,profasm) simplrun005(opt) spec001(prof) tc005(opt) tc007(optasm) tc009(prof) tc012(prof,profasm) tc013(normal) tc015(optasm) tc017(profasm) tc018(optasm) tc021(normal) tc024(optasm,profasm) tc025(optasm) tc026(optasm,profasm) tc027(optasm) tcfail002(normal) tcfail004(normal) tcfail005(normal) tcfail007(normal) tcfail010(normal) tcrun001(threaded2) tcrun004(opt) tcrun005(opt,threaded1) tcrun006(profasm) tcrun007(prof) tcrun008(threaded2) thurston-modular-arith(opt,profasm) timing001(opt,prof) timing002(opt) timing003(opt,prof,profasm) tree(ghci) typecheck.prog001(normal,prof) typecheck.prog002(optasm) typecheck.testeq1(ghci,threaded2) unicode001(normal,ghci,threaded1) utf8_002(normal) utf8_003(normal) utf8_004(normal) utf8_005(normal)
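To put a rough number on "any better", here is a small Python sketch that compares the unexpected-failure rate of the two Mac runs, using only the totals quoted in the two summaries. The `Summary` class and the run labels are made up for illustration; they are not part of the testsuite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    """Totals copied from an OVERALL SUMMARY block (hypothetical container)."""
    label: str
    test_cases: int
    unexpected_failures: int

    def failure_rate(self) -> float:
        # Share of all test cases that failed unexpectedly.
        return self.unexpected_failures / self.test_cases


# Numbers taken from the two summaries posted in this thread.
RUNS = [
    Summary("stage1, Jul 9", 4985, 448),
    Summary("stage2, Jul 10", 4583, 217),
]

for run in RUNS:
    print(f"{run.label}: {run.failure_rate():.1%} unexpected failures")
```

With the figures above this prints 9.0% for the first run and 4.7% for the stage2 run, so the rate roughly halves.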

joelr1:
This is using stage2. Does it look any better?
OVERALL SUMMARY for test run started at Mon Jul 10 15:11:22 BST 2006 952 total tests, which gave rise to 4583 test cases, of which 11 caused framework failures 1099 were skipped
3185 expected passes 24 expected failures 0 unexpected passes 217 unexpected failures
Not too bad (mips64 is around the same), but not the same as the linux head:
OVERALL SUMMARY for test run started at Sun Jul 9 22:35:16 BST 2006
1401 total tests, which gave rise to
7573 test cases, of which
0 caused framework failures
1357 were skipped
6056 expected passes
50 expected failures
0 unexpected passes
110 unexpected failures
So the next step would be to diff the two results, and individually run the tests that failed. Quite possibly many are failing for non-scary reasons (like missing libs, or missing arch-specific output cases).
http://www.haskell.org/pipermail/cvs-all/2006-July/048188.html
-- Don
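One possible way to do the diff suggested above, as a minimal Python sketch: parse each run's "Unexpected failures" list into (test, way) pairs and compare the two sets. The names `parse_failures` and `diff_failures` are invented for illustration, and the regular expression is an assumption about the list format based on the entries shown in this thread. The second sample list is made up, since the Linux failure list is not reproduced here.

```python
import re

# Matches entries such as "arr008(normal,ghci)". Whitespace before "(" is
# allowed because the lists above contain "maessen_hashtab (normal,...)".
ENTRY = re.compile(r"([\w.\-]+)\s*\(([^)]*)\)")


def parse_failures(text):
    """Map each test name to the set of ways in which it failed."""
    failures = {}
    for name, ways in ENTRY.findall(text):
        cleaned = {w.strip() for w in ways.split(",") if w.strip()}
        failures.setdefault(name, set()).update(cleaned)
    return failures


def diff_failures(ours, theirs):
    """Return (only_ours, only_theirs) as sorted lists of (test, way) pairs."""
    a = {(test, way) for test, ways in ours.items() for way in ways}
    b = {(test, way) for test, ways in theirs.items() for way in ways}
    return sorted(a - b), sorted(b - a)


if __name__ == "__main__":
    # First list: a few entries from the stage2 run above.
    mac = parse_failures("QSemN001(ghci) TH_class1(normal) maessen_hashtab (normal,opt)")
    # Second list: invented sample data, NOT the real Linux results.
    other = parse_failures("TH_class1(normal) conc007(ghci)")
    only_mac, only_other = diff_failures(mac, other)
    print("only in first run: ", only_mac)
    print("only in second run:", only_other)
```

The pairs that show up only on one side are the ones worth running individually.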

It's a little bit more complicated for me since some tests just plain hang. I will investigate, though.
On Jul 11, 2006, at 2:56 AM, Donald Bruce Stewart wrote:
Not too bad (mips64 is around the same), but not the same as the linux head: [...] So the next step would be to diff the two results, and individually run the tests that failed. Quite possibly many are failing for non-scary reasons (like missing libs, or missing arch-specific output cases)

On Jul 11, 2006, at 11:00 AM, Simon Marlow wrote:
Which ones hang? Could you take one of the hanging tests, compile it with -debug, run with +RTS -Ds, and send us the output?
What ends up happening is this:
28683 p2 S 0:00.11 ../../timeout/timeout 300 cd ./typecheck/should_compile && '/Users/joelr/work/Haskell/ghc/compiler/stage2/ghc-inplace' -no-recomp -dcore-lint -dcmm-lint -Di386_apple_darwin -c tc033.hs -fno-warn-incomplete-patterns >tc033.comp.stderr 2>&1
29125 p2 R 1:53.48 ../../timeout/timeout 300 cd ./typecheck/should_fail && '/Users/joelr/work/Haskell/ghc/compiler/stage2/ghc-inplace' -no-recomp -dcore-lint -dcmm-lint -Di386_apple_darwin -c tcfail011.hs >tcfail011.comp.stderr 2>&1
Now, these things have been running there forever and I'm not even sure it's a Haskell problem. I suppose the test harness should have terminated the test after 300 seconds but didn't.
If I try to re-run the first process by hand it finishes instantly. If I try to re-run the whole thing as above, putting everything after 300 in double quotes it also finishes instantly.
-- http://wagerlabs.com/

Joel Reymont wrote:
On Jul 11, 2006, at 11:00 AM, Simon Marlow wrote:
Which ones hang? Could you take one of the hanging tests, compile it with -debug, run with +RTS -Ds, and send us the output?
What ends up happening is this:
28683 p2 S 0:00.11 ../../timeout/timeout 300 cd ./typecheck/should_compile && '/Users/joelr/work/Haskell/ghc/compiler/stage2/ghc-inplace' -no-recomp -dcore-lint -dcmm-lint -Di386_apple_darwin -c tc033.hs -fno-warn-incomplete-patterns >tc033.comp.stderr 2>&1
29125 p2 R 1:53.48 ../../timeout/timeout 300 cd ./typecheck/should_fail && '/Users/joelr/work/Haskell/ghc/compiler/stage2/ghc-inplace' -no-recomp -dcore-lint -dcmm-lint -Di386_apple_darwin -c tcfail011.hs >tcfail011.comp.stderr 2>&1
Now, these things have been running there forever and I'm not even sure it's a Haskell problem. I suppose the test harness should have terminated the test after 300 seconds but didn't.
If I try to re-run the first process by hand it finishes instantly. If I try to re-run the whole thing as above, putting everything after 300 in double quotes it also finishes instantly.
The timeout program is a bit of a bugbear. It uses forkProcess in a non-trivial way, and has caused me many problems (forkProcess in the threaded RTS is a bit difficult to get right, as you might imagine). So I suspect some kind of bug around forkProcess on MacOS X. If you could capture the +RTS -Ds output from a timeout that hangs, that would help. You probably need to run something like 'timeout 1 true +RTS -Ds 2>&1 >log' in a loop until it hangs.
Cheers, Simon
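A rough Python sketch of the "run it in a loop until it hangs" idea, in case it helps. `run_until_hang` is a hypothetical helper, not part of the GHC testsuite. It assumes a run counts as hung once it exceeds a wall-clock deadline, and that the command was built with -debug so that +RTS -Ds is accepted, as requested earlier in the thread. Both stdout and stderr go to the log file, which is overwritten on every run so that the log left behind belongs to the run that hung.

```python
import subprocess


def run_until_hang(cmd, deadline_s, max_runs, log_path):
    """Re-run cmd until one run exceeds deadline_s seconds.

    Returns the 1-based index of the run that hung, or None if all
    max_runs runs finished in time. log_path holds the combined
    stdout/stderr of the most recent run.
    """
    for attempt in range(1, max_runs + 1):
        with open(log_path, "wb") as log:
            try:
                subprocess.run(
                    cmd,
                    stdout=log,
                    stderr=subprocess.STDOUT,
                    timeout=deadline_s,
                )
            except subprocess.TimeoutExpired:
                # subprocess.run kills only the direct child here; any
                # processes that child forked may be left running.
                return attempt
    return None
```

For the case in this thread it might be called as `run_until_hang(["../../timeout/timeout", "1", "true", "+RTS", "-Ds"], deadline_s=10, max_runs=1000, log_path="log")`. The deadline and run count are arbitrary, and the path to timeout needs adjusting to wherever it lives in your tree.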

On Jul 10, 2006, at 2:29 PM, Donald Bruce Stewart wrote:
Using the wrong stage for the test? You'll need to use the stage2 ghc.
How do I do this? I just ran make in the tests directory and it indeed picked up stage1 ghc. Thanks, Joel -- http://wagerlabs.com/

On Jul 10, 2006, at 2:29 PM, Donald Bruce Stewart wrote:
Using the wrong stage for the test? You'll need to use the stage2 ghc.
make stage=2 does it but I wish this was documented. -- http://wagerlabs.com/