
#9058: System.IO.openTempFile does not scale
------------------------------------+-------------------------------------
       Reporter:  slyfox            |             Owner:
           Type:  bug               |            Status:  new
       Priority:  normal            |         Milestone:
      Component:  Compiler          |           Version:  7.8.2
       Keywords:                    |  Operating System:  Unknown/Multiple
   Architecture:  Unknown/Multiple  |   Type of failure:  None/Unknown
     Difficulty:  Unknown           |         Test Case:
     Blocked By:                    |          Blocking:
Related Tickets:                    |
------------------------------------+-------------------------------------
 While searching for a bug in darcs http://bugs.darcs.net/issue2364 I
 noticed a very bad property of openTempFile: its naming pattern is very
 predictable, and creating temp files costs O(n^2) in the number of
 already created temp files.

 Predictability lets very fun bugs survive in buggy programs, like:

 {{{
 thread1:
     (fn, fh) <- openTempFile "." "hello"
     renameFile fn "something"
     -- some time after
     when (some_rare_buggy_condition) $
         -- oops, reused temp name, but too late, other thread killed it
         writeFile fn contents
 thread2:
     (fn, fh) <- openTempFile "." "hello"
     workWithFn fn -- nobody should touch it, right?
 }}}

 It's _very_ hard to debug data corruption when all temp files are named
 "foo${pid}" and sometimes "foo${pid+1}".

 And a more serious bug: the more threads you have trying to create
 similarly named temp files, the more performance drops.

 The attached program shows the following numbers:

 {{{
 $ time ./bench-temps same 2000

 real    0m2.795s
 user    0m1.516s
 sys     0m1.190s

 $ time ./bench-temps diff 2000

 real    0m0.161s
 user    0m0.043s
 sys     0m0.115s
 }}}

 It's an open() storm that grows as O(N^2).

 https://github.com/ghc/ghc/blob/master/libraries/base/System/IO.hs#L465
 {{{
     FileExists -> findTempName (x + 1)
 }}}

 This is the source of the problem.

 I'd suggest always using a random name for it. For portability reasons I
 suggest adding at least an insecure random '''rand()''' value from the C
 library. That way we will succeed in opening the temp file on the first
 attempt.
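 To make the suggestion concrete: if k files with the same template already exist, the retry loop quoted above probes up to k names before it finds a free one, so N creations cost roughly N(N+1)/2 open() attempts in the worst case. The following is a caller-side sketch of the random-name idea, not a patch to base. The names `randomTemplate` and `openRandomTempFile` are hypothetical, and `rand()` is left unseeded here, so the tag sequence repeats between runs; on a collision `openTempFile` still falls back to its own retry loop.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CInt (..))
import Numeric (showHex)
import System.IO (Handle, openTempFile)

-- C library rand(), as suggested in the ticket. Insecure, and unseeded in
-- this sketch, so the same tags come back on every run of the program.
foreign import ccall unsafe "stdlib.h rand" c_rand :: IO CInt

-- Hypothetical helper: put a hex tag in front of the caller's template so
-- that concurrent callers rarely start from the same candidate name.
randomTemplate :: Int -> String -> String
randomTemplate tag template = showHex tag ('-' : template)

-- Hypothetical wrapper around openTempFile. The file is still created by
-- openTempFile itself; only the template it starts from is randomized.
openRandomTempFile :: FilePath -> String -> IO (FilePath, Handle)
openRandomTempFile dir template = do
  tag <- c_rand
  openTempFile dir (randomTemplate (fromIntegral tag) template)
```

 A call such as `openRandomTempFile "." "hello"` would then be used in place of `openTempFile "." "hello"`; the tag goes in front of the template so that any file extension in it is left untouched.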
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9058
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler