
After switching to git, I discovered that ghci is interpreting a lot of modules when it should have loaded the .o files (i.e. I get 'SomeModule (interpreted)' instead of 'Skipped SomeModule').
It turns out that git checkouts update the modtime on checked-out files, even when they get reverted back to their original contents. Shake has an option ChangeModtimeAndDigestInput to check file contents in addition to modtime to not get fooled by this, but ghc is still doing a plain modtime check. So shake correctly skips the rebuild on those modules, but ghci recompiles them anyway. This means I'm better off just disabling shake's digest check, since otherwise I can just never recompile that stuff at all.
Would it be reasonable to do the same kind of check as shake in ghc? Namely, shake does a quick check if modtime has changed, but even if it has, it checks the file contents digest to make sure. My understanding is that ghc does the quick modtime check, and then does an expensive interface check. This would augment that to become a quick modtime check, then a quick-ish digest check, and then the expensive interface check.
I guess the old input file digest will have to be stored somewhere, presumably in the .hi file, so it's not a totally trivial change. The benefit should be that anyone working with git should be able to reuse more .o files after branch checkouts.
The two relevant places seem to be GHC.loadModule and DriverPipeline.runPhase. I'm willing to have a go at this if people think it's a good idea and I can get some pointers on the .hi plumbing if I get hung up there.
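To make the proposed order of checks concrete, here is a minimal standalone sketch, not GHC code. It assumes the stored digest can be modelled as a Fingerprint from base's GHC.Fingerprint. The names SourceStamp, Verdict, takeStamp and checkSource are hypothetical and exist only for this illustration, and where the stamp would actually be stored (presumably the .hi file) is left out entirely.

```haskell
import Data.Time.Clock (UTCTime, addUTCTime)
import GHC.Fingerprint (Fingerprint, getFileHash)
import System.Directory
  ( getModificationTime, getTemporaryDirectory, removeFile, setModificationTime )
import System.FilePath ((</>))

-- | Hypothetical record of what was seen at the last successful compile.
data SourceStamp = SourceStamp
  { stampModTime :: UTCTime
  , stampDigest  :: Fingerprint
  } deriving (Eq, Show)

-- | Outcome of comparing a source file against its stored stamp.
data Verdict
  = UpToDate     -- modtime unchanged, nothing else was checked
  | TouchedOnly  -- modtime moved, but the contents hash the same
  | Changed      -- contents really differ
  deriving (Eq, Show)

-- | Capture the current modtime and content digest of a file.
takeStamp :: FilePath -> IO SourceStamp
takeStamp path = SourceStamp <$> getModificationTime path <*> getFileHash path

-- | Quick modtime check first; hash the file only if the modtime moved.
checkSource :: SourceStamp -> FilePath -> IO Verdict
checkSource old path = do
  mtime <- getModificationTime path
  if mtime == stampModTime old
    then pure UpToDate
    else do
      digest <- getFileHash path
      pure (if digest == stampDigest old then TouchedOnly else Changed)

main :: IO ()
main = do
  tmp <- getTemporaryDirectory
  let path = tmp </> "DigestCheckDemo.hs"
  writeFile path "module Demo where\n"
  stamp <- takeStamp path
  -- Simulate a checkout that rewrites the same bytes: only the modtime moves.
  setModificationTime path (addUTCTime 10 (stampModTime stamp))
  checkSource stamp path >>= print
  -- Simulate a real edit: new bytes and a new modtime.
  writeFile path "module Demo where\nx :: Int\nx = 1\n"
  setModificationTime path (addUTCTime 20 (stampModTime stamp))
  checkSource stamp path >>= print
  removeFile path
```

In this sketch only a Changed verdict would go on to the expensive interface check. TouchedOnly is the case a branch round trip produces, and the demo should print it first, followed by Changed.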

> It turns out that git checkouts update the modtime on checked-out files, even when they get reverted back to their original contents.
Looks to me the problem's right here. Namely, git checkout.
If the contents didn't change, the modtime shouldn't either. What's the
reason behind changing it?
Have you brought this up to the git maintainers? Compensating for flaws in
co-tools costs code and complexity in ghc we would rather do without.
In the meantime, it shouldn't be hard to kludge up some shell scripts that
run before and after git checkout to reset the modtime back to what it
should be.
On Wednesday, May 30, 2018, Evan Laforge
-- Kim-Ee

On Tue, May 29, 2018 at 11:46 PM, Kim-Ee Yeoh
>> It turns out that git checkouts update the modtime on checked-out files, even when they get reverted back to their original contents.
> Looks to me the problem's right here. Namely, git checkout.
> If the contents didn't change, the modtime shouldn't either. What's the reason behind changing it?
The contents do change, but then they change back again. Say you visit another branch then come back.
> Have you brought this up to the git maintainers? Compensating for flaws in co-tools costs code and complexity in ghc we would rather do without.
Here's an explanation with some links: https://confluence.atlassian.com/bbkb/preserving-file-timestamps-with-git-an...
> In the meantime, it shouldn't be hard to kludge up some shell scripts that run before and after git checkout to reset the modtime back to what it should be.
That sounds like code and complexity too! Only it would be repeated in every repo. And it sounds pretty hard too. I'd have to keep and update a Map (Branch, FilePath) ModTime, and hook every checkout to record and restore every modified file. At that point I've more or less written my own checkout command.
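For a sense of the bookkeeping involved, here is a rough sketch of just the record and restore halves, assuming ModTime is a UTCTime and Branch is a plain String. The names ModTimes, recordModTimes and restoreModTimes are made up for this illustration. It does not call git, does not persist the map between runs, and does not check whether a file's contents changed on the branch in the meantime, all of which a real hook would need.

```haskell
import qualified Data.Map.Strict as Map
import Data.Time.Clock (UTCTime, addUTCTime, diffUTCTime)
import System.Directory
  ( getModificationTime, getTemporaryDirectory, removeFile, setModificationTime )
import System.FilePath ((</>))

type Branch = String

-- | The Map (Branch, FilePath) ModTime from the text, with UTCTime as ModTime.
type ModTimes = Map.Map (Branch, FilePath) UTCTime

-- | Before leaving a branch: remember the modtime of each listed file.
recordModTimes :: Branch -> [FilePath] -> ModTimes -> IO ModTimes
recordModTimes branch paths store = do
  stamps <- mapM getModificationTime paths
  let entries = Map.fromList (zip (map ((,) branch) paths) stamps)
  pure (Map.union entries store)  -- union is left-biased, so new entries win

-- | After arriving on a branch: put back every modtime we have on record.
restoreModTimes :: Branch -> [FilePath] -> ModTimes -> IO ()
restoreModTimes branch paths store = mapM_ restore paths
  where
    restore path = case Map.lookup (branch, path) store of
      Just t  -> setModificationTime path t
      Nothing -> pure ()  -- never recorded on this branch: leave it alone

main :: IO ()
main = do
  tmp <- getTemporaryDirectory
  let path = tmp </> "ModTimeDemo.hs"
  writeFile path "module Demo where\n"
  before <- getModificationTime path
  store <- recordModTimes "master" [path] Map.empty
  -- Stand-in for switching away and back: the modtime gets bumped.
  setModificationTime path (addUTCTime 3600 before)
  restoreModTimes "master" [path] store
  after <- getModificationTime path
  -- Report whether the restored modtime is within a second of the original.
  print (abs (diffUTCTime after before) < 1)
  removeFile path
```

Even this much leaves out working out which files a checkout touched and wiring the two functions into every checkout, which is where most of the effort would go.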