Hello all,

I've been spending some of my winter break trying my hand at compiling GHC, with a view to contributing down the line.

I've got it working, but I ran into a few things along the way that I figure might be worth fixing and/or documenting. In the approximate order I encountered them:
  • The first pacman mirror on the list bundled with MSYS2 is down, with the result that every download pacman makes takes ~10sec longer than it should. It downloads a lot, so that really adds up - but it's easy to fix: just run "pacman -Sy pacman-mirrors" before doing anything else with it. Is that worth mentioning on the wiki? I was thinking a line on https://ghc.haskell.org/trac/ghc/wiki/Building/Preparation/Windows could be helpful.
  • That page mentions "If you see errors related to fork(), try closing and reopening the shell" - I've determined that you can reliably avoid that problem by following the instructions at http://sourceforge.net/p/msys2/wiki/MSYS2%20installation/#iii-updating-packages, i.e. by running "pacman --needed -S bash pacman msys2-runtime", then closing & re-opening the MSYS shell, before you tell pacman to install the GHC prerequisite packages.
  • A minor point: I found it helpful to include "man-db" in the list of packages to install - without it, "git help" fails with "failed to exec 'man'".
  • I note "./sync-all --help" says, under "Flags", that "--windows also clones the ghc-tarballs repository (enabled by default on Windows)", and I've confirmed that default behaviour experimentally - but https://ghc.haskell.org/trac/ghc/wiki/Building/GettingTheSources tells you to manually clone ghc-tarballs when on Windows. Is that line on the wiki obsolete, or am I overlooking something?
  • And finally, the big one: cabal and/or ghc-pkg put some files outside the MSYS root directory, and caused me no end of trouble in doing so...
I made a bit of a mess at one point, and tried to fix it by starting over completely from scratch. I expected uninstalling & reinstalling MSYS to achieve this (it deletes its root directory when you uninstall it), but that left me with a huge pile of errors when I tried to run "cabal install -j --prefix=/usr/local alex happy", of the form "Could not find module `...': There are files missing in the `...' package".

I noticed that the cabal output made reference to "C:\Users\Martin\AppData\Roaming\cabal\", so I tried moving that out of the way, but it only made the problem worse. I did figure it out eventually: in addition to that directory, "%APPDATA%\cabal", there were also files left over in "%APPDATA%\ghc". Once I removed that directory as well, things started working again - but it took me a lot of time & frustration to get there.
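
In case it helps the next person, here is a rough sketch of that cleanup as a script for the MSYS shell. It assumes APPDATA is visible in the shell's environment and that bash accepts the resulting mixed-separator path; "stash_dir" and the ".bak" suffix are names made up for illustration, not anything cabal or GHC provides. It moves the two directories aside rather than deleting them, in case something else still needs their contents.

```shell
#!/usr/bin/env bash
# Sketch only: move leftover per-user cabal/ghc state out of the way.
# stash_dir is a hypothetical helper, not part of cabal, GHC or MSYS2.

stash_dir() {
    local dir="$1"
    local backup="$dir.bak"
    if [ ! -d "$dir" ]; then
        # Nothing to do if the directory was never created.
        echo "not found: $dir"
    elif [ -e "$backup" ]; then
        # Refuse to clobber (or nest inside) an earlier backup.
        echo "backup already exists: $backup"
        return 1
    else
        mv -- "$dir" "$backup" && echo "moved: $dir"
    fi
}

# Assumption: APPDATA is set, as it normally is in a Windows session.
if [ -n "${APPDATA:-}" ]; then
    stash_dir "$APPDATA/cabal"
    stash_dir "$APPDATA/ghc"
fi
```

If the fresh build then works, the ".bak" copies can be deleted by hand; if something else turns out to depend on them, renaming them back restores the old state.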

I'm not entirely sure, but I think the copy of Cabal I already had from installing the Platform may also have been storing files in those directories, in which case this process completely mangled them - which isn't great.

It seems to me that, ideally, the "build GHC inside MSYS" procedure would keep itself entirely inside the MSYS directory structure: if it were wholly self-contained, you'd know where everything is, and it couldn't break anything outside. As far as I can tell, the only breach is those two directories courtesy of Cabal, so I didn't think it would be too difficult - but none of the things I've tried (the --package-db cabal flag, a custom cabal --config-file, setting the GHC_PACKAGE_PATH environment variable, maybe some others I've forgotten) had the desired effect. Is it possible? Is it even a good idea?

If that's just how it has to be, I feel like there should be an obvious note somewhere for the sake of the next person to trip over it.

I'd be happy to amend the wiki for any/all of the first four points, if people think it'd be appropriate; I'm not sure at all what to do about the last one.

Any thoughts?

- Martin