Hi,
Just as a point of information, the following rules can help avoid some of the gotchas:

- Treat submodules as read-only (i.e., don't make commits there).  The reason for this is that a submodule is usually not on a branch, and so making a commit would result in a detached head.
- When you pull (or change branches) use "git submodule update" to move the submodules to their correct versions (yes, it's annoying that one has to do that).
- Changes to a sub-module should be done in a separate repo (not GHC's submodule).  This is where you switch "hats" and become a "base" developer rather than a "GHC" developer for a bit, and use whatever workflow you normally use for development.
- Every now and then you update the sub-module "pointer" of your GHC branch to a newer version of the sub-module.  You do this by setting the sub-module to the desired version (e.g., by a pull from its repo), and then committing the change to the submodule version (perhaps with other GHC changes).
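
To make the last two rules concrete, here is a rough sketch in a throwaway sandbox.  The names "lib" and "super", the path libs/lib, and the helper g are made up for illustration (lib stands in for something like base, super for GHC).  The protocol.file.allow setting is only there so that recent git versions accept a local path as the submodule URL.

```shell
set -eu

# Throwaway sandbox; "lib" stands in for a library, "super" for GHC (both hypothetical).
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: fixed identity, and allow local-path submodule URLs for this demo.
g() {
    git -c user.name=demo -c user.email=demo@example.invalid \
        -c protocol.file.allow=always "$@"
}

# The library lives in its own repository.
g init -q lib
echo one > lib/file.txt
g -C lib add file.txt
g -C lib commit -q -m "lib: first version"

# The superproject records a pointer to one specific lib commit.
g init -q super
g -C super submodule --quiet add "$work/lib" libs/lib
g -C super commit -q -m "Add lib as a submodule"

# Change the library in its own repo, not inside super/libs/lib.
echo two > lib/file.txt
g -C lib commit -q -a -m "lib: second version"

# Move the submodule to the desired version, then commit the new pointer.
g -C super/libs/lib pull -q --ff-only
g -C super add libs/lib
g -C super commit -q -m "Bump lib submodule"

# The pointer stored in super should now name the latest lib commit.
recorded=$(g -C super rev-parse HEAD:libs/lib)
latest=$(g -C lib rev-parse HEAD)
echo "recorded=$recorded"
echo "latest=$latest"
```

The last commit in super contains nothing but the changed pointer; in practice it could equally be bundled with the GHC changes that need the newer library.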

I agree with Simon's assessment that it is probably a good idea to start without submodules, at least until all developers are comfortable with the rest of git's model.

-Iavor


On Thu, Jan 13, 2011 at 12:49 AM, Simon Marlow <marlowsd@gmail.com> wrote:
On 12/01/2011 22:22, Iavor Diatchki wrote:
Hello,

On Wed, Jan 12, 2011 at 11:44 AM, Roman Leshchinskiy <rl@cse.unsw.edu.au> wrote:

   On 12/01/2011, at 09:22, Simon Marlow wrote:

    > On 11/01/2011 23:11, Roman Leshchinskiy wrote:
    >>
    >> A quick look at the docs seems to indicate that we'd need to do
    >>
    >> git pull
    >> git submodule update
    >>
    >> which doesn't look like a win over darcs-all. Also, I completely
   fail to understand what git submodule update does. It doesn't seem
   to pull all patches from the master repo. The git submodule docs are
   even worse than the rest of the git docs which is rather discouraging.
    >
    > True, however the build system could automatically check whether
   you had missed this step, because it could check the hashes.

   That would be an improvement. How do you pull submodule patches
   which the main repo doesn't depend on, though? Out of curiosity,
   has anyone here used submodules for something similar to what we
   would need?


A "submodule" is basically a "pointer" to a particular state of a remote
repo.  So when you do "git pull" in GHC, you get changes to the code,
and also changes to this "pointer", but it won't automatically modify
your local version of the sub-module repo.  So at this point, if you
started "git gui" you'd see that there is a mismatch between your local
copy of the sub-module and the expected version.

When you issue the command "git submodule update", you are telling git
to advance the sub-module repo to the "expected version" (i.e., where
the pointer points to).  The reason this does not happen automatically
is that you might have also made changes to the submodule, so you might
want to do some merging there, instead of just pulling.
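
As a rough illustration of the mismatch and of what "git submodule update" does about it, the sketch below builds a sandbox and then fakes the state you would be in after a pull that moved the pointer.  The repo names, the path libs/lib and the helper g are made up for the example; the mismatch is simulated by checking out an older commit in the submodule rather than by a real pull.

```shell
set -eu

work=$(mktemp -d)
cd "$work"

# Hypothetical helper: fixed identity, and allow local-path submodule URLs for this demo.
g() {
    git -c user.name=demo -c user.email=demo@example.invalid \
        -c protocol.file.allow=always "$@"
}

# A library with two versions.
g init -q lib
echo one > lib/file.txt
g -C lib add file.txt
g -C lib commit -q -m "lib: v1"
echo two > lib/file.txt
g -C lib commit -q -a -m "lib: v2"

# The superproject's pointer names v2.
g init -q super
g -C super submodule --quiet add "$work/lib" libs/lib
g -C super commit -q -m "Add lib at v2"

# Simulated "after git pull": the pointer says v2, the local checkout is still at v1.
g -C super/libs/lib checkout -q --detach HEAD~1

# A leading "+" in the status output marks a submodule that is not at the expected version.
before=$(g -C super submodule status)

# Move the submodule checkout to the commit the pointer names.
g -C super submodule --quiet update
after=$(g -C super submodule status)

printf '%s\n' "before: $before" "after:  $after"
```

The leading "+" from "git submodule status" is also the sort of thing a build system could look at to notice that the update step was missed.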

One thing to note is that if we were to set things up with sub-modules,
then every now and then we would have to advance the GHC's "expected
pointer" for various libraries to the latest (or a newer) version.  Of
course, we could have a script do this but, at least in theory, when
someone makes a commit which updates the version of a sub-module, they
are asserting that things ought to work with the newer version of
the sub-module.

-Iavor
PS: I've only used sub-modules on one project at work.  At first I too
was quite confused about what was going on, but I've come to think that
submodules are a pretty reasonable way to deal with a situation which is
inherently complex.

I spent quite some time yesterday playing with submodules to see if they would work for GHC.  I'm fairly sure there are no fundamental reasons that we couldn't use them, but there are enough gotchas to put me off. I wrote down what I discovered here:

 http://hackage.haskell.org/trac/ghc/wiki/DarcsConversion#Submodules

The workflow is quite involved, with more steps than are required with darcs-all (understandable, because we're storing more information). However, git isn't particularly helpful if you make a mistake or forget to do something.  I foresee spending a lot of time digging myself and Simon out of bizarre repository states.

I discovered that Google have this tool called "repo" which is their darcs-all for the Android source tree.  That might be worth looking at as an alternative in the future:

 https://sites.google.com/a/android.com/opensource/download/using-repo

If we go with git, I suggest we stick with sync-all for the time being and think about either submodules or repo as possibilities for the future.

Cheers,
       Simon