Repository Reorganization Question

Hi all,
While discussing something with Herbert this week in preparation for making a new stable branch, he brought a good point to my attention: if we go ahead and reorganize the repository situation post-7.8, merging things to the stable branch from HEAD will become a bit harder.
Notably, we had planned to fold testsuite (and perhaps some other repositories) into the GHC tree. Once we do this, the two branches will have diverged quite a bit, so merging from HEAD to STABLE will become harder* (because HEAD would have rolled in testsuite changes, for example, but the STABLE branch would not have this history).
Thinking about it, the best time to do such a move is, basically, when there is no active stable branch. Unfortunately this time is right now, but I'm not sure how everyone feels about this.
So, the question is: should we go ahead and pull the trigger on some of these? Herbert collected some numbers on the Git repositories and outlined all the basic details here:
https://ghc.haskell.org/trac/ghc/wiki/GitRepoReorganization
The only thing I'd honestly propose right now is folding 'testsuite' into the main repository, but of course we should see what people think. Perhaps we should keep base/etc. off the table for now, since they seem more controversial.
* I'll point out that merges will only become *slightly* harder in most cases, because I can always apply unified diffs instead of cherry-picking. But that does lose the original metadata from commits. I won't cry if people vote against this.
--
Regards,
Austin Seipp, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/

I have no objection to doing it now, though my opinion should count for little, since I know so little. The main person who is affected is Austin, since he has to do the merging. And I'm not sure that even 'base' is really controversial, is it?
Simon
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs

I'm fine with them being merged. OTOH, I don't have any large in-progress
patches afoot, and having the testsuite and ghc repos merged will make things much
easier for some of the patches I have planned for the end of December.
-Carter

Hi,
On Wednesday, 04.12.2013, at 15:24 -0600, Austin Seipp wrote:
> The only thing I'd honestly propose right now is folding 'testsuite' into the main repository, but of course we should see what people think - perhaps we should keep base/etc off the table for now, since they seem more controversial.
Yes please! It will make life much easier for me. And it would allow removing "expect_broken" flags in the same commit that unbreaks the tests.
Greetings, Joachim
--
Joachim “nomeata” Breitner
mail@joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata@joachim-breitner.de • GPG-Key: 0x4743206C
Debian Developer: nomeata@debian.org

On 2013-12-04 at 22:24:40 +0100, Austin Seipp wrote:
> So, the question is: should we go ahead and pull the trigger on some of these perhaps? Herbert collected some numbers on the git repositories and outlined all the basic details here:
> https://ghc.haskell.org/trac/ghc/wiki/GitRepoReorganization
> The only thing I'd honestly propose right now is folding 'testsuite' into the main repository, but of course we should see what people think
FYI, I've drafted what the change would look like in the new ghc.git branch 'wip/T8545' so we can test/evaluate the effects/fallout before performing this operation on 'master'.
So running
    git clone -b wip/T8545 git://git.haskell.org/ghc.git
    cd ghc/
    ./sync-all get
should result in a new checkout including the folded-in testsuite/ folder.
PS: If we decide to fold in `base` too (which IMHO would make life easier for GHC devs, as it reduces exposure to confusing git-submodule effects), I think we should also fold in ghc-prim/integer-{gmp,simple}, as `base` depends on those and they're even more tightly coupled to GHC than `base` is, and so IMHO don't benefit much from being kept in a separate Git repo.
Cheers, hvr

On 2013-12-05 at 11:17:01 +0100, Herbert Valerio Riedel wrote: [...]
> Fyi, I've drafted how the change would look like in the new ghc.git branch 'wip/T8545' so we can test/evaluate the effects/fallout before performing this operation on 'master'.
> So running
>     git clone -b wip/T8545 git://git.haskell.org/ghc.git
>     cd ghc/
>     ./sync-all get
> should result in a new checkout including the folded-in testsuite/ folder.
PS: I didn't merge in testsuite's Git history, as that would bloat ghc.git quite a bit; however, 'git blame' functionality can be recovered in a local checkout by using something like Git's grafting feature:
    # make old testsuite Git objects available in local ghc.git
    git remote add -f old-testsuite git://git.haskell.org/testsuite.git
    # add 2nd parent commit to e45b9f57a90 pointing to testsuite.git
    echo e45b9f57a9044e8a20e3cc13bcff86b12b3da405 \
         1860dae3a7e377f085f3a4134f532a7f577fccbe \
         3e66489ebcef0f4cd86968c6781a1d4ad1981f94 > .git/info/grafts
This way, performing 'git blame' on files in the testsuite/ folder yields sensible information dating back before the history cut-off point.
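
As a hedged illustration of the grafting recipe above: a grafts entry is a single line of space-separated commit IDs, the commit first and its parents after it. The helper name `graft_line` is made up for this sketch, and the three IDs are the ones from the message. Newer Git versions deprecate `.git/info/grafts` in favour of `git replace --graft`, which appears here only as a comment.

```shell
#!/bin/sh
# Hypothetical helper: print one grafts-file entry ("commit parent1 parent2 ...").
# It only validates and formats the IDs; it does not touch any repository.
graft_line() {
    for id in "$@"; do
        # Each ID must consist of lowercase hexadecimal digits only ...
        case "$id" in
            *[!0-9a-f]*) echo "not a hex SHA-1: $id" >&2; return 1 ;;
        esac
        # ... and be a full 40-character SHA-1.
        [ "${#id}" -eq 40 ] || { echo "not 40 characters: $id" >&2; return 1; }
    done
    echo "$*"
}

# IDs copied from the message above: the commit, then its two parents.
graft_line e45b9f57a9044e8a20e3cc13bcff86b12b3da405 \
           1860dae3a7e377f085f3a4134f532a7f577fccbe \
           3e66489ebcef0f4cd86968c6781a1d4ad1981f94

# Inside a ghc.git checkout the printed line could be written to .git/info/grafts,
# or, with a Git that supports it, expressed as:
#   git replace --graft <commit> <parent1> <parent2>
```

The helper rejects abbreviated IDs on purpose, since the message writes full 40-character IDs into the grafts file.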

What (if anything) do we need to do when updating existing local repos? Will everything be OK if I just do
    sync-all pull
Simon

On 2013-12-05 at 12:31:40 +0100, Simon Peyton-Jones wrote:
> What (if anything) do we need to do when updating existing local repos. Will everything be ok if I just do sync-all pull
...assuming there's no important uncommitted data left in testsuite/ (and ideally nowhere else in the source tree), it *should* suffice to just run './sync-all pull', as this operation *should* overwrite any testsuite/* files lying around (at least that's what I've observed so far in my tests).
However, if testsuite/ was already checked out before the 'sync-all pull', the 'testsuite/.git' folder won't be removed automatically (and it shouldn't hurt either, as 'sync-all' won't traverse it anymore after ghc.git has been updated).
Cheers, hvr

On Thu, Dec 05, 2013 at 03:03:42PM +0100, Herbert Valerio Riedel wrote:
> However, if the testsuite/ was already checked out before the 'sync-all pull', the 'testsuite/.git' folder won't be removed automatically (and it shouldn't hurt either, as 'sync-all' won't traverse it anymore after ghc.git was updated)
But git commands under testsuite/ will use the wrong .git, won't they?
Thanks
Ian

On 2013-12-05 at 15:17:53 +0100, Ian Lynagh wrote:
> On Thu, Dec 05, 2013 at 03:03:42PM +0100, Herbert Valerio Riedel wrote:
>> However, if the testsuite/ was already checked out before the 'sync-all pull', the 'testsuite/.git' folder won't be removed automatically (and it shouldn't hurt either, as 'sync-all' won't traverse it anymore after ghc.git was updated)
> But git commands under testsuite/ will use the wrong .git, won't they?
Good point, I didn't think of that :-/
So I guess 'sync-all' should be tweaked to either delete the .git/ folder or warn the user to remove it.
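
The warning suggested here could look roughly like the following sketch. It is not taken from 'sync-all' (which this thread does not show); the function name `check_stale_testsuite_git` and the wording of its message are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical check: warn if a leftover testsuite/.git exists in a GHC tree.
# $1 is the root of the ghc checkout (defaults to the current directory).
check_stale_testsuite_git() {
    root=${1:-.}
    if [ -e "$root/testsuite/.git" ]; then
        echo "warning: $root/testsuite/.git is left over from the old testsuite repository;" >&2
        echo "         git commands run under testsuite/ will use it instead of the top-level .git" >&2
        return 1
    fi
    return 0
}

# Usage: report only, without deleting anything.
check_stale_testsuite_git . ||
    echo "consider removing testsuite/.git after checking it for unpushed work"
```

The sketch only warns rather than deletes, since a stale testsuite/.git may still hold local commits that were never pushed.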

Hi,
On Thursday, 05.12.2013, at 12:15 +0100, Herbert Valerio Riedel wrote:
> PS: I didn't merge in testsuite's Git history as that would bloat ghc.git quite a bit;
Would that really be a problem? How different are the numbers?
I’m a fan of keeping history readily available, so unless it really hurts I suggest doing a proper merge.
Greetings, Joachim

Hello Joachim,
On 2013-12-05 at 12:56:55 +0100, Joachim Breitner wrote:
> On Thursday, 05.12.2013, at 12:15 +0100, Herbert Valerio Riedel wrote:
>> PS: I didn't merge in testsuite's Git history as that would bloat ghc.git quite a bit;
> would that really be a problem? How different are the numbers?
Here's a rough estimate. The current testsuite.git as a single packfile:
    1.8M Dec 5 14:18 .git/objects/pack/pack-5d85ce17a3003e44e0e36d757564ce7df09275d4.idx
    27M  Dec 5 14:18 .git/objects/pack/pack-5d85ce17a3003e44e0e36d757564ce7df09275d4.pack
whereas, when I create a new Git repo containing only the HEAD commit from testsuite.git, the resulting single packfile is:
    204K Dec 5 14:19 .git/objects/pack/pack-27355d714321978fd34c21ce341a7b55f416719a.idx
    2.5M Dec 5 14:19 .git/objects/pack/pack-27355d714321978fd34c21ce341a7b55f416719a.pack
This seemed to be a significant increase to me.
> I’m a fan of keeping history readily available, so unless it really hurts I suggest to do a proper merge.
BTW, it'd be easy to provide a simple script which would re-attach the testsuite history (and that of any other repositories with truncated history).
But there's another subtle issue: there are multiple ways to merge in the old testsuite repo. One is without any path translation, as accomplished by the grafting example I gave; the other is to first rewrite 'testsuite.git' so that its root folder is located in a 'testsuite/' folder, so that Git doesn't have to follow renames, which may also simplify navigating/querying the Git history.
Cheers, hvr
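
For a sense of scale, the two packfile sizes quoted above can be compared directly. This is only arithmetic on the `.pack` figures from the message (27M with full history, 2.5M for the HEAD commit alone); the function name `pack_ratio` is invented for the sketch.

```shell
#!/bin/sh
# Hypothetical helper: compare two packfile sizes given in the same unit.
# Prints the size ratio and the difference, each rounded to one decimal place.
pack_ratio() {
    awk -v full="$1" -v head="$2" \
        'BEGIN { printf "%.1fx larger, %.1f extra\n", full / head, full - head }'
}

# Figures from the message: 27M (full history) vs 2.5M (HEAD commit only).
pack_ratio 27 2.5
```

By this measure, the history accounts for roughly 24.5M of the 27M packfile.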

Let's not lose our history or make it annoying to access. Disk is cheap.

On 2013-12-05 at 14:32:10 +0100, Herbert Valerio Riedel wrote: [...]
> whereas, when I create a new git repo containing only the HEAD commit from testsuite.git, the resulting single packfile:
>     204K Dec 5 14:19 .git/objects/pack/pack-27355d714321978fd34c21ce341a7b55f416719a.idx
>     2.5M Dec 5 14:19 .git/objects/pack/pack-27355d714321978fd34c21ce341a7b55f416719a.pack
> this seemed to be a significant increase to me;
PS: If anyone wonders why the testsuite.git history is so large: there were a few *huge* binary files with bad compressibility checked in by accident, such as the ones removed via
http://git.haskell.org/testsuite.git/commitdiff/68acef7dd144452db12689db3299...
    tests/haddock/should_compile_flag_nohaddock/a.out   | Bin 2745963 -> 0 bytes
    tests/haddock/should_compile_noflag_nohaddock/a.out | Bin 2745963 -> 0 bytes
or
http://git.haskell.org/testsuite.git/commitdiff/cb540135b26504cffe557fd57fa3...
    tests/ghc-regress/dph/diophantine/dph-diophantine-fast | Bin 16854700 -> 0 bytes
    tests/ghc-regress/dph/diophantine/dph-diophantine-opt  | Bin 17017376 -> 0 bytes
    tests/ghc-regress/dph/primespj/dph-primespj-fast       | Bin 16783780 -> 0 bytes
    tests/ghc-regress/dph/quickhull/dph-quickhull-fast     | Bin 17092732 -> 0 bytes
    tests/ghc-regress/dph/smvm/dph-smvm                    | Bin 16581028 -> 0 bytes
    tests/ghc-regress/dph/sumnats/dph-sumnats              | Bin 16101268 -> 0 bytes
    tests/ghc-regress/dph/words/dph-words-fast             | Bin 17580076 -> 0 bytes
which account for most of the ~20MiB history....
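
As a quick check on the figures above, the listed byte counts can be added up. Note that these are the uncompressed file sizes from the two diffstats; how much space the files take inside the packfile depends on Git's compression and is not derived here. The helper name `sum_bytes` is an assumption of this sketch.

```shell
#!/bin/sh
# Hypothetical helper: add up a list of byte counts.
sum_bytes() {
    total=0
    for n in "$@"; do
        total=$((total + n))
    done
    echo "$total"
}

# Two a.out files plus seven dph binaries, sizes copied from the diffstats above.
total=$(sum_bytes 2745963 2745963 \
    16854700 17017376 16783780 17092732 16581028 16101268 17580076)
echo "$total bytes, about $((total / 1048576)) MiB uncompressed"
```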

Hi,
On Friday, 06.12.2013, at 11:05 +0100, Herbert Valerio Riedel wrote:
> PS: if anyone wonders why the testsuite.git history is so large: there were a few *huge* binary files with bad compressibility checked in by accident, such as the one removed via
> [..]
> s/dph/words/dph-words-fast | Bin 17580076 -> 0 bytes
> which account for most of the ~20MiB history....
So if we were to do the merge including a pathname rewrite (which I tend to think of as fine), we could remove these files from the history as well and get a reasonably sized repository?
Greetings, Joachim

Hi,
When we merge in the testsuite repo, can we still keep the old commit IDs?
They're referenced from all over the place.

Hi,
On Friday, 06.12.2013, at 13:01 +0100, Johan Tibell wrote:
> When we merge in the testsuite repo, can we still keep the old commit IDs? They're referenced from all over the place.
That depends on the style of merge:
 * With pathname rewriting:
   + Git can easily trace the history of a file
   + we can remove the large binary blobs while we are at it
 * Without pathname rewriting:
   + commit IDs stay the same
Greetings, Joachim
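
To make the "pathname rewriting" option a little more concrete, here is a rough sketch of the per-commit step such a rewrite would perform: prefixing every path in an index listing with `testsuite/`. The input format is that of `git ls-files -s` (mode, object ID, stage, a tab, then the path). The function name `prefix_paths` and the example path are hypothetical, and wiring this into a history-rewriting tool such as `git filter-branch --index-filter` is deliberately left out. Because every rewritten commit then points at a different tree, its ID changes, which is the trade-off listed above.

```shell
#!/bin/sh
# Hypothetical helper: prepend a directory prefix to each path in
# `git ls-files -s` style input ("<mode> <object> <stage><TAB><path>").
prefix_paths() {
    awk -v prefix="$1" '
        BEGIN { FS = OFS = "\t" }
        {
            # Git quotes unusual paths; keep the opening quote in front.
            if (substr($2, 1, 1) == "\"")
                $2 = "\"" prefix substr($2, 2)
            else
                $2 = prefix $2
            print
        }'
}

# Example entry (placeholder object ID and an illustrative path, not real data).
printf '100644 0000000000000000000000000000000000000000 0\ttests/foo.T\n' |
    prefix_paths testsuite/
```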

On 2013-12-06 at 13:01:41 +0100, Johan Tibell wrote:
> When we merge in the testsuite repo, can we still keep the old commit IDs? They're referenced from all over the place.
...if we want to preserve the old testsuite's commit IDs, then we'll have to live with carrying around those superfluous large blobs in testsuite's history that nobody really needs, as AFAIK[1] we can't remove the blobs without causing the commit IDs to change.
I'm not so much concerned about disk space as about the wasted network bandwidth involved (and yes, we're already suffering from that now with the current testsuite.git). But I don't feel very strongly about this one if there's agreement that we want to keep those dead weights in the Git history in order to retain testsuite.git's original commit IDs *inside* ghc.git (NB: the old testsuite.git repo won't be deleted and thus will remain available; it just won't be pushed to anymore).
OTOH, which use cases exactly do you have in mind wrt the testsuite.git commit IDs?
[1]: maybe someone with more Git-fu knows a trick here?

Whichever way we go, we should write down the options and consequences and
communicate them widely enough so no core devs get surprised.
Commit IDs for the test suite are referenced in e.g. various Trac issues,
on mailing lists (although rarely), and perhaps even in code.

Trac tickets with links to commits are the important case. If the commit IDs change, someone will have to run a script over the Trac database and rewrite all those links to testsuite commits to the new ones. That sounds possible, but it'll be at least a few hours' work, I'd guess.
I'm in favour of removing the unnecessary binary blobs from the history so long as we can do it without any serious disruption.
Cheers, Simon

On 2013-12-06 at 13:50:55 +0100, Johan Tibell wrote:
> Whichever way to go, we should write down the options and consequences and communicating them widely enough so no core devs get surprised.
> Commit IDs for the test suite are referenced in e.g. various Trac issues, on mailing lists (although rarely), and perhaps even in code.
...as I hinted at in an earlier post, the old commit IDs will still allow you to find the original commit; for instance, there's already the find-commit-by-sha1 service at
    http://git.haskell.org/.findhash/<commit-sha1-prefix>
which searches all repos hosted at git.haskell.org for the given SHA-1 prefix; there's also a convenient text-entry field at http://git.haskell.org/ which allows you to copy'n'paste any commit IDs you might come across in emails, IRC logs, Trac comments or even commit messages...
...does this lookup service alleviate your concerns?
Cheers, hvr
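
As a small illustration of the URL pattern described above, a lookup link can be assembled from a commit ID or prefix. The helper name `findhash_url` is hypothetical, and the sketch only builds the string; it does not contact the service or check that it is still reachable.

```shell
#!/bin/sh
# Hypothetical helper: build the lookup URL for a commit ID or prefix,
# following the http://git.haskell.org/.findhash/<commit-sha1-prefix> pattern.
findhash_url() {
    case "$1" in
        ''|*[!0-9a-fA-F]*) echo "not a hex SHA-1 prefix: $1" >&2; return 1 ;;
    esac
    echo "http://git.haskell.org/.findhash/$1"
}

# Example with a commit ID prefix mentioned earlier in the thread.
findhash_url e45b9f57a90
```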

On Fri, Dec 6, 2013 at 4:43 PM, Herbert Valerio Riedel wrote:
> ...as I hinted at in an earlier post, the old commit-ids will still allow to find the original commit; for instance, there's already the find-commit-by-sha1 service at
> http://git.haskell.org/.findhash/<commit-sha1-prefix>
> which searches all repos hosted at git.haskell.org for the given sha1 prefix; there's also a convenient text-entry field at http://git.haskell.org/ which allows you to copy'n'paste any commit-ids you might come across in emails, irc logs, trac comments or even commit messages...
> ...does this lookup-service alleviate your concerns?
Personally I think it's still a lot of friction; another thing to remember. Is it really worth it for a couple of megs of bandwidth* and some disk space?
If it really is, I believe Git has some facility for nuking the data of old commits. That facility exists for the case when someone committed something sensitive to the code base that should never have been there.
* GitHub's bandwidth if you use that mirror.
-- Johan

Personally I don't care about the bandwidth, and others are correct about
the value of logs. If there's a way to get both, awesome! If not, 20MB here
and there doesn't bother me.

20MB of bandwidth represents 20 additional seconds to do an initial clone on my 1 megabyte/second connection. ghc.git is already about 75MB, so it wouldn't dramatically change the experience either way. Just a data point.
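
The clone-time estimate above is simple division, sketched here with the round figures from the message (about 75MB for ghc.git, about 20MB of extra history, about 1 megabyte per second). The function name `transfer_seconds` is an assumption of this sketch, and real clone times also depend on factors it ignores, such as server-side packing and latency.

```shell
#!/bin/sh
# Hypothetical helper: seconds needed to transfer SIZE_MB at RATE_MB_PER_S,
# rounded up to a whole second (integer arithmetic only).
transfer_seconds() {
    size_mb=$1
    rate=$2
    echo $(( (size_mb + rate - 1) / rate ))
}

# Current ghc.git versus ghc.git plus the extra testsuite history.
echo "current clone: $(transfer_seconds 75 1)s"
echo "with testsuite history: $(transfer_seconds 95 1)s"
```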

On 06/12/2013 15:43, Herbert Valerio Riedel wrote:
On 2013-12-06 at 13:50:55 +0100, Johan Tibell wrote:
Whichever way to go, we should write down the options and consequences and communicating them widely enough so no core devs get surprised.
Commit IDs for the test suite are referenced in e.g. various Trac issues, on mailing lists (although rarely), and perhaps even in code.
...as I hinted at in an earlier post, the old commit-ids will still allow to find the original commit; for isntance, there's already the find-commit-by-sha1 service at
http://git.haskell.org/.findhash/<commit-sha1-prefix>
which searches all repos hosted at git.haskell.org for the given sha1 prefix; there's also a convenient text-entry field at http://git.haskell.org/ which allows you to copy'n'paste any commit-ids you might come across in emails, irc logs, trac comments or even commit messages...
...does this lookup-service alleviate your concerns?
No :-) To be honest I would probably just paste the SHA1 into Google and find it that way, which would probably work. But it's *far* better if the links just work.
While I'm here, can I just point out that old links into the mailing list archives are still broken, AFAIK. I run into this quite often, and it's a total pain, because you have no idea how to find the message that the link was originally pointing at.
Cheers, Simon

On 2013-12-09 at 09:18:09 +0100, Simon Marlow wrote: [...]
...as I hinted at in an earlier post, the old commit-ids will still allow you to find the original commit; for instance, there's already the find-commit-by-sha1 service at
http://git.haskell.org/.findhash/<commit-sha1-prefix>
which searches all repos hosted at git.haskell.org for the given sha1 prefix; there's also a convenient text-entry field at http://git.haskell.org/ which allows you to copy'n'paste any commit-ids you might come across in emails, irc logs, trac comments or even commit messages...
...does this lookup-service alleviate your concerns?
No :-) To be honest I would probably just paste the SHA1 into Google and find it that way, which would probably work. But it's *far* better if the links just work.
What kind of links are you referring to btw? I don't see any clickable GHC SHA1 ids these days anymore... :-)
...any SHA1 links into the testsuite never had any chance of working out-of-the-box in the first place, unless they used Trac syntax to specify that they referred to the testsuite repository. And now that we've disabled wiki-format rendering of commit messages again, we have lost that as well.
What I'm trying to say is that I don't see any regression at all here, i.e. we wouldn't break anything that worked before.
Cheers, hvr

Hi,
On Monday, 09.12.2013, at 09:24 +0100, Herbert Valerio Riedel wrote:
What kind of links are you referring to btw? I don't see any clickable GHC SHA1 ids these days anymore... :-)
well, people do write SHA1 ids in ticket comments directly. (At least I do. And then I rebase my branch. And then the link is dead ;-))
But in contrast to the mailing list link issue, even if we rewrite the testsuite before merging, it will be possible, although a bit more tedious, to look up the corresponding new hash.
It is hard to predict which is more common: following SHA1 links from old (and in the future even older) tickets, or doing git archeology inside the repo. I guess both are rare enough that we risk spending more time discussing it than we’d save otherwise...
The only thing that will permanently hurt if we do not fix it now are the binary blobs. I often make new checkouts (I still keep separate feature branches in separate checkouts, plus validate trees, plus baseline trees to compare the effect of my changes). So I’m still in favor of rewriting the branch.
We could even check the ID→ID mapping into the repo and have an easy-to-discover ./testsuite/lookup-old-id script so that this is even less of an annoyance.
Greetings, Joachim
PS: What happens if we set up replace objects in the git repo that cgit and trac are using: https://www.kernel.org/pub/software/scm/git/docs/git-replace.html Would that make the old links still work?
-- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0x4743206C Debian Developer: nomeata@debian.org
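A minimal sketch of what such a lookup-old-id helper might look like, assuming a checked-in mapping file with one `<old-sha1> <new-sha1>` pair per line. The file format, the function name `lookup_old_id` and the unique-prefix matching are assumptions for illustration; nothing in this thread says such a file or script exists.

```shell
#!/bin/sh
# Hypothetical lookup-old-id helper.
# Assumed map format: one "<old-sha1> <new-sha1>" pair per line.
lookup_old_id() {
    # $1: mapping file, $2: old commit id or a unique prefix of it
    map_file=$1
    prefix=$2
    [ -n "$prefix" ] || return 2
    # Print the new id only if exactly one old id starts with the prefix;
    # exit non-zero when there is no match or the prefix is ambiguous.
    awk -v p="$prefix" '
        index($1, p) == 1 { hit = $2; n++ }
        END {
            if (n == 1) { print hit; exit 0 }
            exit 1
        }
    ' "$map_file"
}
```

Usage would be along the lines of `lookup_old_id old-ids.map 323cab22`, where `old-ids.map` is a made-up file name.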

On 2013-12-09 at 09:34:23 +0100, Joachim Breitner wrote:
On Monday, 09.12.2013, at 09:24 +0100, Herbert Valerio Riedel wrote:
What kind of links are you referring to btw? I don't see any clickable GHC SHA1 ids these days anymore... :-)
well, people do write SHA1 ids in ticket comments directly. (At least I do. And then I rebase my branch. And then the link is dead ;-))
...those links manually written into ticket comments will continue to work since they had to explicitly refer to the testsuite repo anyway (e.g. `[27c42f4022/testsuite]`) if they had any chance of working before, and even if we retain the old commit ids in ghc.git, those links would still point to the testsuite.git repo and not to ghc.git
But in contrast to the mailing list link issue, even if we rewrite the testsuite before merging, it will be possible, although a bit more tedious, to look up the corresponding new hash.
What I don't understand yet is, why do we need to know the new hash ids at all? If we come across an old hash-id, aren't we just interested in finding the commitdiff (and possibly neighbor commits) it refers to? [...]
We could even check the ID→ID mapping into the repo and have an easy-to-discover ./testsuite/lookup-old-id script so that this is even less of an annoyance.
PS: What happens if we set up replace objects in the git repo that cgit and trac are using: https://www.kernel.org/pub/software/scm/git/docs/git-replace.html Would that make the old links still work?
That feature sounds *very* interesting for our problem at hand, as it effectively allows us to hide the real new SHA1 ids for the over 6000 "historic" testsuite commits (or put differently, it allows us to embed the oldid-newid mapping at the Git meta-data level); I'll experiment with 'git replace' to see if it could really work in our use-case.
Cheers, hvr

Hi,
On Monday, 09.12.2013, at 10:04 +0100, Herbert Valerio Riedel wrote:
But in contrast to the mailing list link issue, even if we rewrite the testsuite before merging, it will be possible, although a bit more tedious, to look up the corresponding new hash.
What I don't understand yet is, why do we need to know the new hash ids at all? If we come across an old hash-id, aren't we just interested in finding the commitdiff (and possibly neighbor commits) it refers to?
good point, I did not think of that. So if Trac links are preserved anyway, doesn’t that solve the problem?
Greetings, Joachim
-- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0x4743206C Debian Developer: nomeata@debian.org

On 09/12/2013 08:24, Herbert Valerio Riedel wrote:
On 2013-12-09 at 09:18:09 +0100, Simon Marlow wrote:
[...]
...as I hinted at in an earlier post, the old commit-ids will still allow you to find the original commit; for instance, there's already the find-commit-by-sha1 service at
http://git.haskell.org/.findhash/<commit-sha1-prefix>
which searches all repos hosted at git.haskell.org for the given sha1 prefix; there's also a convenient text-entry field at http://git.haskell.org/ which allows you to copy'n'paste any commit-ids you might come across in emails, irc logs, trac comments or even commit messages...
...does this lookup-service alleviate your concerns?
No :-) To be honest I would probably just paste the SHA1 into Google and find it that way, which would probably work. But it's *far* better if the links just work.
What kind of links are you referring to btw? I don't see any clickable GHC SHA1 ids these days anymore... :-)
I'm confused. We definitely do have clickable commit links, inserted automatically by the post-commit hook, e.g.:
https://ghc.haskell.org/trac/ghc/ticket/8577#comment:21
Those links would break if the hashes change, right?
Cheers, Simon

Hi,
On Monday, 09.12.2013, at 09:23 +0000, Simon Marlow wrote:
I'm confused. We definitely do have clickable commit links, inserted automatically by the post-commit hook, e.g.:
https://ghc.haskell.org/trac/ghc/ticket/8577#comment:21
Those links would break if the hashes change, right?
but, as Herbert pointed out, these links point to the testsuite repository explicitly (deadbeef/testsuite). If they did not do that, they would be dead already (e.g. if someone wrote them manually without paying attention to that).
And since they are qualified by the repository they point to, they will continue to point to the (then discontinued, but unmodified) testsuite repository. Which is – I believe – sufficient to make sense of old tickets.
The links to the ghc repository are of course unchanged; there is no plan to rewrite history here.
Greetings, Joachim
-- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0x4743206C Debian Developer: nomeata@debian.org

On 09/12/2013 09:28, Joachim Breitner wrote:
Hi,
On Monday, 09.12.2013, at 09:23 +0000, Simon Marlow wrote:
I'm confused. We definitely do have clickable commit links, inserted automatically by the post-commit hook, e.g.:
https://ghc.haskell.org/trac/ghc/ticket/8577#comment:21
Those links would break if the hashes change, right?
but, as Herbert pointed out, these links point to the testsuite repository explicitly (deadbeef/testsuite). If they did not do that, they would be dead already (e.g. if someone wrote them manually without paying attention to that).
And since they are qualified by the repository they point to, they will continue to point to the (then discontinued, but unmodified) testsuite repository. Which is – I believe – sufficient to make sense of old tickets.
The links to the ghc repository are of course unchanged, there is no plan to rewrite history here.
Ah yes, I see. That's fine then. Cheers, Simon

Hi all,
It seems that while most people are in favor of migrating and
preserving the history there are a few sticky bits concerning some of
the minor details. So I think the discussion should continue, but we
clearly shouldn't pull the trigger quite yet. testsuite etc will live
on for a while longer.
In the meantime, maintaining a branch is relatively low cost, so
whatever solution we come up with won't hurt too badly.
-- Regards, Austin Seipp, Haskell Consultant Well-Typed LLP, http://www.well-typed.com/

On 2013-12-09 at 13:31:15 +0100, Austin Seipp wrote:
It seems that while most people are in favor of migrating and preserving the history there are a few sticky bits concerning some of the minor details. So I think the discussion should continue, but we clearly shouldn't pull the trigger quite yet. testsuite etc will live on for a while longer.
In the meantime I've updated the branch
http://git.haskell.org/ghc.git/log/refs/heads/wip/T8545
to include the full testsuite.git history, with the history rewritten as if the files had always been in the 'testsuite/' folder, and with the blobs removed; moreover I've tweaked the sync-all script to check for a left-over testsuite/.git folder, to remind users to move it out of the way.
Thus, with the current state of the T8545 branch,
- `git blame testsuite/tests/typecheck/should_compile/all.T` works as expected
- `git log testsuite/` as well as `cd testsuite/ && git log .` works as expected
- The cleaned-from-large-blob testsuite.git history adds only about 10MiB packed transfer size (as opposed to the full ~30MiB it would require if the blobs weren't removed)
- The historic/original testsuite.git commit hash ids are not retained in ghc.git, as they continue to exist in the "legacy" testsuite.git (where all current Trac hyperlinks will continue to refer to anyway)[1];
- When performing a `sync-all pull` (with the old `sync-all` script) developers will get the new ghc.git-tracked `testsuite/` folder and on the next `sync-all` invocation (which will use the updated sync-all script for the first time) they'll be reminded/forced to move away a possibly left-over `testsuite/.git` folder.
- At some point after the switch, `testsuite.git` should be set read-only (for the `master` branch at least).
So, what are the current sticky bits left over given the proof-of-concept in wip/T8545 that still need to be addressed?
[1] If we really need this for our development workflows, we could still attempt to make use of the `git replace` facility in the future to overlay the old commit ids. But if we don't need it (and from the discussion so far I don't see a requirement for that), it's better to keep it simple.
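The left-over-repository check described above can be illustrated with a small shell sketch. This is not the code from the sync-all script on the T8545 branch; the function name `check_leftover_testsuite_git` and the wording of the message are assumptions, and only the idea (refuse to continue while `testsuite/.git` is still present) comes from the message.

```shell
#!/bin/sh
# Illustration only: refuse to continue while an old nested testsuite
# repository is still present inside the ghc checkout.
check_leftover_testsuite_git() {
    # $1: top of the ghc working tree (defaults to the current directory)
    top=${1:-.}
    if [ -e "$top/testsuite/.git" ]; then
        echo "error: $top/testsuite/.git still exists;" \
             "testsuite/ is now tracked by ghc.git, please move it out of the way" >&2
        return 1
    fi
    return 0
}
```

A wrapper script could call `check_leftover_testsuite_git . || exit 1` before doing any pull.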

Herbert Valerio Riedel writes:
On 2013-12-09 at 13:31:15 +0100, Austin Seipp wrote:
It seems that while most people are in favor of migrating and preserving the history there are a few sticky bits concerning some of the minor details. So I think the discussion should continue, but we clearly shouldn't pull the trigger quite yet. testsuite etc will live on for a while longer.
In the mean time I've updated the branch
http://git.haskell.org/ghc.git/log/refs/heads/wip/T8545
to include the full testsuite.git history, with the history rewritten as if the files have been always in the 'testsuite/' folder, and with the blobs removed; moreover I've tweaked the sync-all script to check for a left-over testsuite/.git folder, to remind users to move it out of the way.
If the old commit IDs are really needed, one would think it wouldn't be too hard to write them into the commit message while rewriting history. That way you could at least `git log --grep` IIRC. Cheers, - Ben

Hi Ben, On 2013-12-10 at 17:53:23 +0100, Ben Gamari wrote:
If the old commit IDs are really needed, one would think it wouldn't be too hard to write them into the commit message while rewriting history. That way you could at least `git log --grep` IIRC.
Good idea, that's quite easy actually, I just need to perform
git filter-branch \
--msg-filter 'cat - && echo && echo "(original commit: [$GIT_COMMIT/testsuite])"'
as the first rewrite step (so that the $GIT_COMMIT variable still points
to the original commit hash ids), and then the imported commits will
look like e.g.
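,----
| commit afe483f9b9e0eadda178f9f6786d835ea7b8395c
| Author: Joachim Breitner
| Date:   Mon Dec 9 15:40:20 2013 +0000
|
|     Use -ddump-strsigs in tests/stranal/sigs
|
|     because it is more reliable than the previous GHC plugin (no need to
|     support annotations etc.), plus it works nicely with "make accept".
|
|     (original commit: [323cab22d65ea88410a607ef22db23198c03e305/testsuite])
`----
So if we want it that way, it's easily accomplished...
Cheers, hvr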

Hi,
On Tuesday, 10.12.2013, at 22:42 +0100, Herbert Valerio Riedel wrote:
So if we want it that way, it's easily accomplished...
yes, looks good. Make sure that when you merge master, you are not adding new comments to all the tickets referenced from commit messages...
Greetings, Joachim
-- Joachim Breitner e-Mail: mail@joachim-breitner.de Homepage: http://www.joachim-breitner.de Jabber-ID: nomeata@joachim-breitner.de

I don't feel terribly strongly about this, but I'd rather not clutter up the commit messages. As long as we keep the old testsuite.git repository attached to Trac, we can always find the old commits, and Google is a good hash table for SHA-1 keys.
Cheers, Simon
On 10/12/2013 21:42, Herbert Valerio Riedel wrote:
Hi Ben,
On 2013-12-10 at 17:53:23 +0100, Ben Gamari wrote:
If the old commit IDs are really needed, one would think it wouldn't be too hard to write them into the commit message while rewriting history. That way you could at least `git log --grep` IIRC.
Good idea, that's quite easy actually, I just need to perform
git filter-branch \
--msg-filter 'cat - && echo && echo "(original commit: [$GIT_COMMIT/testsuite])"'
as the first rewrite step (so that the $GIT_COMMIT variable still points to the original commit hash ids), and then the imported commits will look like e.g.
,----
| commit afe483f9b9e0eadda178f9f6786d835ea7b8395c
| Author: Joachim Breitner
| Date:   Mon Dec 9 15:40:20 2013 +0000
|
|     Use -ddump-strsigs in tests/stranal/sigs
|
|     because it is more reliable than the previous GHC plugin (no need to
|     support annotations etc.), plus it works nicely with "make accept".
|
|     (original commit: [323cab22d65ea88410a607ef22db23198c03e305/testsuite])
`----
So if we want it that way, it's easily accomplished...
Cheers, hvr
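The message filter quoted above can be tried in isolation, without rewriting any history. `git filter-branch` runs the `--msg-filter` command once per commit, with the original commit message on standard input and the id of the commit being rewritten in `$GIT_COMMIT`. The sketch below wraps the filter body from the thread in a function (the name `append_original_id` is made up here) and feeds it one message by hand, simulating a single commit.

```shell
#!/bin/sh
# The filter body from the thread, wrapped in a function so it can be
# exercised without git. It copies the message through unchanged, then
# appends a blank line and a Trac-style reference to the original commit.
append_original_id() {
    cat - && echo && echo "(original commit: [$GIT_COMMIT/testsuite])"
}

# Simulate what filter-branch provides for one commit: the original id
# in GIT_COMMIT and the original message on stdin.
GIT_COMMIT=323cab22d65ea88410a607ef22db23198c03e305
export GIT_COMMIT
printf '%s\n' 'Use -ddump-strsigs in tests/stranal/sigs' | append_original_id
```

Because `$GIT_COMMIT` refers to the commit as it was before rewriting, the filter only records the old testsuite.git ids if it runs in the first rewrite pass, as noted in the message.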

I'm all for converting to submodules. Since we will have submodules anyway, we could also convert testsuite et al to submodules and see how painful that is before deciding to fold them into the main repo. I'm not clear on the pros/cons of having, e.g., testsuite, as a submodule vs folded in. The submodule approach will certainly maintain history!
Geoff
On 12/04/2013 04:24 PM, Austin Seipp wrote:
Hi all,
While discussing something with Herbert this week in preparation of making a new stable branch, he brought a good point to my attention, which is that if we go ahead and reorganize the repository situation post 7.8, merging things to the stable branch from HEAD will become a bit harder.
Notably, we had planned to fold testsuite (and perhaps some other repositories) into the GHC tree. Once we do this, the two branches will have diverged quite a bit, so merging from HEAD to STABLE will become harder* (because HEAD would have rolled in testsuite changes for example, but the STABLE branch would not have this history.)
Thinking about it, the best time to do such a move is, basically, when there is no active stable branch. Unfortunately this time is right now, but I'm not sure how everyone feels about this.
So, the question is: should we go ahead and pull the trigger on some of these perhaps? Herbert collected some numbers on the git repositories and outlined all the basic details here:
https://ghc.haskell.org/trac/ghc/wiki/GitRepoReorganization
The only thing I'd honestly propose right now is folding 'testsuite' into the main repository, but of course we should see what people think - perhaps we should keep base/etc off the table for now, since they seem more controversial.
* I'll point out they will only become *slightly* harder in most cases, because I can always instead apply unified diffs, rather than cherry pick or something. But it does lose the original metadata from commits too. But I won't cry if people vote against this.

My only concern with this is that we consider the workflow and tooling issues that I outlined in
http://permalink.gmane.org/gmane.comp.lang.haskell.ghc.devel/2718
The main points are making sure the workflow for submodules doesn't have too much friction, that it's integrated nicely into sync-all, and that we avoid the worst pitfall of submodules: losing local changes to a submodule on a git submodule update. I think this means that sync-all pull should do either 'git submodule update --merge' or 'git submodule update --rebase' depending on whether you used sync-all pull --rebase or not. But caveats apply, I've only skimmed the docs and not tried this out for real :-)
OTOH, it's not hard to do manual cherry-picks with git. 'git show <hash> | patch -p1' is something I often do, and 'git commit -C <commit>' reuses the metadata from another commit. So it's probably not essential that we do this now.
Cheers, Simon
On 04/12/2013 21:24, Austin Seipp wrote:
Hi all,
While discussing something with Herbert this week in preparation of making a new stable branch, he brought a good point to my attention, which is that if we go ahead and reorganize the repository situation post 7.8, merging things to the stable branch from HEAD will become a bit harder.
Notably, we had planned to fold testsuite (and perhaps some other repositories) into the GHC tree. Once we do this, the two branches will have diverged quite a bit, so merging from HEAD to STABLE will become harder* (because HEAD would have rolled in testsuite changes for example, but the STABLE branch would not have this history.)
Thinking about it, the best time to do such a move is, basically, when there is no active stable branch. Unfortunately this time is right now, but I'm not sure how everyone feels about this.
So, the question is: should we go ahead and pull the trigger on some of these perhaps? Herbert collected some numbers on the git repositories and outlined all the basic details here:
https://ghc.haskell.org/trac/ghc/wiki/GitRepoReorganization
The only thing I'd honestly propose right now is folding 'testsuite' into the main repository, but of course we should see what people think - perhaps we should keep base/etc off the table for now, since they seem more controversial.
* I'll point out they will only become *slightly* harder in most cases, because I can always instead apply unified diffs, rather than cherry pick or something. But it does lose the original metadata from commits too. But I won't cry if people vote against this.
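The manual cherry-pick mentioned in the reply above ('git show <hash> | patch -p1' followed by 'git commit -C <commit>') could be wrapped up roughly as follows. This is a sketch under assumptions: the function name `manual_pick` is made up, and the `git add -A` staging step is not part of the original recipe; it is added here so that files created by the patch end up in the commit.

```shell
#!/bin/sh
# Sketch of a manual cherry-pick; run from the top of the target
# (e.g. stable) working tree.
manual_pick() {
    # $1: commit id to port; it must be readable by `git show` in this
    # repository (for example after fetching the source branch).
    commit=$1
    # Apply the commit's diff to the working tree. `patch` skips the
    # commit header that `git show` prints before the diff.
    git show "$commit" | patch -p1 || return 1
    # Assumption, not from the thread: stage everything the patch touched,
    # including newly created files.
    git add -A || return 1
    # Reuse author, date and log message from the original commit.
    git commit -C "$commit"
}
```

Note that `git add -A` also stages unrelated untracked files, including any backup files `patch` may leave behind, so a clean working tree beforehand matters. If the paths in the source commit are laid out differently from the target tree, the `-p` level would need adjusting.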

Hello All,
It seems to me, there were no major obstacles left unaddressed in the previous discussion[1] (see summary below) to merging testsuite.git into ghc.git.
So here's one last attempt to get testsuite.git folded into ghc.git before Austin branches off 7.8
Please speak up *now*, if you have any objections to folding testsuite.git into ghc.git *soon* (with *soon* meaning upcoming Sunday, 12th Jan 2014)
----
A summary of the previous thread so far:
- Let's fold testsuite into ghc before branching off 7.8RC
  - ghc/testsuite have the most coupled commits
  - makes it a bit easier to cherry pick ghc/testsuite between branches
  - while being low-risk, will provide empirical value for deciding how to proceed with folding in other Git repos
- Proof of concept in http://git.haskell.org/ghc.git/shortlog/refs/heads/wip/T8545
- general support for it; consensus that it will be beneficial and shouldn't be a huge disruption
- sync-all is adapted to abort operation if `testsuite/.git` is detected, and advises the user to remove it (or move it out of the way)
- Concern about broken commit-refs in Trac and other places:
  - old testsuite.git repo will remain available (more or less) read-only; so old commit-shas will still be resolvable
  - (old) Trac commit-links which work currently will continue to work, as they refer specifically to the testsuite.git repo, and Trac will know they point to the old testsuite.git
  - If one doesn't know which Git repo a commit-id is in, there's still the SHA1 look-up service at http://git.haskell.org/ which will search all repos hosted at git.haskell.org for a commit SHA1 prefix. Or alternatively, just ask google about the SHA1.
- Binary blobs (a few compiled executables) that were committed by accident and removed right away again are removed from history to avoid carrying around useless garbage in the Git history (saves ~20MiB)
- Path names are rewritten to be based in testsuite/, in order to make it easier for Git operations (git log et al.) to follow history for folders/filenames
- Old Commit-ids will *not* be written into the rewritten commits' messages in order not to add noise (old commit ids can be resolved via the remaining old testsuite.git repo)
[1] http://permalink.gmane.org/gmane.comp.lang.haskell.ghc.devel/3099

I'm all for it!
Simon
| -----Original Message-----
| From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of Herbert Valerio Riedel
| Sent: 09 January 2014 10:31
| To: ghc-devs
| Subject: Folding ghc/testsuite repos *now*, 2nd attempt (was: Repository Reorganization Question)
| [...]

+1
On Thu, Jan 9, 2014 at 1:48 PM, Simon Peyton Jones
I'm all for it!
Simon

+1 from me as well.
-- Regards, Austin Seipp, Haskell Consultant Well-Typed LLP, http://www.well-typed.com/
participants (13)
- Austin Seipp
- Ben Gamari
- Carter Schonwald
- Geoffrey Mainland
- Herbert Valerio Riedel
- Herbert Valerio Riedel
- Ian Lynagh
- Isaac Dupree
- Joachim Breitner
- Johan Tibell
- Simon Marlow
- Simon Peyton Jones
- Simon Peyton-Jones