
A quick googling discovered https://githubnext.com/projects/repo-visualization, which has some of the desired features. :)

(CC'ing the author and team mentioned in the article, too.)

Has:
* very visual
* subdirectory breakdown
* filetype breakdown

Doesn't have:
* Separating code from comments
* History is listed under "future work"
* Drop-in support for GitLab (it's presented as a GitHub Action)

The article has an interactive widget you can point at a repo. I pointed it at ghc/ghc, and although my browser is still churning ten minutes later, here's a preview. It's pretty cool!

On 14/06/2022 16:20, Hécate wrote:
I'm taking the liberty of forwarding this to Bryan, as he's in a unique position to help on this front. :)
On 14/06/2022 at 16:18, Simon Peyton Jones wrote:
Thanks Hecate. I used your figures in my talk. Really helpful.
A note to all ghc-devs: it'd be lovely to have a regularly-updated summary visualisation of GHC's source code:
* Separating code from comments
* Broken up by sub-directory
* As visual as possible
* Ideally with some kind of historical time-line ability
This can't be new. Zillions of GitHub repositories could be visualised like this. There must be prior art; probably a lot of it. Can we just press a button and get it?
Simon
On Fri, 10 Jun 2022 at 17:45, Hécate wrote:
If you don't have a nix shell handy, here is what I'm getting:

❯ cloc compiler rts driver
    1148 text files.
    1137 unique files.
     108 files ignored.

github.com/AlDanial/cloc v 1.88  T=1.31 s (794.3 files/s, 431269.4 lines/s)
---------------------------------------------------------------------------------------
Language                             files          blank        comment           code
---------------------------------------------------------------------------------------
Haskell                                635          68541         140216         231567
C                                      158          10529          16953          51162
C/C++ Header                           209           4329           8984          14536
yacc                                     2            971             10           5024
Logos                                    3            530              0           3642
Pascal                                   1            661            936           2312
make                                    14            252            409            850
Windows Module Definition                7             27              0            489
Assembly                                 5             76            269            478
Puppet                                   1            106              0            445
Python                                   1             32             19            162
D                                        1             16             42             60
YAML                                     1              6             10             18
Lisp                                     1              2              4              7
Windows Resource File                    1              0              0              1
---------------------------------------------------------------------------------------
SUM:                                  1040          86078         167852         310753
---------------------------------------------------------------------------------------
On 10/06/2022 at 18:29, chessai wrote:
You might be able to do something with cloc and a shell script for a rough estimate.
```
$ cd ghc
$ nix-shell -p cloc --run "cloc ."
```
will output a detailed report of the LOC and language breakdown of the top-level ghc directory (it is comment-aware and aware of many languages). There might be a way to get cloc or a similar tool to output something more inspectable (e.g. JSON), and then use a shell script to gather everything from the appropriate directories/files.
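A rough sketch of the kind of shell script suggested here. It assumes cloc's `--csv` and `--quiet` options and a row layout of `files,language,blank,comment,code`; CSV is used rather than JSON only so that plain awk can do the gathering. The names `sum_cloc_csv`, `summarise_dirs` and the `CLOC` variable are made up for illustration.

```shell
#!/bin/sh
# Sketch: one summary row per directory, built from `cloc --csv` output.
# CLOC may be overridden, e.g. to point at a specific cloc binary.
CLOC="${CLOC:-cloc}"

# Sum the per-language rows of one cloc CSV report read from stdin.
# Data rows are assumed to look like: files,language,blank,comment,code
# The header, blank lines and cloc's own SUM row are skipped.
sum_cloc_csv() {
  awk -F, '
    $1 ~ /^[0-9]+$/ && $2 != "SUM" {
      files += $1; blank += $3; comment += $4; code += $5
    }
    END { printf "%d,%d,%d,%d\n", files, blank, comment, code }
  '
}

# Print a CSV table with one row per directory given on the command line.
summarise_dirs() {
  echo "directory,files,blank,comment,code"
  for dir in "$@"; do
    printf '%s,%s\n' "$dir" "$("$CLOC" --csv --quiet "$dir" | sum_cloc_csv)"
  done
}
```

From the root of a GHC checkout, something like `summarise_dirs compiler rts driver` should then give one line per directory, which is easy to paste into a spreadsheet or plot.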
I suspect something could be hacked up in less than a day, but it would require a bit of research. Hopefully this is helpful and gets you going - I'd be happy to hear of better solutions.
Thanks
On Fri, Jun 10, 2022, 11:20 Simon Peyton Jones wrote:
Dear GHC devs
Is it possible to get a "lines-of-code" summary of GHC these days? Like the one below, from 2011.
It needs more than `wc` because it's helpful to split lines of code from lines of comments and notes.
We used to have `count_lines` but I'm not sure whether it is still extant.
I'm giving a talk at Zurihac on Sunday morning, about the internals of GHC. Any data before then, preferably in a form comparable to that below, would be terrific.
But you have a lot else to do. This isn't do-or-die, just nice to have.
Thanks
Simon
image.png
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
-- Hécate ✨ 🐦: @TechnoEmpress IRC: Hecate WWW:https://glitchbra.in RUN: BSD

That's really cool, thanks Bryan.

On Tue, 14 Jun 2022 at 17:04, Bryan Richter via ghc-devs <ghc-devs@haskell.org> wrote:
A quick googling discovered https://githubnext.com/projects/repo-visualization, which has some of the desired features. :)

Indeed cool.
- Can it do numeric breakdowns too? Like I needed for my talk?
- Can it distinguish code from comments?
Simon
On Tue, 14 Jun 2022 at 16:04, Bryan Richter wrote:
A quick googling discovered https://githubnext.com/projects/repo-visualization, which has some of the desired features. :)

I tried to run `cloc` in the `compiler` folder and got this. Not sure how to further break down the compiler into smaller modules.

```
➜ ghc git:(master) ✗ cloc compiler
     723 text files.
     655 unique files.
      68 files ignored.

github.com/AlDanial/cloc v 1.92  T=0.64 s (1025.4 files/s, 716024.5 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Haskell                        635          68541         147036         224747
yacc                             2            971             10           5024
Logos                            3            530              0           3642
Pascal                           1            661            936           2312
C/C++ Header                     7            187            304           1512
Puppet                           1            106              0            445
make                             2             47             84            187
C                                3             11             16             37
YAML                             1              6             10             18
-------------------------------------------------------------------------------
SUM:                           655          71060         148396         237924
-------------------------------------------------------------------------------
```

-Haisheng

On Tue, Jun 14, 2022 at 9:14 AM Simon Peyton Jones <simon.peytonjones@gmail.com> wrote:
Indeed cool.
- Can it do numeric breakdowns too? Like I needed for my talk?
- Can it distinguish code from comments?
Simon
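On the question of breaking `compiler` down into smaller modules: one possible approach, sketched here under assumptions, is to ask cloc for per-file counts (`--by-file --csv`) and group them by their leading directory components. The helper name `by_subdir` is hypothetical, the row layout `language,filename,blank,comment,code` is assumed from cloc's CSV output, and file names containing commas would confuse this simple parser.

```shell
#!/bin/sh
# Sketch: group `cloc --by-file --csv` rows by the first DEPTH directories.
# Rows are assumed to look like: language,filename,blank,comment,code
# Output rows are: directory,files,blank,comment,code
by_subdir() {
  depth="${1:-2}"
  awk -F, -v depth="$depth" '
    # Keep only data rows: numeric code column and a non-empty file name
    # (this skips the header and the SUM row).
    $5 ~ /^[0-9]+$/ && $2 != "" {
      n = split($2, parts, "/")
      # The last component is the file itself, so use at most n-1 directories.
      limit = (n - 1 < depth) ? n - 1 : depth
      key = (limit < 1) ? "." : parts[1]
      for (i = 2; i <= limit; i++) key = key "/" parts[i]
      files[key]++; blank[key] += $3; comment[key] += $4; code[key] += $5
    }
    END {
      for (k in files)
        printf "%s,%d,%d,%d,%d\n", k, files[k], blank[k], comment[k], code[k]
    }
  ' | LC_ALL=C sort
}
```

Run from the root of a checkout, `cloc --by-file --csv --quiet compiler | by_subdir 3` should give one row per directory such as `compiler/GHC/Core`, with code and comment lines kept separate; raising or lowering the depth argument changes how fine the breakdown is.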

Bryan Richter via ghc-devs writes:
A quick googling discovered https://githubnext.com/projects/repo-visualization, which has some of the desired features. :)

Another somewhat-related visualisation tool that can produce some pretty pictures is gource [1]. I wouldn't call the output "useful" per se, but it is mildly amusing to see the avatars frenetically flying about the source tree. It gives you a sense of just how many people are responsible for building the GHC that we know and love.

Here [2] is a rendering of ghc's history. Best to skip the first minute or so, which is largely just Will Partain setting things up. Things really start to pick up around 2012 (around the 6 minute mark); it's truly dizzying. Happily, this momentum has persisted to this day.

Cheers,
- Ben

[1] https://gource.io/
[2] http://home.smart-cactus.org/~ben/ghc/gource-2022-06-14.mkv

Hi Simon,
If you prefer to use the old `count_lines`, you can retrieve it like this:
git show 0cd989577a8b8d2666741fcac4fd3032ae212b80^:utils/count_lines/count_lines.pl >/tmp/count_lines.pl
It appears to need a list of files. The output is quite busy (not quite
like your nice summary above). For example, the output of
perl /tmp/count_lines.pl $(find compiler -name '*.hs')
on `master` is
https://gist.github.com/steshaw/b636fb76c805bfa0fff7484f5da11ed6
Perhaps if you recall how you used `count_lines` in the past, you can make
an accurate comparison!
Cheers,
Steve
On Wed, 15 Jun 2022 at 04:15, Ben Gamari wrote:
Another somewhat-related visualisation tool that can produce some pretty pictures is gource [1]. I wouldn't call the output "useful" per se, but it is mildly amusing to see the avatars frenetically flying about the source tree. It gives you a sense of just how many people are responsible for building the GHC that we know and love.
participants (6)
- Ben Gamari
- Bryan Richter
- Haisheng Wu
- Sam Derbyshire
- Simon Peyton Jones
- Steven Shaw