RFC: style cleanup & guidelines for GHC, and related bikeshedding

From these numbers, we can see a few useful things at least, primarily
Hi *, First off, WARNING: BIKESHEDDING AHEAD. With that out of the way - today on IRC, there was some discussion about some stylistic/consistency issues in GHC, and being spurred by Johans recent proposal for top-level documentation, I figured perhaps we should beat the drum on this issue as well. The TL;DR is that GHC has a lot of inconsistent style issues, including things like: - Mixing literate haskell with non-literate haskell files - Legacy code with tabs and spaces intermixed - Related to the last one, trailing whitespace - Mixing styles of do notation in different parts of the compiler (braces vs no braces) - Probably things like indentation mismatches even in the same code - Probably many other things I've missed, obvious or not. These issues by themselves aren't too bad, but together they make the coding style for GHC very inconsistent, and this hurts maintainability a bit I feel. Furthermore, some of these issues block related improvements - for example, https://ghc.haskell.org/trac/ghc/ticket/9230 which is probably quite reasonable will likely be a bit annoying to implement until GHC itself is de-tabbed - we use -Werror during ./validate. This particular issue is what started the discussion. Also, with developers now using arcanist and phabricator, they have linting enabled for new patches, but they will often warn about surrounding issues, mostly tabs and trailing spaces. This is a bit annoying for submitters, and would be fixed by enforcing it. First attack plan ~~~~~~~~~~~~~~~ So, to start, I'd like to propose that we make some guidelines for these kinds of things, and also a plan to fix some of them. To start: #1) We should really consider going ahead and detabbing the remaining files that have them. We already enforce this on new commits with git hooks, but by doing this, we can make -fwarn-tabs a default flag and then validate with -Werror in the development process. #2) Similarly, we should kill all the trailing whitespace. (I think this is less controversial than #1) #3) We should most certainly move the remaining files from literate haskell to non-literate haskell. Most of the files in the compiler are already in this form, and the literate haskell documentation can't be used to generate PDFs or anything similar. I suggest we get rid of it. More Haskell users use non-literate files anyway. This is probably the least controversial. Merge issues ~~~~~~~~~~~~~~~~~ The reason we haven't done the above three things historically is that it makes merge conflicts nastier. A useful approximation suggested on IRC might be to detab and remove whitespace for files older than a certain date (say, 6 months). However, in general I'm thinking perhaps it's best to go ahead and bite the bullet. maybe. I'd like to know what other people think! If we have a vote and most people are in favor of doing this, maybe we should really do it. I'd especially like to hear about this if you have an outstanding branch. Some numbers on these issues ~~~~~~~~~~~~~~~~~~~~~~~~ Here are some quick numbers on where most of the tabs reside, as well as the breakdown of literate files vs non-literate files. NOTE: these tests occurred in the 'compiler' subdirectory of the GHC repository, which is where most of the relevant code is. LITERATE vs NON-LITERATE: $ find . -type f -iname '*.hs' | wc -l 206 $ find . -type f -iname '*.lhs' | wc -l 194 Non-literate wins by a slim margin! But having the compiler divided in half is really not a good thing IMO... NUMBER OF TABS PER SUBDIRECTORY: NOTE: this counts the number of lines which have tabs in them. It does not count the total number of tab occurrences. $ for x in `echo */`; do echo -n "$x:\t\t"; find $x -type f -regex '.*\.\(lhs\|hs\)' | xargs grep -P '\t' | wc -l; done basicTypes/: 919 cbits/: 0 cmm/: 38 codeGen/: 0 coreSyn/: 843 deSugar/: 545 ghci/: 90 hsSyn/: 120 iface/: 213 llvmGen/: 0 main/: 8 nativeGen/: 1213 parser/: 19 prelude/: 182 profiling/: 39 rename/: 188 simplCore/: 754 simplStg/: 0 specialise/: 0 stgSyn/: 0 stranal/: 336 typecheck/: 1171 types/: 301 utils/: 220 vectorise/: 0 that there are definitely some places where removing tabs should be easy. For example, parser/, profiling/, main/, and cmm/ can all be de-tabbed without much of a problem, I think. nativeGen is very often not touched, so even though it has a *huge* amount of tabs, it can likely be de-tabbed as well with minimal impact. Other style issues ~~~~~~~~~~~~~~~~~ We should also discuss some related issues, like what general block-width to use for indentations, naming conventions, and other stuff. However, I leave this all to you, and perhaps it is best we split that part off into a separate thread. Some things I'd like you all to consider: - Block width for indentation - Naming conventions (we use camelCase and_underscores_sometimes which isReally_confusing) - Import/export styles (I think we have some sloppiness here too) - Other things worth arguing forever about. Thoughts on the above issues? -- Regards, Austin Seipp, Haskell Consultant Well-Typed LLP, http://www.well-typed.com/

On Wed, 2 Jul 2014 14:59:13 -0500
Austin Seipp
Hi *,
First off, WARNING: BIKESHEDDING AHEAD.
With that out of the way - today on IRC, there was some discussion about some stylistic/consistency issues in GHC, and being spurred by Johans recent proposal for top-level documentation, I figured perhaps we should beat the drum on this issue as well.
The TL;DR is that GHC has a lot of inconsistent style issues, including things like:
- Mixing literate haskell with non-literate haskell files - Legacy code with tabs and spaces intermixed - Related to the last one, trailing whitespace - Mixing styles of do notation in different parts of the compiler (braces vs no braces) - Probably things like indentation mismatches even in the same code - Probably many other things I've missed, obvious or not.
I'd vote for detabbing/un-.lhs-ing in one go and ASAP. Fore example, this weekend. There will always be people creating derivative work on top of current source tree. I, for example, didn't dare to detab rts bits to make patches readable. Detabbing commits are very easy to check in git: 'git show -w' (should output nothing). It's a pain for current patches, but I'd say it's bearable. The rest of changes are harder to achieve, but should be quite easy to maintain with hlint forcing one rule (everyone agree on) at a time. -- Sergei

I have mixed feelings on all of this.
First, a disclaimer: I have a significant (~10,000 lines of difference, perhaps) branch and would be hit hard by this change. (Branch is at github.com/goldfirere/ghc under the "nokinds" branch.) That said, if I'm careful as I'm merging, I could probably make this less painful by merging up to the commit right before the changeover, then merging just whitespace changes, then merging everything afterward. It probably wouldn't be much worse than merges are for me, anyway. (They're already terrible, but that's strictly my problem.)
But, I'm not sure of the practical benefits of doing any of this. The aesthete in me really wants this change to happen -- these inconsistencies and the tabs are surely a blemish. At the same time, I don't think my understanding of the code has ever really been hindered by these problems. If anything, I think a rigid style guide would slow me down a bit, because the perfectionist in me would require making sure all my code conformed well. Now, I try to have my code match the surrounding context, but I can see that it's not critical I get it bang on. So, addressing advocates of this change: why do you want it? Is it just for the sake of beauty (not to diminish the importance of beauty)? Or do the problems outlines here properly trip you up?
If the answers are mostly about beauty, that doesn't kill the proposal, but it allows us to evaluate the pros and cons a little more crisply.
Richard
On Jul 2, 2014, at 3:59 PM, Austin Seipp
Hi *,
First off, WARNING: BIKESHEDDING AHEAD.
With that out of the way - today on IRC, there was some discussion about some stylistic/consistency issues in GHC, and being spurred by Johans recent proposal for top-level documentation, I figured perhaps we should beat the drum on this issue as well.
The TL;DR is that GHC has a lot of inconsistent style issues, including things like:
- Mixing literate haskell with non-literate haskell files - Legacy code with tabs and spaces intermixed - Related to the last one, trailing whitespace - Mixing styles of do notation in different parts of the compiler (braces vs no braces) - Probably things like indentation mismatches even in the same code - Probably many other things I've missed, obvious or not.
These issues by themselves aren't too bad, but together they make the coding style for GHC very inconsistent, and this hurts maintainability a bit I feel. Furthermore, some of these issues block related improvements - for example, https://ghc.haskell.org/trac/ghc/ticket/9230 which is probably quite reasonable will likely be a bit annoying to implement until GHC itself is de-tabbed - we use -Werror during ./validate. This particular issue is what started the discussion.
Also, with developers now using arcanist and phabricator, they have linting enabled for new patches, but they will often warn about surrounding issues, mostly tabs and trailing spaces. This is a bit annoying for submitters, and would be fixed by enforcing it.
First attack plan ~~~~~~~~~~~~~~~
So, to start, I'd like to propose that we make some guidelines for these kinds of things, and also a plan to fix some of them. To start:
#1) We should really consider going ahead and detabbing the remaining files that have them. We already enforce this on new commits with git hooks, but by doing this, we can make -fwarn-tabs a default flag and then validate with -Werror in the development process.
#2) Similarly, we should kill all the trailing whitespace. (I think this is less controversial than #1)
#3) We should most certainly move the remaining files from literate haskell to non-literate haskell. Most of the files in the compiler are already in this form, and the literate haskell documentation can't be used to generate PDFs or anything similar. I suggest we get rid of it. More Haskell users use non-literate files anyway. This is probably the least controversial.
Merge issues ~~~~~~~~~~~~~~~~~
The reason we haven't done the above three things historically is that it makes merge conflicts nastier. A useful approximation suggested on IRC might be to detab and remove whitespace for files older than a certain date (say, 6 months).
However, in general I'm thinking perhaps it's best to go ahead and bite the bullet. maybe. I'd like to know what other people think! If we have a vote and most people are in favor of doing this, maybe we should really do it.
I'd especially like to hear about this if you have an outstanding branch.
Some numbers on these issues ~~~~~~~~~~~~~~~~~~~~~~~~
Here are some quick numbers on where most of the tabs reside, as well as the breakdown of literate files vs non-literate files.
NOTE: these tests occurred in the 'compiler' subdirectory of the GHC repository, which is where most of the relevant code is.
LITERATE vs NON-LITERATE:
$ find . -type f -iname '*.hs' | wc -l 206
$ find . -type f -iname '*.lhs' | wc -l 194
Non-literate wins by a slim margin! But having the compiler divided in half is really not a good thing IMO...
NUMBER OF TABS PER SUBDIRECTORY:
NOTE: this counts the number of lines which have tabs in them. It does not count the total number of tab occurrences.
$ for x in `echo */`; do echo -n "$x:\t\t"; find $x -type f -regex '.*\.\(lhs\|hs\)' | xargs grep -P '\t' | wc -l; done basicTypes/: 919 cbits/: 0 cmm/: 38 codeGen/: 0 coreSyn/: 843 deSugar/: 545 ghci/: 90 hsSyn/: 120 iface/: 213 llvmGen/: 0 main/: 8 nativeGen/: 1213 parser/: 19 prelude/: 182 profiling/: 39 rename/: 188 simplCore/: 754 simplStg/: 0 specialise/: 0 stgSyn/: 0 stranal/: 336 typecheck/: 1171 types/: 301 utils/: 220 vectorise/: 0
From these numbers, we can see a few useful things at least, primarily that there are definitely some places where removing tabs should be easy. For example, parser/, profiling/, main/, and cmm/ can all be de-tabbed without much of a problem, I think.
nativeGen is very often not touched, so even though it has a *huge* amount of tabs, it can likely be de-tabbed as well with minimal impact.
Other style issues ~~~~~~~~~~~~~~~~~
We should also discuss some related issues, like what general block-width to use for indentations, naming conventions, and other stuff. However, I leave this all to you, and perhaps it is best we split that part off into a separate thread. Some things I'd like you all to consider:
- Block width for indentation - Naming conventions (we use camelCase and_underscores_sometimes which isReally_confusing) - Import/export styles (I think we have some sloppiness here too) - Other things worth arguing forever about.
Thoughts on the above issues?
-- Regards,
Austin Seipp, Haskell Consultant Well-Typed LLP, http://www.well-typed.com/ _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

Hello,
I also have somewhat mixed feelings about this: yes, it would be nice to
get rid of tabs, trailing space, and perhaps even literal files but,
honestly, I don't think that they are any kind of serious blocker to
working on the compiler. Otoh, I'd have to do a bunch of mergining also,
and I am really not looking forward to that... The only practical
benefit for me would be, possibly, better syntax highlighting in `vim`,
which is pretty terrible at handling the \begin\end style literate files.
So, overall, I kind of like our current policy of fixing tabs and
white-space when we have to modify the code anyway.
Turning literate to non-literate files would make relatively few changes to
the actual file (i.e. add {- -} here and there), but the names would be
different. Does anyone have experience with git and this sort of change
(i.e., would it be able to work out what happened, or would manual
intervention be required)?
Cheers,
-Iavor
On Wed, Jul 2, 2014 at 5:32 PM, Richard Eisenberg
I have mixed feelings on all of this.
First, a disclaimer: I have a significant (~10,000 lines of difference, perhaps) branch and would be hit hard by this change. (Branch is at github.com/goldfirere/ghc under the "nokinds" branch.) That said, if I'm careful as I'm merging, I could probably make this less painful by merging up to the commit right before the changeover, then merging just whitespace changes, then merging everything afterward. It probably wouldn't be much worse than merges are for me, anyway. (They're already terrible, but that's strictly my problem.)
But, I'm not sure of the practical benefits of doing any of this. The aesthete in me really wants this change to happen -- these inconsistencies and the tabs are surely a blemish. At the same time, I don't think my understanding of the code has ever really been hindered by these problems. If anything, I think a rigid style guide would slow me down a bit, because the perfectionist in me would require making sure all my code conformed well. Now, I try to have my code match the surrounding context, but I can see that it's not critical I get it bang on. So, addressing advocates of this change: why do you want it? Is it just for the sake of beauty (not to diminish the importance of beauty)? Or do the problems outlines here properly trip you up?
If the answers are mostly about beauty, that doesn't kill the proposal, but it allows us to evaluate the pros and cons a little more crisply.
Richard
On Jul 2, 2014, at 3:59 PM, Austin Seipp
wrote: Hi *,
First off, WARNING: BIKESHEDDING AHEAD.
With that out of the way - today on IRC, there was some discussion about some stylistic/consistency issues in GHC, and being spurred by Johans recent proposal for top-level documentation, I figured perhaps we should beat the drum on this issue as well.
The TL;DR is that GHC has a lot of inconsistent style issues, including things like:
- Mixing literate haskell with non-literate haskell files - Legacy code with tabs and spaces intermixed - Related to the last one, trailing whitespace - Mixing styles of do notation in different parts of the compiler (braces vs no braces) - Probably things like indentation mismatches even in the same code - Probably many other things I've missed, obvious or not.
These issues by themselves aren't too bad, but together they make the coding style for GHC very inconsistent, and this hurts maintainability a bit I feel. Furthermore, some of these issues block related improvements - for example, https://ghc.haskell.org/trac/ghc/ticket/9230 which is probably quite reasonable will likely be a bit annoying to implement until GHC itself is de-tabbed - we use -Werror during ./validate. This particular issue is what started the discussion.
Also, with developers now using arcanist and phabricator, they have linting enabled for new patches, but they will often warn about surrounding issues, mostly tabs and trailing spaces. This is a bit annoying for submitters, and would be fixed by enforcing it.
First attack plan ~~~~~~~~~~~~~~~
So, to start, I'd like to propose that we make some guidelines for these kinds of things, and also a plan to fix some of them. To start:
#1) We should really consider going ahead and detabbing the remaining files that have them. We already enforce this on new commits with git hooks, but by doing this, we can make -fwarn-tabs a default flag and then validate with -Werror in the development process.
#2) Similarly, we should kill all the trailing whitespace. (I think this is less controversial than #1)
#3) We should most certainly move the remaining files from literate haskell to non-literate haskell. Most of the files in the compiler are already in this form, and the literate haskell documentation can't be used to generate PDFs or anything similar. I suggest we get rid of it. More Haskell users use non-literate files anyway. This is probably the least controversial.
Merge issues ~~~~~~~~~~~~~~~~~
The reason we haven't done the above three things historically is that it makes merge conflicts nastier. A useful approximation suggested on IRC might be to detab and remove whitespace for files older than a certain date (say, 6 months).
However, in general I'm thinking perhaps it's best to go ahead and bite the bullet. maybe. I'd like to know what other people think! If we have a vote and most people are in favor of doing this, maybe we should really do it.
I'd especially like to hear about this if you have an outstanding branch.
Some numbers on these issues ~~~~~~~~~~~~~~~~~~~~~~~~
Here are some quick numbers on where most of the tabs reside, as well as the breakdown of literate files vs non-literate files.
NOTE: these tests occurred in the 'compiler' subdirectory of the GHC repository, which is where most of the relevant code is.
LITERATE vs NON-LITERATE:
$ find . -type f -iname '*.hs' | wc -l 206
$ find . -type f -iname '*.lhs' | wc -l 194
Non-literate wins by a slim margin! But having the compiler divided in half is really not a good thing IMO...
NUMBER OF TABS PER SUBDIRECTORY:
NOTE: this counts the number of lines which have tabs in them. It does not count the total number of tab occurrences.
$ for x in `echo */`; do echo -n "$x:\t\t"; find $x -type f -regex '.*\.\(lhs\|hs\)' | xargs grep -P '\t' | wc -l; done basicTypes/: 919 cbits/: 0 cmm/: 38 codeGen/: 0 coreSyn/: 843 deSugar/: 545 ghci/: 90 hsSyn/: 120 iface/: 213 llvmGen/: 0 main/: 8 nativeGen/: 1213 parser/: 19 prelude/: 182 profiling/: 39 rename/: 188 simplCore/: 754 simplStg/: 0 specialise/: 0 stgSyn/: 0 stranal/: 336 typecheck/: 1171 types/: 301 utils/: 220 vectorise/: 0
From these numbers, we can see a few useful things at least, primarily that there are definitely some places where removing tabs should be easy. For example, parser/, profiling/, main/, and cmm/ can all be de-tabbed without much of a problem, I think.
nativeGen is very often not touched, so even though it has a *huge* amount of tabs, it can likely be de-tabbed as well with minimal impact.
Other style issues ~~~~~~~~~~~~~~~~~
We should also discuss some related issues, like what general block-width to use for indentations, naming conventions, and other stuff. However, I leave this all to you, and perhaps it is best we split that part off into a separate thread. Some things I'd like you all to consider:
- Block width for indentation - Naming conventions (we use camelCase and_underscores_sometimes which isReally_confusing) - Import/export styles (I think we have some sloppiness here too) - Other things worth arguing forever about.
Thoughts on the above issues?
-- Regards,
Austin Seipp, Haskell Consultant Well-Typed LLP, http://www.well-typed.com/ _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

I fully support Austin's proposal. My eyes hurt when I work on 5 files and each of them is written in a different style. Now, to address a few points that were raised.
Is it just for the sake of beauty (not to diminish the importance of beauty)?
* I believe that trailing whitespaces are a practical issue: they are invisible to the human eye (unless you have (setq-default show-trailing-whitespace t) \n (setq-default indicate-empty-lines t) in your .emacs file) but carry a semantic meaning for git and other version control systems. This means that accidental removal of a trailing whitespace - which can and will happen if you don't highlight them - will lead to false changes in the diff. That said, simply removing existing trailing whitespaces is not enough - we would need a way to keep them from reappearing. Sadly, this idea was rejected: http://www.haskell.org/pipermail/ghc-devs/2013-August/002074.html Git has some cool tools (like git diff --check) that aid programmer in dealing with trailing whitespaces, but not everyone uses them. Here's a relatively recent example of a new trailing whitespace sneaking into the source code: https://github.com/ghc/ghc/commit/ce19d5079ea85d3190e837a1fc60000fbd82134d#d... So without a way to enforce this policy removing trailing whitespaces doesn't seem to make sense. * Tabs will become a practical problem once #9230 is fixed. Also, tab width can be custom-set to whatever value in an editor, which can trip some people. I for one used to have tab width set to 2 (unlike the default of 4 or 8) but gave up, because too many tab-formatted things were displayed incorrectly. * I strongly support turning lhs files into hs files. This is practical and explained below. From my point of view other things are just pure aesthetics. ====== I'm not sure about the style of do-notation. When I first saw do { ... ; ... } I thought it was terrible but now, after some time, I see the merit of it. Consider this code snippet from StgCmmBind: \(_offset, node, arg_regs) -> do { (...) ; withSelfLoop (bndr, loop_header_id, arg_regs) $ do { (...) ; entryHeapCheck cl_info node' arity arg_regs $ do { (...) }}} I believe that without { ... ; ... } these nested dos would have to be indented, which wouldn't aid readability IMO. I agree that we should use one style of the do-notation but I'm not sure which one should that be. I'd favor the { ... ; ... } one. =====
So, overall, I kind of like our current policy of fixing tabs and white-space when we have to modify the code anyway. I proposed to remove tabs from the source code 1,5 year ago:
http://www.haskell.org/pipermail/ghc-devs/2013-January/000053.html Back then my conclusion was that most of the files that contain tabs were modified in the previous 6 months. This means that people are not following the policy to remove tabs when they modify the file (or at least they weren't following that policy back then). So this policy seems to be only theory. ====== One more style issue that was not mentioned by Austin are block comments: I strongly advocate removal of all block comments in favour of single-line ones. Why? Here's a reason: {- Note [Blah blah] ~~~~~~~~~~~~ ... Code snippet: foo :: Foo -> Bar foo = ... -} My primary method of finding things in the source code is grepping and I believe that many people do the same. With block comments it is impossible to tell whether the line found by grep is part of the comment. With line comments it is immediately obvious. Moreover, if all comments were line comments it would be really easy to grep for something only in the comments. Currently this is impossible. This is also true for lhs files and for that reason I also consider lhs files to be practical issue. Another reason (albeit minor) to have line comments instead of block comments is that with the former style each comment is independent and can be freely moved around. With block comments we might need to create extra comment blocks when moving comments around. ====== Now, I understand people who don't want such change because of merge conflicts. But the truth is there will never be a good moment to implement such changes because there is always some ongoing work and outstanding branches. So I believe we should think whether these changes move us in a good direction or not and if we decide that these changes are a Good Thing - which I believe they are - we should bite the bullet. Otherwise we will have mess in the source code forever. Since I'm advocating strongly for these I am of course willing to put my work into making this happen. So far I've assigned #9230 to myself and if we agree to detab all the source code I can be the person to do it. Janek PS. I am proud of myself, because I wrote a mail as long as Austin's, which I always thought was impossible :-)

Hi, Am Donnerstag, den 03.07.2014, 10:44 +0200 schrieb Jan Stolarek:
Now, I understand people who don't want such change because of merge conflicts. But the truth is there will never be a good moment to implement such changes because there is always some ongoing work and outstanding branches.
when I first looked at GHC code I also thought “ugh, ugly”. But I can cope, it does not actually hinder me while working on GHC. On the other hand, having a “detab and rename” horizon where merging patches from before is much harder, and where "git log -L" and "git blame" fail to work properly would be a hindrance. Also, backporting patches from GHC HEAD to distribution releases would become annoying, for at least one release cycle. So my conclusion is that it’s ok to have the mess in the source code forever. Greetings, Joachim -- Joachim “nomeata” Breitner mail@joachim-breitner.de • http://www.joachim-breitner.de/ Jabber: nomeata@joachim-breitner.de • GPG-Key: 0xF0FBF51F Debian Developer: nomeata@debian.org

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 On 03/07/14 11:13, Joachim Breitner wrote:
So my conclusion is that it?s ok to have the mess in the source code forever. I mostly agree. The only way to handle this that I can presently identify is to agree on some guidelines, put them up on the Wiki, enforce them socially (maybe technologically as well, using e.g. Phabricator) with code review. I.e. make sure to not make the source *uglier* at least. Anything more drastic than this will get in everyone's way.
Of course if something sufficiently drastic has to happen anyhow (AMP etc.), then this is an apt opportunity to beautifying the related code. And if you are changing 90% of a file for some reason, it probably doesn't hurt to detab it as well, etc. - -- Alexander alexander@plaimi.net https://secure.plaimi.net/~alexander -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iF4EAREIAAYFAlO1bJYACgkQRtClrXBQc7Wp8wEAjdnJRecFseuhW5Vxd41N7Z6f E0+PqXXpU/T8oiYuzt4A/1+7imYJOt1vsaRhj+HvSCvM5SoRk7T1M8Aoxh3QEOsC =CeaG -----END PGP SIGNATURE-----

And if you are changing 90% of a file for some reason, it probably doesn't hurt to detab it as well, etc. I think the reason we still have tabs in the source code is that people usually don't change 90% of a file, but 5% or something like that and they feel this is not enough to justify detabing of a whole file.
Janek

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 On 03/07/14 17:38, Jan Stolarek wrote:
I think the reason we still have tabs in the source code is that people usually don't change 90% of a file, but 5% or something like that and they feel this is not enough to justify detabing of a whole file. Which is fine, in my opinion.
Alexander alexander@plaimi.net https://secure.plaimi.net/~alexander -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iF4EAREIAAYFAlO1eTcACgkQRtClrXBQc7WqzwD+NdW5OqYSLpIlMgbOBI3grR5i EpKPA2+SWboRvdB1iAIBAKBn4lZ7CaofQ2YOnXpjX0eV3kauOxBZ27YpaQ2xQ11i =QDfM -----END PGP SIGNATURE-----

On Thu, Jul 3, 2014 at 4:13 AM, Joachim Breitner
On the other hand, having a “detab and rename” horizon where merging patches from before is much harder, and where "git log -L" and "git blame" fail to work properly would be a hindrance.
Minor point, but you can use "git blame -w" to tell blame to ignore whitespace changes and show you the last commit that actually changed the code.

Just to say that * In general I don't have a strong opinion about these stylistic issues. Moreover I have little bikeshed time, and if I don't contribute to a debate I can't expect to influence it much. So I'm mostly happy to accept a consensus view. However, some thoughts * I don't think we should be over-prescriptive. Eg personally I like a do-notation style with semicolons at the beginning, eg do { blah ; x <- foo ; etc } but I'm disinclined to force everyone to do so, and I don't want to be forced to do something different myself. I can adapt to the style of the author here. * A *primary* form of consumption is the source code itself. I've found that Haddock-compliant comments can be rather less readable in source code. (Eg. CoreSyn.lhs where the #blah# notation coexists uneasily with Note [blah].) So I'd be nervous of mandating Haddock-compliance. * Insisting on line comments exclusively, carries a cost. The on-screen distraction of line comments, and the nuisance of writing them, is not trivial. Perhaps it is bearable, but it's non-zero. See example below. Simon With line comments -- Note [Extra dependencies from .hs-boot files] -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -- Consider the following case: -- -- module A where -- import B -- data A1 = A1 B1 -- -- module B where -- import {-# SOURCE #-} A -- type DisguisedA1 = A1 -- data B1 = B1 DisguisedA1 -- -- We do not follow type synonyms when building the dependencies for each datatype, -- so we will not find out that B1 really depends on A1 (which means it depends on -- itself). To handle this problem, at the moment we add dependencies to everything -- that comes from an .hs-boot file. But we don't add those dependencies to -- everything. Imagine module B above had another datatype declaration: -- -- data B2 = B2 Int -- -- Even though B2 has a dependency (on Int), all its dependencies are from things -- that live on other packages. Since we don't have mutual dependencies across -- packages, it is safe not to add the dependencies on the .hs-boot stuff to B2. -- -- See also Note [Grouping of type and class declarations] in TcTyClsDecls. With block comments Note [Extra dependencies from .hs-boot files] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Consider the following case: module A where import B data A1 = A1 B1 module B where import {-# SOURCE #-} A type DisguisedA1 = A1 data B1 = B1 DisguisedA1 We do not follow type synonyms when building the dependencies for each datatype, so we will not find out that B1 really depends on A1 (which means it depends on itself). To handle this problem, at the moment we add dependencies to everything that comes from an .hs-boot file. But we don't add those dependencies to everything. Imagine module B above had another datatype declaration: data B2 = B2 Int Even though B2 has a dependency (on Int), all its dependencies are from things that live on other packages. Since we don't have mutual dependencies across packages, it is safe not to add the dependencies on the .hs-boot stuff to B2. See also Note [Grouping of type and class declarations] in TcTyClsDecls. | -----Original Message----- | From: ghc-devs [mailto:ghc-devs-bounces@haskell.org] On Behalf Of Jan | Stolarek | Sent: 03 July 2014 09:45 | To: ghc-devs@haskell.org | Subject: Re: RFC: style cleanup & guidelines for GHC, and related | bikeshedding | | I fully support Austin's proposal. My eyes hurt when I work on 5 files | and each of them is written | in a different style. | | Now, to address a few points that were raised. | | > Is it just for the sake of beauty (not to diminish the importance of | beauty)? | | * I believe that trailing whitespaces are a practical issue: they are | invisible to the human eye | (unless you have (setq-default show-trailing-whitespace t) \n (setq- | default indicate-empty-lines | t) in your .emacs file) but carry a semantic meaning for git and other | version control systems. | This means that accidental removal of a trailing whitespace - which can | and will happen if you | don't highlight them - will lead to false changes in the diff. | That said, simply removing existing trailing whitespaces is not enough - | we would need a way to | keep them from reappearing. Sadly, this idea was rejected: | http://www.haskell.org/pipermail/ghc-devs/2013-August/002074.html | Git has some cool tools (like git diff --check) that aid programmer in | dealing with trailing | whitespaces, but not everyone uses them. Here's a relatively recent | example of a new trailing | whitespace sneaking into the source code: | | https://github.com/ghc/ghc/commit/ce19d5079ea85d3190e837a1fc60000fbd82134 | d#diff-ababf87bf3da1f44484a901a8fbc0eb6R388 | | So without a way to enforce this policy removing trailing whitespaces | doesn't seem to make sense. | | * Tabs will become a practical problem once #9230 is fixed. Also, tab | width can be custom-set to | whatever value in an editor, which can trip some people. I for one used | to have tab width set to | 2 (unlike the default of 4 or 8) but gave up, because too many tab- | formatted things were | displayed incorrectly. | | * I strongly support turning lhs files into hs files. This is practical | and explained below. | | From my point of view other things are just pure aesthetics. | | ====== | | I'm not sure about the style of do-notation. When I first saw do { ... ; | ... } I thought it was | terrible but now, after some time, I see the merit of it. Consider this | code snippet from | StgCmmBind: | | \(_offset, node, arg_regs) -> do | { (...) | ; withSelfLoop (bndr, loop_header_id, arg_regs) $ do | { (...) | ; entryHeapCheck cl_info node' arity arg_regs $ do | { | (...) | }}} | | I believe that without { ... ; ... } these nested dos would have to be | indented, which wouldn't | aid readability IMO. I agree that we should use one style of the do- | notation but I'm not sure | which one should that be. I'd favor the { ... ; ... } one. | | ===== | | > So, overall, I kind of like our current policy of fixing tabs and | white-space when we have to | modify the code anyway. | I proposed to remove tabs from the source code 1,5 year ago: | | http://www.haskell.org/pipermail/ghc-devs/2013-January/000053.html | | Back then my conclusion was that most of the files that contain tabs were | modified in the previous | 6 months. This means that people are not following the policy to remove | tabs when they modify the | file (or at least they weren't following that policy back then). So this | policy seems to be only | theory. | | ====== | | One more style issue that was not mentioned by Austin are block comments: | I strongly advocate | removal of all block comments in favour of single-line ones. Why? Here's | a reason: | | {- | Note [Blah blah] | ~~~~~~~~~~~~ | ... Code snippet: | foo :: Foo -> Bar | foo = ... | -} | | My primary method of finding things in the source code is grepping and I | believe that many people | do the same. With block comments it is impossible to tell whether the | line found by grep is part | of the comment. With line comments it is immediately obvious. Moreover, | if all comments were line | comments it would be really easy to grep for something only in the | comments. Currently this is | impossible. This is also true for lhs files and for that reason I also | consider lhs files to be | practical issue. | Another reason (albeit minor) to have line comments instead of block | comments is that with the | former style each comment is independent and can be freely moved around. | With block comments we | might need to create extra comment blocks when moving comments around. | | ====== | | Now, I understand people who don't want such change because of merge | conflicts. But the truth is | there will never be a good moment to implement such changes because there | is always some ongoing | work and outstanding branches. So I believe we should think whether these | changes move us in a | good direction or not and if we decide that these changes are a Good | Thing - which I believe they | are - we should bite the bullet. Otherwise we will have mess in the | source code forever. | | Since I'm advocating strongly for these I am of course willing to put my | work into making this | happen. So far I've assigned #9230 to myself and if we agree to detab all | the source code I can | be the person to do it. | | Janek | | PS. I am proud of myself, because I wrote a mail as long as Austin's, | which I always thought was | impossible :-) | _______________________________________________ | ghc-devs mailing list | ghc-devs@haskell.org | http://www.haskell.org/mailman/listinfo/ghc-devs

On Thu, Jul 3, 2014 at 6:29 PM, Simon Peyton Jones
* Insisting on line comments exclusively, carries a cost. The on-screen distraction of line comments, and the nuisance of writing them, is not trivial. Perhaps it is bearable, but it's non-zero. See example below.
I think you can use {-| Note [Extra dependencies from .hs-boot files] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Consider the following case: module A where import B data A1 = A1 B1 module B where import {-# SOURCE #-} A type DisguisedA1 = A1 data B1 = B1 DisguisedA1 We do not follow type synonyms when building the dependencies for each datatype, so we will not find out that B1 really depends on A1 (which means it depends on itself). To handle this problem, at the moment we add dependencies to everything that comes from an .hs-boot file. But we don't add those dependencies to everything. Imagine module B above had another datatype declaration: data B2 = B2 Int Even though B2 has a dependency (on Int), all its dependencies are from things that live on other packages. Since we don't have mutual dependencies across packages, it is safe not to add the dependencies on the .hs-boot stuff to B2. See also Note [Grouping of type and class declarations] in TcTyClsDecls. -} instead (skip the | after {- if you don't want it to be a Haddock comment).

The on-screen distraction of line comments, and the nuisance of writing them, is not trivial.
Well, I don't consider one-line comments to be distracting at all. To each his own, I guess., though your opinion on this is more important than mine. But I don't agree that writing one-line comments is a nuisance. Actually, it's very simple in Emacs: M-j will start a new line of comment. Maintains indentation. M-q will reformat a single paragraph of a comment to match the default line length (known as "fill-collumn" in Emacs). Use C-x f to set or add this to your Emacs file: (setq-default fill-column 80) Janek

On 03/07/2014 09:29, Simon Peyton Jones wrote:
* A *primary* form of consumption is the source code itself. I've found that Haddock-compliant comments can be rather less readable in source code. (Eg. CoreSyn.lhs where the #blah# notation coexists uneasily with Note [blah].) So I'd be nervous of mandating Haddock-compliance.
I agree with this point, and it's why I always disliked literate source code: we only look at source code in a text editor, so using unwieldy markup compromises the most common use case. But I don't think there was a proposal to use Haddock markup everywhere (e.g. Notes), only in function documentation. Notes are for looking at in a text editor, so we can use whatever conventions we like. Cheers, Simon

Austin didn't mention this, so I will: we have a wiki page for style https://ghc.haskell.org/trac/ghc/wiki/Commentary/CodingStyle It has a pretty clear set of guidelines for imports/exports, for example (that we don't follow as much as we should). I'd be in favour of changing .lhs files to .hs files, replacing all the \begin{code}...\end{code} with -}...{-. As I said in my reply to Simon, literate source files aren't providing any real benefit to us, and in the name of consistency this would be a positive step. I'm all in favour of gardening the code base to clean up things like this. However, the best time for a big stylistic sweep is a time that minimizes the number of merges we have to do across these commits. That would be just before we branch for a new major release; hopefully at that point most of the feature branches will be merged and we're not going to merge any further patches into the previous release branch. I'm less enthusiastic about fixing whitespace things. It's a tough call, but I'm guessing that fixing it would cause more pain than not fixing it. Opinions might differ, and I wouldn't mind at all if the consensus were to do a whitespace sweep too. One other thing I'd like to propose is an 80-column limit on new code. Personally I've always used an 80-column limit for various reasons. This is the biggest bikeshed ever and we could talk all day about it, but here are a couple of concrete points that I think are uncontroversial: - there has to be *some* limit, so that we know how wide to make our windows. The only valid discussion is what the limit should be. - Phabricator's side-by-side diffs are hard to read on a laptop screen when lines go beyond 80 columns. And I think 80 is a good enough number, especially for Haskell where you can pack a lot into an 80-column line. Phabricator is already flagging up >80 column lines in its linter, which is its default setting. Cheers, Simon On 02/07/2014 12:59, Austin Seipp wrote:
Hi *,
First off, WARNING: BIKESHEDDING AHEAD.
With that out of the way - today on IRC, there was some discussion about some stylistic/consistency issues in GHC, and being spurred by Johans recent proposal for top-level documentation, I figured perhaps we should beat the drum on this issue as well.
The TL;DR is that GHC has a lot of inconsistent style issues, including things like:
- Mixing literate haskell with non-literate haskell files - Legacy code with tabs and spaces intermixed - Related to the last one, trailing whitespace - Mixing styles of do notation in different parts of the compiler (braces vs no braces) - Probably things like indentation mismatches even in the same code - Probably many other things I've missed, obvious or not.
These issues by themselves aren't too bad, but together they make the coding style for GHC very inconsistent, and this hurts maintainability a bit I feel. Furthermore, some of these issues block related improvements - for example, https://ghc.haskell.org/trac/ghc/ticket/9230 which is probably quite reasonable will likely be a bit annoying to implement until GHC itself is de-tabbed - we use -Werror during ./validate. This particular issue is what started the discussion.
Also, with developers now using arcanist and phabricator, they have linting enabled for new patches, but they will often warn about surrounding issues, mostly tabs and trailing spaces. This is a bit annoying for submitters, and would be fixed by enforcing it.
First attack plan ~~~~~~~~~~~~~~~
So, to start, I'd like to propose that we make some guidelines for these kinds of things, and also a plan to fix some of them. To start:
#1) We should really consider going ahead and detabbing the remaining files that have them. We already enforce this on new commits with git hooks, but by doing this, we can make -fwarn-tabs a default flag and then validate with -Werror in the development process.
#2) Similarly, we should kill all the trailing whitespace. (I think this is less controversial than #1)
#3) We should most certainly move the remaining files from literate haskell to non-literate haskell. Most of the files in the compiler are already in this form, and the literate haskell documentation can't be used to generate PDFs or anything similar. I suggest we get rid of it. More Haskell users use non-literate files anyway. This is probably the least controversial.
Merge issues ~~~~~~~~~~~~~~~~~
The reason we haven't done the above three things historically is that it makes merge conflicts nastier. A useful approximation suggested on IRC might be to detab and remove whitespace for files older than a certain date (say, 6 months).
However, in general I'm thinking perhaps it's best to go ahead and bite the bullet. maybe. I'd like to know what other people think! If we have a vote and most people are in favor of doing this, maybe we should really do it.
I'd especially like to hear about this if you have an outstanding branch.
Some numbers on these issues ~~~~~~~~~~~~~~~~~~~~~~~~
Here are some quick numbers on where most of the tabs reside, as well as the breakdown of literate files vs non-literate files.
NOTE: these tests occurred in the 'compiler' subdirectory of the GHC repository, which is where most of the relevant code is.
LITERATE vs NON-LITERATE:
$ find . -type f -iname '*.hs' | wc -l 206
$ find . -type f -iname '*.lhs' | wc -l 194
Non-literate wins by a slim margin! But having the compiler divided in half is really not a good thing IMO...
NUMBER OF TABS PER SUBDIRECTORY:
NOTE: this counts the number of lines which have tabs in them. It does not count the total number of tab occurrences.
$ for x in `echo */`; do echo -n "$x:\t\t"; find $x -type f -regex '.*\.\(lhs\|hs\)' | xargs grep -P '\t' | wc -l; done basicTypes/: 919 cbits/: 0 cmm/: 38 codeGen/: 0 coreSyn/: 843 deSugar/: 545 ghci/: 90 hsSyn/: 120 iface/: 213 llvmGen/: 0 main/: 8 nativeGen/: 1213 parser/: 19 prelude/: 182 profiling/: 39 rename/: 188 simplCore/: 754 simplStg/: 0 specialise/: 0 stgSyn/: 0 stranal/: 336 typecheck/: 1171 types/: 301 utils/: 220 vectorise/: 0
From these numbers, we can see a few useful things at least, primarily that there are definitely some places where removing tabs should be easy. For example, parser/, profiling/, main/, and cmm/ can all be de-tabbed without much of a problem, I think.
nativeGen is very often not touched, so even though it has a *huge* amount of tabs, it can likely be de-tabbed as well with minimal impact.
Other style issues ~~~~~~~~~~~~~~~~~
We should also discuss some related issues, like what general block-width to use for indentations, naming conventions, and other stuff. However, I leave this all to you, and perhaps it is best we split that part off into a separate thread. Some things I'd like you all to consider:
- Block width for indentation - Naming conventions (we use camelCase and_underscores_sometimes which isReally_confusing) - Import/export styles (I think we have some sloppiness here too) - Other things worth arguing forever about.
Thoughts on the above issues?

Simon Marlow
Austin didn't mention this, so I will: we have a wiki page for style
https://ghc.haskell.org/trac/ghc/wiki/Commentary/CodingStyle
It has a pretty clear set of guidelines for imports/exports, for example (that we don't follow as much as we should).
I'd be in favour of changing .lhs files to .hs files, replacing all the \begin{code}...\end{code} with -}...{-. As I said in my reply to Simon, literate source files aren't providing any real benefit to us, and in the name of consistency this would be a positive step. I’m all in favour of gardening the code base to clean up things like this.
Yes!
However, the best time for a big stylistic sweep is a time that minimizes the number of merges we have to do across these commits. That would be just before we branch for a new major release; hopefully at that point most of the feature branches will be merged and we're not going to merge any further patches into the previous release branch.
I'm less enthusiastic about fixing whitespace things. It's a tough call, but I'm guessing that fixing it would cause more pain than not fixing it. Opinions might differ, and I wouldn’t mind at all if the consensus were to do a whitespace sweep too.
I think, the tabs should go — it’s been dragging on for a long time now. (The rest is less important.)
One other thing I'd like to propose is an 80-column limit on new code. Personally I've always used an 80-column limit for various reasons. This is the biggest bikeshed ever and we could talk all day about it, but here are a couple of concrete points that I think are uncontroversial:
- there has to be *some* limit, so that we know how wide to make our windows. The only valid discussion is what the limit should be.
- Phabricator's side-by-side diffs are hard to read on a laptop screen when lines go beyond 80 columns.
And I think 80 is a good enough number, especially for Haskell where you can pack a lot into an 80-column line. Phabricator is already flagging up >80 column lines in its linter, which is its default setting.
I used to be a 80 column guy, but moved away from that the last years. But you are right, there must be an upper limit and, if >80 is a problem for code reviews, then it’s a reasonable choice. Cheers, Manuel
On 02/07/2014 12:59, Austin Seipp wrote:
Hi *,
First off, WARNING: BIKESHEDDING AHEAD.
With that out of the way - today on IRC, there was some discussion about some stylistic/consistency issues in GHC, and being spurred by Johans recent proposal for top-level documentation, I figured perhaps we should beat the drum on this issue as well.
The TL;DR is that GHC has a lot of inconsistent style issues, including things like:
- Mixing literate haskell with non-literate haskell files - Legacy code with tabs and spaces intermixed - Related to the last one, trailing whitespace - Mixing styles of do notation in different parts of the compiler (braces vs no braces) - Probably things like indentation mismatches even in the same code - Probably many other things I've missed, obvious or not.
These issues by themselves aren't too bad, but together they make the coding style for GHC very inconsistent, and this hurts maintainability a bit I feel. Furthermore, some of these issues block related improvements - for example, https://ghc.haskell.org/trac/ghc/ticket/9230 which is probably quite reasonable will likely be a bit annoying to implement until GHC itself is de-tabbed - we use -Werror during ./validate. This particular issue is what started the discussion.
Also, with developers now using arcanist and phabricator, they have linting enabled for new patches, but they will often warn about surrounding issues, mostly tabs and trailing spaces. This is a bit annoying for submitters, and would be fixed by enforcing it.
First attack plan ~~~~~~~~~~~~~~~
So, to start, I'd like to propose that we make some guidelines for these kinds of things, and also a plan to fix some of them. To start:
#1) We should really consider going ahead and detabbing the remaining files that have them. We already enforce this on new commits with git hooks, but by doing this, we can make -fwarn-tabs a default flag and then validate with -Werror in the development process.
#2) Similarly, we should kill all the trailing whitespace. (I think this is less controversial than #1)
#3) We should most certainly move the remaining files from literate haskell to non-literate haskell. Most of the files in the compiler are already in this form, and the literate haskell documentation can't be used to generate PDFs or anything similar. I suggest we get rid of it. More Haskell users use non-literate files anyway. This is probably the least controversial.
Merge issues ~~~~~~~~~~~~~~~~~
The reason we haven't done the above three things historically is that it makes merge conflicts nastier. A useful approximation suggested on IRC might be to detab and remove whitespace for files older than a certain date (say, 6 months).
However, in general I'm thinking perhaps it's best to go ahead and bite the bullet. maybe. I'd like to know what other people think! If we have a vote and most people are in favor of doing this, maybe we should really do it.
I'd especially like to hear about this if you have an outstanding branch.
Some numbers on these issues ~~~~~~~~~~~~~~~~~~~~~~~~
Here are some quick numbers on where most of the tabs reside, as well as the breakdown of literate files vs non-literate files.
NOTE: these tests occurred in the 'compiler' subdirectory of the GHC repository, which is where most of the relevant code is.
LITERATE vs NON-LITERATE:
$ find . -type f -iname '*.hs' | wc -l 206
$ find . -type f -iname '*.lhs' | wc -l 194
Non-literate wins by a slim margin! But having the compiler divided in half is really not a good thing IMO...
NUMBER OF TABS PER SUBDIRECTORY:
NOTE: this counts the number of lines which have tabs in them. It does not count the total number of tab occurrences.
$ for x in `echo */`; do echo -n "$x:\t\t"; find $x -type f -regex '.*\.\(lhs\|hs\)' | xargs grep -P '\t' | wc -l; done basicTypes/: 919 cbits/: 0 cmm/: 38 codeGen/: 0 coreSyn/: 843 deSugar/: 545 ghci/: 90 hsSyn/: 120 iface/: 213 llvmGen/: 0 main/: 8 nativeGen/: 1213 parser/: 19 prelude/: 182 profiling/: 39 rename/: 188 simplCore/: 754 simplStg/: 0 specialise/: 0 stgSyn/: 0 stranal/: 336 typecheck/: 1171 types/: 301 utils/: 220 vectorise/: 0
From these numbers, we can see a few useful things at least, primarily that there are definitely some places where removing tabs should be easy. For example, parser/, profiling/, main/, and cmm/ can all be de-tabbed without much of a problem, I think.
nativeGen is very often not touched, so even though it has a *huge* amount of tabs, it can likely be de-tabbed as well with minimal impact.
Other style issues ~~~~~~~~~~~~~~~~~
We should also discuss some related issues, like what general block-width to use for indentations, naming conventions, and other stuff. However, I leave this all to you, and perhaps it is best we split that part off into a separate thread. Some things I'd like you all to consider:
- Block width for indentation - Naming conventions (we use camelCase and_underscores_sometimes which isReally_confusing) - Import/export styles (I think we have some sloppiness here too) - Other things worth arguing forever about.
Thoughts on the above issues?
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs

| I used to be a 80 column guy, but moved away from that the last years. | But you are right, there must be an upper limit and, if >80 is a | problem for code reviews, then it's a reasonable choice. As laptop screens have successively more horizontal pixels and fewer vertical pixels, longer lines use screen real estate better. 80 columns now seems a bit narrow to me. 100 would be better. But I'm not going to die for this Simon

On Thu, Jul 17, 2014 at 8:40 AM, Simon Peyton Jones
| I used to be a 80 column guy, but moved away from that the last years. | But you are right, there must be an upper limit and, if >80 is a | problem for code reviews, then it's a reasonable choice.
As laptop screens have successively more horizontal pixels and fewer vertical pixels, longer lines use screen real estate better. 80 columns now seems a bit narrow to me. 100 would be better.
But I'm not going to die for this
Here we go! * Wider screens let you have several Emacs buffers next to each other. At 80 chars you can have about 2 buffers next to each other on a 13" screen. * The average line length is about 30-35 characters in Python. If it's anything similar in Haskell shorter line length are more efficient, looking how much of the lines times columns space is filled with characters. * The eye has trouble traveling back to the next line if lines get too long (at least when reading prose). Research says around 60-70 characters is optimal, if I recall correctly.

Johan Tibell
On Thu, Jul 17, 2014 at 8:40 AM, Simon Peyton Jones
wrote: | I used to be a 80 column guy, but moved away from that the last years. | But you are right, there must be an upper limit and, if >80 is a | problem for code reviews, then it's a reasonable choice.
As laptop screens have successively more horizontal pixels and fewer vertical pixels, longer lines use screen real estate better. 80 columns now seems a bit narrow to me. 100 would be better.
But I'm not going to die for this
Here we go!
* Wider screens let you have several Emacs buffers next to each other. At 80 chars you can have about 2 buffers next to each other on a 13" screen.
I think that was SimonM's premise for code reviews, that you want lines short enough to have two versions besides each other.
* The average line length is about 30-35 characters in Python. If it's anything similar in Haskell shorter line length are more efficient, looking how much of the lines times columns space is filled with characters.
The problem is that indentation and long identifiers push you towards longer lines.
* The eye has trouble traveling back to the next line if lines get too long (at least when reading prose). Research says around 60-70 characters is optimal, if I recall correctly.
I think we read code differently to prose (and prose is not much indented), so I don't think these numbers transfer. Manuel

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 On 17/07/14 08:57, Johan Tibell wrote:
* Wider screens let you have several Emacs buffers next to each other. At 80 chars you can have about 2 buffers next to each other on a 13" screen. This is my main grief with 100 char lines (which is the Android standard, by the way). I like to have 6 or 8 files open side by side (including diffs and other meta-code).
* The eye has trouble traveling back to the next line if lines get too long (at least when reading prose). Research says around 60-70 characters is optimal, if I recall correctly. 66 as far as I remember, but that number is for prose and thus not *very* relevant, as Manuel points out. But I do think it's a problem in code too, regardless of the exact number. Being at 80+ is also often an indication that you're in dire need of refactoring.
So while we're all chiming in, my preferences in order: 78 72 80 less than 72 more than 80 - -- Alexander alexander@plaimi.net https://secure.plaimi.net/~alexander -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iF4EAREIAAYFAlPHkAIACgkQRtClrXBQc7VVpwD+K2Q8NYplnmJdNYTulHx4hQBP GVeBJjYQifYrr6MoQq8BAJNP3IUyq+pg+VsGqJg4tCkrv6nmfM1teExzE2avz0/u =76Tq -----END PGP SIGNATURE-----
participants (12)
-
Alexander Berntsen
-
Andrew Farmer
-
Austin Seipp
-
Iavor Diatchki
-
Jan Stolarek
-
Joachim Breitner
-
Johan Tibell
-
Manuel M T Chakravarty
-
Richard Eisenberg
-
Sergei Trofimovich
-
Simon Marlow
-
Simon Peyton Jones