Preliminary GHC extensions stats

Dear Committee,

in hopeful anticipation that we can start the GHC2021 process soon, and also given that the Haskell Survey closes today (so we get that data also soon, I guess), I ran a simple analysis against Hackage today. I’ll share the data in a proper way when it's time; this is just a preview of what I have in mind.

## Methodology

The methodology is:

* For each package on Hackage, get the latest version.
* For each such package, try to read out the extensions used in the cabal file and in the modules. (This uses https://github.com/kowainik/extensions)
* Aggregate these numbers.

Out of 12729 packages fetched (4GB), 12025 could be parsed by the extensions library (if anyone cares about the missing ones, please submit fixes to that library).

Only 1116 specify extensions in the cabal file using default-extensions; the rest use per-file extensions exclusively. This is a relatively low number, so barring bugs in the analysis code, this means that most developers prefer per-module extensions (I am one of those).

I extracted three metrics, all represented as percentages:

* Proliferation: #(packages using that extension) / #(packages parsed)

  (I’ll reserve the word “Popularity” for the data from the poll.)

* Innocuousness: #(packages enabling it in .cabal) / #(packages using default-extensions)

  A high number here indicates that many developers want this on by default in their project, across all modules.

* Aloofness: #(packages using this, but _only_ in modules, despite using default-extensions) / #(packages using default-extensions)

  A high number here indicates that, although a developer was in principle happy with putting extensions into the default-extensions field, they did not include this particular one. I take this as an indication that this extension does _not_ make a great on-by-default extension. (Matching my expectation, TemplateHaskell and CPP score highest here.)

Yes, I went overboard with finding fun names to describe the metrics. Happy to take suggestions for more suitable names, maybe from Richard, our master of sophisticated English vocabulary.

## Source

I am using this code:

https://github.com/nomeata/ghc-proposals-stats/tree/master/ext-stats

It is proof that I am very much capable of writing horrible Haskell code that looks like bad Python code.

If someone (including anyone outside the committee) feels like contributing better code, talk to me! Besides just prettier code, the tool could understand and be smart about implications between extensions, be clever about extensions that are part of the default-language already, and maybe provide nicer reporting.

## Results

Eventually I’ll provide a CSV file with all relevant data (Hackage stats and poll results), and maybe even upload it to some suitable web service that allows you to explore such a table easily (suggestions?).

For now, a preview of the complete output, sorted by Proliferation:

Proliferation
   | Innocuousness
   |    | Aloofness
   |    |    |
  0%   0%   0% NoMonoLocalBinds
  0%   0%   0% NoRebindableSyntax
  0%   0%   0% NoMagicHash
  0%   1%   0% RelaxedPolyRec
  0%   3%   0% DoAndIfThenElse
  0%   4%   0% ForeignFunctionInterface
  0%  12%   0% PatternGuards
  0%  14%   0% EmptyDataDecls
  0%   0%   0% ParallelArrays
  0%   0%   0% NoForeignFunctionInterface
  0%   0%   0% NoDatatypeContexts
  0%   0%   0% TransformListComp
  0%   0%   0% InterruptibleFFI
  0%   1%   0% MonadFailDesugaring
  0%   0%   0% NullaryTypeClasses
  0%   0%   0% UnboxedSums
  0%   0%   0% CApiFFI
  0%   0%   0% StaticPointers
  0%   0%   0% DerivingStrategies
  0%   1%   0% NamedWildCards
  0%   0%   0% GHCForeignImportPrim
  0%   0%   0% TemplateHaskellQuotes
  0%   0%   0% GADTSyntax
  0%   0%   0% JavaScriptFFI
  0%   0%   0% PostfixOperators
  0%   0%   0% DeriveLift
  0%   0%   0% NondecreasingIndentation
  0%   0%   0% Strict
  0%   1%   0% NumDecimals
  0%   0%   0% AutoDeriveTypeable
  0%   1%   1% StrictData
  0%   0%   0% ConstrainedClassMethods
  0%   1%   0% DisambiguateRecordFields
  0%   2%   0% NegativeLiterals
  0%   2%   0% MonadComprehensions
  0%   0%   0% UnliftedFFITypes
  0%   2%   0% OverloadedLabels
  0%   2%   0% BinaryLiterals
  0%   2%   0% ApplicativeDo
  0%   1%   0% MonoLocalBinds
  0%   0%   1% ExplicitNamespaces
  0%   1%   1% TypeFamilyDependencies
  0%   0%   1% UndecidableSuperClasses
  0%   0%   1% RoleAnnotations
  1%   0%   1% ExplicitForAll
  1%   0%   1% ExtendedDefaultRules
  1%   2%   0% RebindableSyntax
  1%   3%   0% EmptyCase
  1%   3%   0% PartialTypeSignatures
  1%   3%   1% DuplicateRecordFields
  1%   2%   1% TypeInType
  1%   2%   1% ImpredicativeTypes
  1%   2%   0% RecursiveDo
  1%   0%   2% ImplicitParams
  1%   0%   2% OverloadedLists
  1%   0%   1% IncoherentInstances
  1%  11%   0% LiberalTypeSynonyms
  1%   2%   2% AllowAmbiguousTypes
  1%   1%   3% DeriveAnyClass
  1%  10%   0% ParallelListComp
  2%   2%   1% PackageImports
  2%   6%   1% InstanceSigs
  2%  11%   0% Arrows
  2%   6%   0% UnicodeSyntax
  2%   4%   2% PatternSynonyms
  2%   6%   3% TypeApplications
  2%   0%   2% OverlappingInstances
  3%   9%   1% UnboxedTuples
  3%  15%   1% MultiWayIf
  3%   6%   3% NamedFieldPuns
  4%  16%   2% DeriveFoldable
  4%   7%   3% PolyKinds
  4%  16%   2% DeriveTraversable
  4%  10%   3% MagicHash
  4%  13%   3% NoMonomorphismRestriction
  4%  18%   4% DefaultSignatures
  6%   8%   5% ViewPatterns
  6%  10%   4% KindSignatures
  6%  15%   6% QuasiQuotes
  6%   3%   6% ExistentialQuantification
  7%  30%   2% NoImplicitPrelude
  7%  25%   6% ConstraintKinds
  8%  23%   6% DeriveFunctor
  8%  22%   6% FunctionalDependencies
  9%  24%   6% StandaloneDeriving
  9%  24%   6% TupleSections
 10%  25%   6% DataKinds
 10%   6%   8% TypeSynonymInstances
 11%  28%   4% LambdaCase
 11%  24%   7% GADTs
 12%  27%   5% TypeOperators
 12%   6%  15% UndecidableInstances
 12%  20%   7% BangPatterns
 14%  27%  10% DeriveGeneric
 15%  27%   8% RecordWildCards
 17%  21%  16% TemplateHaskell
 17%  29%  14% GeneralizedNewtypeDeriving
 19%  29%  12% RankNTypes
 20%  26%   8% DeriveDataTypeable
 20%  31%  11% TypeFamilies
 21%  34%  11% MultiParamTypeClasses
 22%  11%  18% CPP
 25%  37%  13% ScopedTypeVariables
 26%  42%  14% FlexibleContexts
 31%  43%  16% FlexibleInstances
 34%  53%  10% OverloadedStrings

Enjoy!
Joachim

--
Joachim Breitner
  mail@joachim-breitner.de
  http://www.joachim-breitner.de/
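As an illustration of the three metrics defined in the message above, here is a minimal sketch in Python. It is not the code from the linked ext-stats repository: the `Package` record, the `metrics` and `percent` functions, and the toy data are all hypothetical, and the sketch assumes that a package counts as "using default-extensions" exactly when that field is non-empty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """Extensions found in one successfully parsed package (hypothetical record)."""
    cabal_exts: frozenset   # listed under default-extensions in the .cabal file
    module_exts: frozenset  # enabled via LANGUAGE pragmas in individual modules


def percent(count, total):
    """Share of `count` in `total` as a percentage; 0.0 when `total` is zero."""
    return 100.0 * count / total if total else 0.0


def metrics(packages, ext):
    """Return (proliferation, innocuousness, aloofness) for `ext`, in percent."""
    # Assumption: a package "uses default-extensions" iff that field is non-empty.
    with_defaults = [p for p in packages if p.cabal_exts]
    using = sum(1 for p in packages if ext in p.cabal_exts or ext in p.module_exts)
    in_cabal = sum(1 for p in with_defaults if ext in p.cabal_exts)
    only_in_modules = sum(
        1 for p in with_defaults
        if ext in p.module_exts and ext not in p.cabal_exts
    )
    return (
        percent(using, len(packages)),
        percent(in_cabal, len(with_defaults)),
        percent(only_in_modules, len(with_defaults)),
    )


# Toy data, invented for illustration only.
packages = [
    Package(frozenset({"OverloadedStrings", "LambdaCase"}), frozenset({"CPP"})),
    Package(frozenset({"OverloadedStrings"}), frozenset({"TemplateHaskell", "CPP"})),
    Package(frozenset(), frozenset({"CPP", "OverloadedStrings"})),
    Package(frozenset(), frozenset()),
]

for ext in ["OverloadedStrings", "CPP", "TemplateHaskell"]:
    print(ext, metrics(packages, ext))
```

In the toy data, CPP appears in three of the four packages but never in default-extensions, so it gets high Proliferation and Aloofness and zero Innocuousness. Note that the last two metrics share a much smaller denominator than the first: with the counts given in the message, only 1116 of 12025 parsed packages (roughly 9%) use default-extensions at all.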

Hi again,

On Sunday, 15.11.2020, at 18:19 +0100, Joachim Breitner wrote:
## Results
I just noticed that I was using a Hackage copy from 2018. All the prose still applies; the actual stats are a bit outdated… anyways, this way there is more suspense until I can provide the real data :-)

Cheers,
Joachim

On Nov 15, 2020, at 12:19 PM, Joachim Breitner wrote:
Happy to take suggestions for more suitable names, maybe from Richard, our master of sophisticated English vocabulary.

Hmph. And I always consider Simon PJ to be a more erudite logophile than I. But, really, I think Joachim is just parading the fact that he unseated my near-miss at getting "bespoke" to be a keyword in Haskell (instead of the "stock" deriving strategy).

Thanks for doing this analysis!

Richard

Hi,

On Monday, 16.11.2020, at 14:05 +0000, Richard Eisenberg wrote:
Hmph. And I always consider Simon PJ to be a more erudite logophile than I. But, really, I think Joachim is just parading the fact that he unseated my near-miss at getting "bespoke" to be a keyword in Haskell (instead of the "stock" deriving strategy).
I don’t deny Simon’s logophilia, but whoever calls other logophiles “erudite” is definitely humblebragging. And clearly, while I am very fond of the style of logophilia that brought us gems like “zonking”, the “WizWoz machine” and many others, the criteria names in my original message were a genuine attempt to imitate Richard’s very own style – and as such, be assured that it was most sincere flattery, and not at all parading of past quarrels :-)

Ok, I might also have been influenced by having just finished reading a book by Alan Bradley…

Cheers,
Joachim

Thanks for the analysis!

We have an extremely long tail of extensions with 0% Proliferation. This is somewhat tangential to the GHC20XX process, but I'm curious how many of these extensions actually have no uses as opposed to being rounded to zero. There might be some opportunities to remove unused extensions that just add to GHC's maintenance burden.

Hi,

I believe the way this script works, extensions with literally no use would not even show up. So all these are used somewhere.

Cheers,
Joachim

On Monday, 16.11.2020, at 09:33 -0500, Eric Seidel wrote:
Thanks for the analysis!
We have an extremely long tail of extensions with 0% Proliferation. This is somewhat tangential to the GHC20XX process, but I'm curious how many of these extensions actually have no uses as opposed to being rounded to zero. There might be some opportunities to remove unused extensions that just add to GHC's maintenance burden.

Hi,

On Sunday, 15.11.2020, at 18:19 +0100, Joachim Breitner wrote:
in hopefully anticipation that we can start the GHC2021 process soon, and also given that the Haskell Survey closes today (so we get that data also soon, I guess)
for those of us who are as eager as I am to use the momentum here: I expect the survey data on the weekend. Then I’ll get the data into a presentable form, and start the process here.

Cheers,
Joachim
participants (3)
- Eric Seidel
- Joachim Breitner
- Richard Eisenberg