Preliminary GHC extensions stats

15 Nov 2020

      Dear Committee,

in hopeful anticipation that we can start the GHC2021 process soon,
and given that the Haskell Survey closes today (so we should get that
data soon as well, I guess), I ran a simple analysis against Hackage
today. I’ll share the data in a proper way when it’s time; this is just
a preview of what I have in mind.

## Methodology

The methodology is:

 * For each package on Hackage, get the latest version
 * For each such package, try to read out the extensions used
   in the cabal file and in the modules
   (This uses https://github.com/kowainik/extensions)
 * Aggregate these numbers.
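
As a rough illustration of the "read out the extensions" step, here is a minimal sketch. The actual analysis relies on the kowainik/extensions library; the regex-based helpers below (`module_extensions`, `cabal_default_extensions`) are hypothetical stand-ins that ignore CPP, conditionals, common stanzas and other subtleties a real parser has to handle.

```python
import re

# Matches {-# LANGUAGE Foo, Bar #-} pragmas; the keyword is case-insensitive
# and the pragma body may span several lines.
PRAGMA_RE = re.compile(r"\{-#\s*LANGUAGE\s+(.*?)#-\}", re.IGNORECASE | re.DOTALL)


def module_extensions(source):
    """Collect extension names from LANGUAGE pragmas in one module's source."""
    exts = set()
    for body in PRAGMA_RE.findall(source):
        exts.update(name.strip() for name in body.split(",") if name.strip())
    return exts


def cabal_default_extensions(cabal_text):
    """Collect names listed under any default-extensions field (approximation)."""
    exts = set()
    in_field = False
    field_indent = 0
    for line in cabal_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("--"):
            continue  # skip blank lines and cabal comments
        indent = len(line) - len(line.lstrip())
        if stripped.lower().startswith("default-extensions:"):
            in_field, field_indent = True, indent
            rest = stripped.split(":", 1)[1]
        elif in_field and indent > field_indent:
            rest = stripped  # continuation line of the field
        else:
            in_field = False
            continue
        exts.update(re.findall(r"[A-Za-z][A-Za-z0-9]*", rest))
    return exts
```

Aggregation then amounts to taking, per package, the union of `module_extensions` over all modules plus the `cabal_default_extensions` of its cabal file.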

Out of 12729 packages fetched (4GB), 12025 could be parsed by the
extensions library (if anyone cares about the missing ones, please
submit fixes to that library).

Only 1116 of those (about 9%) specify extensions in the cabal file
using default-extensions; the rest use per-file extensions exclusively.
This is a relatively low number, so barring bugs in the analysis code,
it means that most developers prefer per-module extensions (I am one of
those).

I extracted three metrics, all represented as percentages:

 * Proliferation
   #(packages using that extension)/#(packages parsed)

   (I’ll reserve the word “Popularity” for the data from the poll.)

 * Innocuousness
   #(packages enabling it in .cabal)/#(packages using default-extensions)

   A high number here indicates that many developers want this on by
   default in their project, across all modules.

 * Aloofness
   #(packages using this, but _only_ in modules, despite using default-extensions)
      /#(packages using default-extensions)

   A high number here indicates that, although a developer was in
   principle happy with putting extensions into the default-extensions
   field, they did not include this particular one. I take this as an
   indication that this extension does _not_ make a great on-by-default
   extension. (Matching my expectation, TemplateHaskell and CPP score
   among the highest here.)
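
To make the three definitions concrete, here is a minimal sketch that computes them for one extension. The `Package` record and the `metrics` function are hypothetical names, not taken from the analysis code, and the sketch assumes that "using default-extensions" means the cabal file lists at least one extension there.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    name: str
    cabal: set = field(default_factory=set)    # extensions in default-extensions
    modules: set = field(default_factory=set)  # extensions enabled in any module


def metrics(packages, ext):
    """Return the three metrics for `ext` as percentages."""
    # Assumption: a package "uses default-extensions" if that set is non-empty.
    with_defaults = [p for p in packages if p.cabal]

    using = sum(1 for p in packages if ext in p.cabal or ext in p.modules)
    in_cabal = sum(1 for p in with_defaults if ext in p.cabal)
    only_modules = sum(
        1 for p in with_defaults if ext in p.modules and ext not in p.cabal
    )

    def pct(n, d):
        return 100.0 * n / d if d else 0.0

    return {
        "proliferation": pct(using, len(packages)),
        "innocuousness": pct(in_cabal, len(with_defaults)),
        "aloofness": pct(only_modules, len(with_defaults)),
    }


# Toy data, made up for illustration only.
pkgs = [
    Package("a", cabal={"OverloadedStrings"}, modules={"CPP"}),
    Package("b", cabal={"LambdaCase"}, modules={"OverloadedStrings"}),
    Package("c", modules={"CPP"}),
    Package("d"),
]
print(metrics(pkgs, "CPP"))
```

In this toy data, CPP is never listed in default-extensions, yet package `a` (which does use default-extensions) enables it per module, so CPP gets an Innocuousness of zero and a non-zero Aloofness.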

Yes, I went overboard with finding fun names to describe the metrics.
Happy to take suggestions for more suitable names, maybe from Richard,
our master of sophisticated English vocabulary.

## Source

I am using this code:

https://github.com/nomeata/ghc-proposals-stats/tree/master/ext-stats

It is proof that I am very much capable of writing horrible Haskell
code that looks like bad Python code.

If someone (including anyone outside the committee) feels like
contributing better code, talk to me!
Besides just prettier code, the tool could understand and be smart
about implications between extensions, be clever about extensions that
are part of the default-language already, and maybe provide nicer
reporting.
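
One possible shape for the "implications between extensions" idea is to close each package's extension set under a table of implications before counting. The sketch below is only an illustration: `IMPLIES` is a small, deliberately partial table written from the GHC User's Guide as I understand it and should be checked against the GHC version of interest, and `close_under_implications` is a hypothetical helper name.

```python
# Partial, illustrative table: extension -> extensions it switches on.
# Assumption: entries should be verified against the GHC User's Guide.
IMPLIES = {
    "GADTs": {"GADTSyntax", "MonoLocalBinds"},
    "TypeFamilies": {"ExplicitNamespaces", "KindSignatures", "MonoLocalBinds"},
    "TypeFamilyDependencies": {"TypeFamilies"},
    "RankNTypes": {"ExplicitForAll"},
    "ScopedTypeVariables": {"ExplicitForAll"},
    "FunctionalDependencies": {"MultiParamTypeClasses"},
    "MultiParamTypeClasses": {"ConstrainedClassMethods"},
    "DeriveTraversable": {"DeriveFunctor", "DeriveFoldable"},
}


def close_under_implications(exts, implies=IMPLIES):
    """Return `exts` plus everything transitively implied by its members."""
    closed = set(exts)
    todo = list(closed)
    while todo:
        ext = todo.pop()
        for implied in implies.get(ext, ()):
            if implied not in closed:
                closed.add(implied)
                todo.append(implied)
    return closed


print(sorted(close_under_implications({"FunctionalDependencies"})))
```

Counting on the closed sets would credit, say, ExplicitForAll to every package that enables RankNTypes, which may or may not be the desired reading of "using" an extension.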

## Results

Eventually I’ll provide a CSV file with all relevant data (Hackage
stats and poll results), and maybe even upload it to some suitable web
service that allows you to explore such a table easily (suggestions?).

For now, a preview of the complete output, sorted by Proliferation:

  Proliferation
  |   Innocuousness
  |   |   Aloofness
  |   |   |
 0%  0%  0% NoMonoLocalBinds
 0%  0%  0% NoRebindableSyntax
 0%  0%  0% NoMagicHash
 0%  1%  0% RelaxedPolyRec
 0%  3%  0% DoAndIfThenElse
 0%  4%  0% ForeignFunctionInterface
 0% 12%  0% PatternGuards
 0% 14%  0% EmptyDataDecls
 0%  0%  0% ParallelArrays
 0%  0%  0% NoForeignFunctionInterface
 0%  0%  0% NoDatatypeContexts
 0%  0%  0% TransformListComp
 0%  0%  0% InterruptibleFFI
 0%  1%  0% MonadFailDesugaring
 0%  0%  0% NullaryTypeClasses
 0%  0%  0% UnboxedSums
 0%  0%  0% CApiFFI
 0%  0%  0% StaticPointers
 0%  0%  0% DerivingStrategies
 0%  1%  0% NamedWildCards
 0%  0%  0% GHCForeignImportPrim
 0%  0%  0% TemplateHaskellQuotes
 0%  0%  0% GADTSyntax
 0%  0%  0% JavaScriptFFI
 0%  0%  0% PostfixOperators
 0%  0%  0% DeriveLift
 0%  0%  0% NondecreasingIndentation
 0%  0%  0% Strict
 0%  1%  0% NumDecimals
 0%  0%  0% AutoDeriveTypeable
 0%  1%  1% StrictData
 0%  0%  0% ConstrainedClassMethods
 0%  1%  0% DisambiguateRecordFields
 0%  2%  0% NegativeLiterals
 0%  2%  0% MonadComprehensions
 0%  0%  0% UnliftedFFITypes
 0%  2%  0% OverloadedLabels
 0%  2%  0% BinaryLiterals
 0%  2%  0% ApplicativeDo
 0%  1%  0% MonoLocalBinds
 0%  0%  1% ExplicitNamespaces
 0%  1%  1% TypeFamilyDependencies
 0%  0%  1% UndecidableSuperClasses
 0%  0%  1% RoleAnnotations
 1%  0%  1% ExplicitForAll
 1%  0%  1% ExtendedDefaultRules
 1%  2%  0% RebindableSyntax
 1%  3%  0% EmptyCase
 1%  3%  0% PartialTypeSignatures
 1%  3%  1% DuplicateRecordFields
 1%  2%  1% TypeInType
 1%  2%  1% ImpredicativeTypes
 1%  2%  0% RecursiveDo
 1%  0%  2% ImplicitParams
 1%  0%  2% OverloadedLists
 1%  0%  1% IncoherentInstances
 1% 11%  0% LiberalTypeSynonyms
 1%  2%  2% AllowAmbiguousTypes
 1%  1%  3% DeriveAnyClass
 1% 10%  0% ParallelListComp
 2%  2%  1% PackageImports
 2%  6%  1% InstanceSigs
 2% 11%  0% Arrows
 2%  6%  0% UnicodeSyntax
 2%  4%  2% PatternSynonyms
 2%  6%  3% TypeApplications
 2%  0%  2% OverlappingInstances
 3%  9%  1% UnboxedTuples
 3% 15%  1% MultiWayIf
 3%  6%  3% NamedFieldPuns
 4% 16%  2% DeriveFoldable
 4%  7%  3% PolyKinds
 4% 16%  2% DeriveTraversable
 4% 10%  3% MagicHash
 4% 13%  3% NoMonomorphismRestriction
 4% 18%  4% DefaultSignatures
 6%  8%  5% ViewPatterns
 6% 10%  4% KindSignatures
 6% 15%  6% QuasiQuotes
 6%  3%  6% ExistentialQuantification
 7% 30%  2% NoImplicitPrelude
 7% 25%  6% ConstraintKinds
 8% 23%  6% DeriveFunctor
 8% 22%  6% FunctionalDependencies
 9% 24%  6% StandaloneDeriving
 9% 24%  6% TupleSections
10% 25%  6% DataKinds
10%  6%  8% TypeSynonymInstances
11% 28%  4% LambdaCase
11% 24%  7% GADTs
12% 27%  5% TypeOperators
12%  6% 15% UndecidableInstances
12% 20%  7% BangPatterns
14% 27% 10% DeriveGeneric
15% 27%  8% RecordWildCards
17% 21% 16% TemplateHaskell
17% 29% 14% GeneralizedNewtypeDeriving
19% 29% 12% RankNTypes
20% 26%  8% DeriveDataTypeable
20% 31% 11% TypeFamilies
21% 34% 11% MultiParamTypeClasses
22% 11% 18% CPP
25% 37% 13% ScopedTypeVariables
26% 42% 14% FlexibleContexts
31% 43% 16% FlexibleInstances
34% 53% 10% OverloadedStrings

Enjoy!
Joachim

-- 
Joachim Breitner
  mail@joachim-breitner.de
  http://www.joachim-breitner.de/
