GSOC Idea: Bytecode serialization and/or Fat Interface files

12 Mar 2021

Hi all,

This follows up on the recent discussion on the list concerning fat
interface files: https://mail.haskell.org/pipermail/ghc-devs/2020-October/019324.html

Now that we have been accepted as a GSOC organisation, I think
this would be a good project idea for a sufficiently motivated and
advanced student. This is a call for mentors (and students as
well!) who would be interested in this project.

The problem is the following:

Haskell Language Server (and ghci with `-fno-code`) have very
fast startup times for codebases which don't make use of Template
Haskell, and thus don't require any code-gen to typecheck. This
is because they can simply read the cached iface files generated by a
previous compile and don't need to re-invoke the typechecker.

However, as soon as TH is involved, we are forced to retypecheck and
compile files, since it is not possible to restart the code-gen process
starting with only an iface file. I can think of two ways to address this
problem:

1. Allow bytecode to be serialized

2. Serialize desugared Core into iface files (fat interfaces), so that
(byte)code-gen can be restarted from this point and doesn't need to
re-run the compiler frontend.

(1) might be challenging, but offers a few more advantages over (2),
in that we can reduce the work done to load TH-heavy codebases to just
a load of the cached bytecode objects from disk, and could make the
load process (and times) for these codebases directly comparable to
their TH-free cousins.

It would also make ghci startup a lot faster with a warm cache of
bytecode objects, bringing ghci startup times in line with those of
`-fno-code`.

However, (2) might be much easier to achieve and offers many
of the same advantages, in that we would not need to re-run
the compiler frontend or core-to-core optimisation phases.
There is also already a (slightly bitrotted) implementation
of (2) thanks to the work of Edward Yang.
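
To make the idea behind (2) a bit more concrete, here is a toy sketch of
round-tripping a Core-like expression tree through a binary encoding. It
is only an illustration under stated assumptions: `Expr` and `FatIface`
are hypothetical stand-ins invented for this example, not GHC's actual
Core or iface types, and it uses the `binary` package rather than GHC's
own interface-file serialization machinery.

```haskell
{-# LANGUAGE DeriveGeneric #-}
module Main where

import Data.Binary (Binary, decode, encode)
import GHC.Generics (Generic)

-- Hypothetical, heavily simplified stand-in for desugared Core.
data Expr
  = Lit Int
  | Var String
  | App Expr Expr
  | Lam String Expr
  deriving (Show, Eq, Generic)

-- Serialization is derived generically by the `binary` package.
instance Binary Expr

-- Hypothetical "fat interface": the usual exported names plus the
-- bindings themselves, so a later stage could restart from them.
data FatIface = FatIface
  { ifaceModule   :: String
  , ifaceExports  :: [String]
  , ifaceBindings :: [(String, Expr)]
  } deriving (Show, Eq, Generic)

instance Binary FatIface

main :: IO ()
main = do
  let iface    = FatIface "Demo" ["identity"] [("identity", Lam "x" (Var "x"))]
      bytes    = encode iface               -- what would be written to disk
      restored = decode bytes :: FatIface   -- what a later session would read back
  -- Check that the bindings survive the round trip unchanged.
  print (restored == iface)
```

In the real project the restored bindings would be fed to the
(byte)code generator instead of being compared for equality.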

If any of this sounds exciting to you as a student or a mentor, please
get in touch.

In particular, I think (2) is a feasible project that can be completed
with minimal mentoring effort. However, I'm only vaguely familiar with
the details of the bytecode generator, so if (1) is a direction we want
to pursue, we would need a mentor familiar with the details of this part
of GHC.

Cheers,
Zubin

Zubin Duggal