Why do TcLclEnv and DsGblEnv need to store the same IORef for errors?

30 Mar 2021

Hello folks,

as some of you might know, Richard and I are reworking how GHC constructs,
emits and deals with errors and warnings (see
https://gitlab.haskell.org/ghc/ghc/-/wikis/Errors-as-(structured)-values
and #18516).

To summarise the spirit very briefly, we will (eventually) have proper
domain-specific types instead of SDocs. The idea is to have very precise
and "focused" types for the different phases of the compilation pipeline,
and a "catch-all" monomorphic `GhcMessage` type used for the final
pretty-printing and exception-throwing:

data GhcMessage where
  GhcPsMessage      :: PsMessage -> GhcMessage
  GhcTcRnMessage    :: TcRnMessage -> GhcMessage
  GhcDsMessage      :: DsMessage -> GhcMessage
  GhcDriverMessage  :: DriverMessage -> GhcMessage
  GhcUnknownMessage :: forall a. (Diagnostic a, Typeable a) => a -> GhcMessage
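
To make the shape of this type concrete, here is a minimal standalone sketch.
It is not GHC's actual code: the four phase-specific types are hypothetical
string wrappers, and the one-method `Diagnostic` class and the
`renderGhcMessage` helper are assumptions made up for illustration.

```haskell
{-# LANGUAGE GADTs #-}

import Data.Typeable (Typeable)

-- Hypothetical stand-ins for the real phase-specific message types.
newtype PsMessage     = PsMessage String
newtype TcRnMessage   = TcRnMessage String
newtype DsMessage     = DsMessage String
newtype DriverMessage = DriverMessage String

-- Hypothetical, minimal version of a Diagnostic class (illustration only).
class Diagnostic a where
  render :: a -> String

instance Diagnostic PsMessage     where render (PsMessage s)     = "parser: " ++ s
instance Diagnostic TcRnMessage   where render (TcRnMessage s)   = "typechecker: " ++ s
instance Diagnostic DsMessage     where render (DsMessage s)     = "desugarer: " ++ s
instance Diagnostic DriverMessage where render (DriverMessage s) = "driver: " ++ s

-- The catch-all type: one constructor per phase, plus an existential escape hatch.
data GhcMessage where
  GhcPsMessage      :: PsMessage -> GhcMessage
  GhcTcRnMessage    :: TcRnMessage -> GhcMessage
  GhcDsMessage      :: DsMessage -> GhcMessage
  GhcDriverMessage  :: DriverMessage -> GhcMessage
  GhcUnknownMessage :: forall a. (Diagnostic a, Typeable a) => a -> GhcMessage

-- Final pretty-printing only needs the catch-all type.
renderGhcMessage :: GhcMessage -> String
renderGhcMessage msg = case msg of
  GhcPsMessage m      -> render m
  GhcTcRnMessage m    -> render m
  GhcDsMessage m      -> render m
  GhcDriverMessage m  -> render m
  GhcUnknownMessage m -> render m  -- the Diagnostic dictionary is packed in the constructor
```

Each phase can keep working with its own focused type, and only the code that
prints or throws needs to see `GhcMessage`.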

While starting to refactor GHC to use these types, I have stumbled upon
something bizarre: the `DsGblEnv` and `TcLclEnv` envs both share the same
`IORef` to store the diagnostics (i.e. warnings and errors) accumulated
during compilation. More specifically, a function like
`GHC.HsToCore.Monad.mkDsEnvsFromTcGbl` simply receives as input the `IORef`
coming straight from the `TcLclEnv`, and stores it into the `DsGblEnv`.
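
The sharing described above can be modelled in a few lines. This is a heavily
simplified sketch, not the real definitions: the environments here have a
single field each, the field names and `Messages` are illustrative, and
`mkDsEnvFromTc`, `addMessage` and `demo` are hypothetical helpers.

```haskell
import Data.IORef (IORef, newIORef, modifyIORef', readIORef)

-- Simplified stand-in for a bag of diagnostics.
newtype Messages e = Messages [e]

-- Hypothetical one-field versions of the two environments.
newtype TcLclEnv e = TcLclEnv { tcl_errs :: IORef (Messages e) }
newtype DsGblEnv e = DsGblEnv { ds_msgs  :: IORef (Messages e) }

-- Analogue of what is described for mkDsEnvsFromTcGbl: the reference is
-- reused, not copied, so both environments must agree on the type `e`.
mkDsEnvFromTc :: TcLclEnv e -> DsGblEnv e
mkDsEnvFromTc tc = DsGblEnv { ds_msgs = tcl_errs tc }

-- Append one diagnostic to whatever bag the reference points at.
addMessage :: IORef (Messages e) -> e -> IO ()
addMessage ref e = modifyIORef' ref (\(Messages es) -> Messages (es ++ [e]))

-- Writes through either environment land in the same bag.
demo :: IO [String]
demo = do
  ref <- newIORef (Messages [])
  let tcEnv = TcLclEnv ref
      dsEnv = mkDsEnvFromTc tcEnv
  addMessage (tcl_errs tcEnv) "type error"
  addMessage (ds_msgs dsEnv) "incomplete patterns warning"
  Messages es <- readIORef (tcl_errs tcEnv)
  pure es
```

Because both records hold the very same `IORef`, the type parameter `e` is
forced to be identical on both sides.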

This is unfortunate, because it would force me to change the type of this
`IORef` to `IORef (Messages GhcMessage)` to accommodate both diagnostic
types, but this would bubble up into top-level functions like `initTc`,
which would now return a `Messages GhcMessage`. This is once again
unfortunate, because it is "premature": ideally it would still be nice to
return `Messages TcRnMessage`, so that GHC API users could get a very
precise diagnostic type rather than the catch-all that `GhcMessage` is. It
also violates an implicit contract: we would be saying that `initTc` might
(potentially) return *any* GHC diagnostic message (including, for example,
driver errors/warnings), which I think is misleading.
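
For illustration only, here is one possible shape of the more precise
alternative: each phase returns its own focused message type, and the
widening to `GhcMessage` happens once, at the boundary. Everything below is a
hypothetical sketch (simplified types, a list-backed `Messages` with a
hand-written `Functor` instance, made-up `typecheckPhase` and `desugarPhase`
values), not a description of how GHC does or will do it.

```haskell
-- Hypothetical stand-ins for two phase-specific message types.
newtype TcRnMessage = TcRnMessage String deriving (Eq, Show)
newtype DsMessage   = DsMessage String   deriving (Eq, Show)

-- Cut-down catch-all type with only the two constructors needed here.
data GhcMessage
  = GhcTcRnMessage TcRnMessage
  | GhcDsMessage DsMessage
  deriving (Eq, Show)

-- Simplified bag of diagnostics, parameterised over the message type.
newtype Messages e = Messages [e] deriving (Eq, Show)

instance Functor Messages where
  fmap f (Messages es) = Messages (map f es)

unionMessages :: Messages e -> Messages e -> Messages e
unionMessages (Messages a) (Messages b) = Messages (a ++ b)

-- Each phase exposes only its own, precise diagnostic type.
typecheckPhase :: Messages TcRnMessage
typecheckPhase = Messages [TcRnMessage "rigid type variable"]

desugarPhase :: Messages DsMessage
desugarPhase = Messages [DsMessage "non-exhaustive patterns"]

-- Widening to the catch-all type is deferred to the point where it is needed.
allDiagnostics :: Messages GhcMessage
allDiagnostics =
  unionMessages (fmap GhcTcRnMessage typecheckPhase)
                (fmap GhcDsMessage desugarPhase)
```

With this shape a caller of the typechecking phase alone never sees a
`GhcMessage`; with a single shared `IORef`, that separation is not available.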

Having said all of that, it's also possible that returning `Messages
GhcMessage` is totally fine here and we don't need to be able to make this
fine-grained distinction for the GHC API functions. Regardless, I would
like to ask the audience:

* Why do `TcLclEnv` and `DsGblEnv` need to share the same `IORef`?
* Is this for efficiency reasons?
* Is this because we need the two monads to independently accumulate errors
  into the same `IORef`?

Thanks!

Alfredo
