Right, I see, thanks.

This is what I was attempting so far:

data DsMessage =
    DsUnknownMessage !DiagnosticMessage
  | DsLiftedTcRnMessage !TcRnMessage
  -- ^ A diagnostic coming straight from the Typecheck-renamer.

and later:

liftTcRnMessages :: MonadIO m => IORef (Messages TcRnMessage) -> m (IORef (Messages DsMessage))
liftTcRnMessages ref = liftIO $ do
  oldContent <- readIORef ref
  newIORef (DsLiftedTcRnMessage <$> oldContent)

...

mkDsEnvsFromTcGbl :: MonadIO m
                  => HscEnv -> IORef (Messages TcRnMessage) -> TcGblEnv
                  -> m (DsGblEnv, DsLclEnv)
mkDsEnvsFromTcGbl hsc_env msg_var tcg_env
  = do { cc_st_var   <- liftIO $ newIORef newCostCentreState
       ...
       ; msg_var' <- liftTcRnMessages msg_var
       ; return $ mkDsEnvs unit_env this_mod rdr_env type_env fam_inst_env
                           msg_var' cc_st_var complete_matches
       }


While this typechecks, I wonder if that's the right way to think about it -- from your reply, it seems like the dependency is in the opposite direction -- we need to store desugaring diagnostics in the TcM due to TH splicing, not the other way around.

I'll explore the idea of adding a second IORef.

Thanks!


On Tue, 30 Mar 2021 at 10:51, Simon Peyton Jones <simonpj@microsoft.com> wrote:

I think the main reason is that for Template Haskell the renamer/type-checker need to run the desugarer.  See the call to initDsTc in GHC.Tc.Gen.Splice.

 

I suppose an alternative is that the TcGblEnv could have a second IORef to use for error messages that come from desugaring during TH splices.

 

Nothing deeper than that I think.

 

Simon

 

From: ghc-devs <ghc-devs-bounces@haskell.org> On Behalf Of Alfredo Di Napoli
Sent: 30 March 2021 08:42
To: Simon Peyton Jones via ghc-devs <ghc-devs@haskell.org>
Subject: Why TcLclEnv and DsGblEnv need to store the same IORef for errors?

 

Hello folks,

 

as some of you might know me and Richard are reworking how GHC constructs, emits and deals with errors and warnings (See https://gitlab.haskell.org/ghc/ghc/-/wikis/Errors-as-(structured)-values and #18516).

 

To summarise very briefly the spirit, we will have (eventually) proper domain-specific types instead of SDocs. The idea is to have very precise and "focused" types for the different phases of the compilation pipeline, and a "catch-all" monomorphic `GhcMessage` type used for the final pretty-printing and exception-throwing:

 

data GhcMessage where

  GhcPsMessage      :: PsMessage -> GhcMessage

  GhcTcRnMessage    :: TcRnMessage -> GhcMessage

  GhcDsMessage      :: DsMessage -> GhcMessage

  GhcDriverMessage  :: DriverMessage -> GhcMessage

  GhcUnknownMessage :: forall a. (Diagnostic a, Typeable a) => a -> GhcMessage

 

While starting to refactor GHC to use these types, I have stepped into something bizarre: the `DsGblEnv` and `TcLclEnv` envs both share the same `IORef` to store the diagnostics (i.e. warnings and errors) accumulated during compilation. More specifically, a function like `GHC.HsToCore.Monad.mkDsEnvsFromTcGbl` simply receives as input the `IORef` coming straight from the `TcLclEnv`, and stores it into the `DsGblEnv`.

 

This is unfortunate, because it would force me to change the type of this `IORef` to be

`IORef (Messages GhcMessage)` to accommodate both diagnostic types, but this would bubble up into top-level functions like `initTc`, which would now return a `Messages GhcMessage`. This is once again unfortunate, because is "premature": ideally it might still be nice to return `Messages TcRnMessage`, so that GHC API users could get a very precise diagnostic type rather than the bag `GhcMessage` is. It also violates an implicit contract: we are saying that `initTc` might return (potentially) *any* GHC diagnostic message (including, for example, driver errors/warnings), which I think is misleading.

 

Having said all of that, it's also possible that returning `Messages GhcMessage` is totally fine here and we don't need to be able to do this fine-grained distinction for the GHC API functions. Regardless, I would like to ask the audience:

 

* Why `TcLclEnv` and `DsGblEnv` need to share the same IORef?

* Is this for efficiency reasons?

* Is this because we need the two monads to independently accumulate errors into the

  same IORef?

 

Thanks!

 

Alfredo