Oh, also my complete pipeline is here: https://github.com/chrisdone/prana/blob/0cbb7b4b96bbfdb4f0d6a60e08f4b1f53abdfb15/prana-ghc/src/Prana/Ghc.hs#L106-L154

Parse/typecheck/desugar, tidy, prep, core-to-stg (ripped from HscMain) and then I try to resolve all names in the AST and that leads me to this.

On Sat, 30 Mar 2019 at 14:05, Christopher Done <chrisdone@gmail.com> wrote:

Hi all,

Could you offer some insight into newtypes at the STG level? Here’s the
context:

  1. I’m working on this interpreter for STG
    (https://github.com/chrisdone/prana) and I’m trying to generate a
    pristine AST where all names in it are resolved to something known to
    me.

  2. I’ve compiled ghc-prim and integer-gmp without issue, and I’m
    compiling base and there remains one last frontier which is newtypes.

These are the culprits pointed out if I compile base:

chris@precision:~/Work/chrisdone/prana/ghc-8.4/libraries/base-4.11.1.0$ PRANA_INDEX=../prana.idx stack build --exec './Setup build --ghc-options=-O0' --file-watch
Preprocessing library for base-4.11.1.0..
Building library for base-4.11.1.0..
[1 of 244] Compiling GHC.Base
[2 of 244] Compiling GHC.IO
[3 of 244] Compiling GHC.Real
[4 of 244] Compiling Data.Semigroup.Internal
... [snip] ...
[242 of 244] Compiling Data.Functor.Compose
[243 of 244] Compiling Data.Fixed
[244 of 244] Compiling Data.Complex

Errors in Data.Foldable:
  Variable name not found: base:Data.Semigroup.Internal.All
  Variable name not found: base:Data.Semigroup.Internal.Any

Errors in Foreign.Marshal.Pool:
  Variable name not found: base:Foreign.Marshal.Pool.Pool

Errors in GHC.ExecutionStack.Internal:
  Variable name not found: base:GHC.ExecutionStack.Internal.StackTrace

Errors in Data.Bifoldable:
  Variable name not found: base:Data.Semigroup.Internal.All
  Variable name not found: base:Data.Semigroup.Internal.Any

Errors in System.Timeout:
  Variable name not found: base:System.Timeout.Timeout

Errors in Data.Data:
  Variable name not found: base:Foreign.Ptr.WordPtr
  Variable name not found: base:Foreign.Ptr.IntPtr
  Variable name not found: base:Data.Semigroup.Internal.Any
  Variable name not found: base:Data.Semigroup.Internal.All

I looked these up, and they all appear to be uses of a
newtype.

For example, in the Timeout function:

http://hackage.haskell.org/package/base-4.12.0.0/docs/src/System.Timeout.html#timeout

I printed out the full [StgTopBinding] list and got:

let {
  sat_s27NC [Occ=Once] :: IO Timeout
  [LclId] =
      [] \u [] fmap $fFunctorIO Timeout newUnique;
} in  >>= $fMonadIO sat_s27NC sat_s27NT;

Oddly (or not?), they’re used as values, not constructors. This error
comes from this part of my code:

StgSyn.StgApp occ arguments ->
  AppExpr <$> lookupSomeVarId occ <*> traverse fromStgGenArg arguments

If it was used as a constructor, it’d appear in this position, where
looking up the ID would produce a “Data constructor name not found”
error:

StgSyn.StgConApp dataCon arguments types ->
ConAppExpr <$> lookupDataConId dataCon <> traverse fromStgGenArg arguments <>
pure (map (const Type) types)

My understanding of newtypes at this stage is hazy. It seems like:

  • They ought to be erased and replaced with coercions by now. If
    they’re not replaced, it’s because they’re in a not-quite-id position
    like fmap Timeout .... (Arguably these could be fixed in base with a
    Data.Coerce.coerce?)

  • However, the CoreTidy/PrepPgm processing modules
    don’t seem to have removed or replaced these, or introduced a binding
    that would do something.

At this stage what would you recommend? At this point type-checking is
done, and I only care about interpreting the code. So I suppose they
could be id for all it matters to the interpreter?

I imagine they aren’t actually supposed to allocate something here. And
I’m certain that any pattern matching on a newtype is also erased away
by this point.

Cheers!

Chris