Writing a simple Core evaluator, having trouble with name lookups

newer
Considerations for Control flow...

older
make slow test suite failures (at...

Christopher Done

28 Nov 2018 28 Nov '18

1:59 a.m.

Hi all, I'm attempting to make a simple evaluator for GHC core, but I'm not clear on how to reliably looking up names. I'm compiling each of ghc-prim, integer-simple and base with a patched version of GHC which performs an extra output step with the Core AST to a file for each module. Later, I load those files in. So for an input Haskell file like this: module Main (main,Foo(..)) where class Foo a where foo :: a -> Int instance Foo Int where foo x = x * x instance Foo Char where foo x = 99 main = print (foo (123 :: Int)) I have an output set of bindings like this: https://gist.github.com/chrisdone/cb05a77d3fcb081a4580b5f85289674a One thing that I immediately notice is that the names of things are completely non-unique, especially in generated names. So here are two implementations of the method "foo" for the class "Foo": ( Id {idStableName = "main:Main:$cfoo", idUnique = Unique 6989586621679010917}, ...) -- Int ( Id {idStableName = "main:Main:$cfoo", idUnique = Unique 6989586621679010923}, ...) -- Char So e.g. the instance for "Foo Int" refers to the above method implementation via its Unique (6989586621679010923): ( Id {idStableName = "main:Main:$fFooInt", idUnique = Unique 8214565720323784705} , CastE (VarE (Id { idStableName = "main:Main:$cfoo" , idUnique = Unique 6989586621679010923 <---- HERE }))) At first, I thought I would use the Unique associated with every Name to make a lookup. This is completely reliable within one GHC compilation, but I've read in the docs that it's not stable across multiple invocations? What does that mean for my case? Another thing I notice is that type-class methods are not generated at the core level. I have, for example, this method call that provides it the instance dictionary, (AppE (AppE (VarE (Id { idStableName = "main:Main:foo" , idUnique = Unique 8214565720323784707 <---- MISSING })) (TypE (Typ "Int"))) (VarE (Id { idStableName = "main:Main:$fFooInt" , idUnique = Unique 8214565720323784705 }))) But the "main:Main:foo" (8214565720323784707) is not produced in the CoreProgram, it seems. My compile step is very simple: compile :: GHC.GhcMonad m => GHC.ModSummary -> m [CoreSyn.Bind GHC.Var] compile modSummary = do parsedModule <- GHC.parseModule modSummary typecheckedModule <- GHC.typecheckModule parsedModule desugared <- GHC.desugarModule typecheckedModule let binds = GHC.mg_binds (GHC.dm_core_module desugared) pure binds It simply gets the bindings and that's all from the ModGuts. mg_binds :: !CoreProgram Two questions: 1) How do I recognize class methods when I see one, like the "main:Main:foo" above? Maybe this? isClassOpId_maybe :: Id -> Maybe Class Is an "op" what GHC calls type-class methods? 2) If I compile e.g. ghc-prim and that generates a binding Name with ID 123, and then I compile base -- will the ID 123 be re-used by base for something else, or will any reference to 123 in the compiled Names for base refer ONLY to that one in ghc-prim? In other words, when GHC loads the iface for ghc-prim, does it generate a fresh set of names for everything in ghc-prim, or does it load them from file? Cheers!

Attachments:

attachment.html (text/html — 4.9 KB)

Show replies by date

Csaba Hruska

30 Nov 30 Nov

4:17 p.m.

Hi! I can give some info for your second question. GHC uses wired-in id's for the primitives and some other AST construction too. Read more here: https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/WiredIn Regarding the names you can use qualified (module + occ) names for exported ids. (see: *Var.isExportedId*). For non exported Id's you can rely on unique values. Use the *Name.nameModule_maybe* and *Name.getName* function to get the id's module name. In my project I export STG for further compilation. Here is the conversion code with proper name conversion: https://github.com/grin-tech/ghc-grin/blob/master/ghc-dump-core/GhcDump_StgC... I've also learned that GHC wraps the* Main.main* function with another function called *:Main.main* which is the first function called by the RTS. Cheers, Csaba On Tue, Nov 27, 2018 at 7:00 PM Christopher Done wrote:

...

Hi all,

I'm attempting to make a simple evaluator for GHC core, but I'm not clear on how to reliably looking up names. I'm compiling each of ghc-prim, integer-simple and base with a patched version of GHC which performs an extra output step with the Core AST to a file for each module.

Later, I load those files in. So for an input Haskell file like this:

module Main (main,Foo(..)) where class Foo a where foo :: a -> Int instance Foo Int where foo x = x * x instance Foo Char where foo x = 99 main = print (foo (123 :: Int))

I have an output set of bindings like this:

https://gist.github.com/chrisdone/cb05a77d3fcb081a4580b5f85289674a

One thing that I immediately notice is that the names of things are completely non-unique, especially in generated names. So here are two implementations of the method "foo" for the class "Foo":

( Id {idStableName = "main:Main:$cfoo", idUnique = Unique 6989586621679010917}, ...) -- Int ( Id {idStableName = "main:Main:$cfoo", idUnique = Unique 6989586621679010923}, ...) -- Char

So e.g. the instance for "Foo Int" refers to the above method implementation via its Unique (6989586621679010923):

( Id {idStableName = "main:Main:$fFooInt", idUnique = Unique 8214565720323784705} , CastE (VarE (Id { idStableName = "main:Main:$cfoo" , idUnique = Unique 6989586621679010923 <---- HERE })))

At first, I thought I would use the Unique associated with every Name to make a lookup. This is completely reliable within one GHC compilation, but I've read in the docs that it's not stable across multiple invocations? What does that mean for my case?

Another thing I notice is that type-class methods are not generated at the core level. I have, for example, this method call that provides it the instance dictionary,

(AppE (AppE (VarE (Id { idStableName = "main:Main:foo" , idUnique = Unique 8214565720323784707 <---- MISSING })) (TypE (Typ "Int"))) (VarE (Id { idStableName = "main:Main:$fFooInt" , idUnique = Unique 8214565720323784705 })))

But the "main:Main:foo" (8214565720323784707) is not produced in the CoreProgram, it seems. My compile step is very simple:

compile :: GHC.GhcMonad m => GHC.ModSummary -> m [CoreSyn.Bind GHC.Var] compile modSummary = do parsedModule <- GHC.parseModule modSummary typecheckedModule <- GHC.typecheckModule parsedModule desugared <- GHC.desugarModule typecheckedModule let binds = GHC.mg_binds (GHC.dm_core_module desugared) pure binds

It simply gets the bindings and that's all from the ModGuts.

mg_binds :: !CoreProgram

Two questions:

1) How do I recognize class methods when I see one, like the "main:Main:foo" above?

Maybe this? isClassOpId_maybe :: Id -> Maybe Class

Is an "op" what GHC calls type-class methods?

2) If I compile e.g. ghc-prim and that generates a binding Name with ID 123, and then I compile base -- will the ID 123 be re-used by base for something else, or will any reference to 123 in the compiled Names for base refer ONLY to that one in ghc-prim? In other words, when GHC loads the iface for ghc-prim, does it generate a fresh set of names for everything in ghc-prim, or does it load them from file?

Cheers! _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Christopher Done

7:23 p.m.

Hi Csaba, Thanks for your answer. I think I'm doing something very similar to you. I'm converting the GHC AST to my own AST with this code: https://gist.github.com/chrisdone/96e228f12bdbc3c43d06718467e69029#file-main... I have an Id type that's similar to yours, data Id = Id { idStableName :: {-# UNPACK #-}!ByteString , idUnique :: {-# UNPACK #-}!Unique , idCategory :: !Cat } deriving (Generic, Data, Typeable, Eq, Show, Ord) And (I think) I've solved my first question by adding that category field, which can be one of: data Cat = ValCat | DataCat | ClassCat deriving (Generic, Data, Typeable, Eq, Show, Ord) When an expression like this comes along: AppE (AppE (VarE (Id { idStableName = "main:Main.C:Wiggle" , idUnique = Unique 8214565720323785205 , idCategory = ClassCat })) (TypE (Typ "Int"))) (VarE (Id { idStableName = "main:Main.$cfoo" , idUnique = Unique 6989586621679010971 , idCategory = ValCat })) VarE (Id { idStableName = "main:Main.$cbar" , idUnique = Unique 6989586621679010975 , idCategory = ValCat }) It constructs a class dictionary for the class "Wiggle", with the methods "foo" and "bar", and the interpreter treats Wiggle as a regular data constructor. Meanwhile, when an nth class method (like "main:Main.$cfoo") is resolved, it produces a Method, and when applying a method to a data constructor Wiggle, the interpreter access the nth slot. That works out. An example is here: https://gist.github.com/chrisdone/771292425a9f1bb428ef4eda3779dc40 See how the "main:Main.$fWiggleInt" is resolved to the above dictionary. As for the second question, and your answer, I am a bit puzzled, maybe you can clarify?

...

Regarding the names you can use qualified (module + occ) names for exported ids. (see: Var.isExportedId).

For non exported Id's you can rely on unique values.

You say that for exported names I can rely soley on the stable name (package+module+ident), and for internal IDs I can use the Unique? I believe that's what I'm reproducing, here. The following name isn't resolved: Id { idStableName = "base:GHC.Base.$fApplicativeIO" , idUnique = Unique 8214565720323784738 , idCategory = ValCat } If I list all bindings available, I find this name, which has a different Unique: Id { idStableName = "base:GHC.Base.$fApplicativeIO" , idUnique = Unique 8214565720323793553 <--- differing , idCategory = ValCat } So if I were to apply your approach and do exported lookups by idStableName, and local-module lookups (beta substitution) by Unique, it should work out. The question is whether "base:GHC.Base.$fApplicativeIO" is actually exported -- I presume all instance dictionaries are exported. I will give it a try and report back to you! Cheers!

Csaba Hruska

7:46 p.m.

Just to clarify, for exported names ignore the unique value completely and use module name + occurrence name. In your case that is the idStableName field. For non exported names use idStableName + idUnique. I'll also check your project. Cheers, Csaba On Fri, Nov 30, 2018 at 12:23 PM Christopher Done wrote:

...

Hi Csaba,

Thanks for your answer. I think I'm doing something very similar to you. I'm converting the GHC AST to my own AST with this code:

https://gist.github.com/chrisdone/96e228f12bdbc3c43d06718467e69029#file-main...

I have an Id type that's similar to yours,

data Id = Id { idStableName :: {-# UNPACK #-}!ByteString , idUnique :: {-# UNPACK #-}!Unique , idCategory :: !Cat } deriving (Generic, Data, Typeable, Eq, Show, Ord)

And (I think) I've solved my first question by adding that category field, which can be one of:

data Cat = ValCat | DataCat | ClassCat deriving (Generic, Data, Typeable, Eq, Show, Ord)

When an expression like this comes along:

AppE (AppE (VarE (Id { idStableName = "main:Main.C:Wiggle" , idUnique = Unique 8214565720323785205 , idCategory = ClassCat })) (TypE (Typ "Int"))) (VarE (Id { idStableName = "main:Main.$cfoo" , idUnique = Unique 6989586621679010971 , idCategory = ValCat })) VarE (Id { idStableName = "main:Main.$cbar" , idUnique = Unique 6989586621679010975 , idCategory = ValCat })

It constructs a class dictionary for the class "Wiggle", with the methods "foo" and "bar", and the interpreter treats Wiggle as a regular data constructor. Meanwhile, when an nth class method (like "main:Main.$cfoo") is resolved, it produces a Method, and when applying a method to a data constructor Wiggle, the interpreter access the nth slot.

That works out. An example is here:

https://gist.github.com/chrisdone/771292425a9f1bb428ef4eda3779dc40

See how the "main:Main.$fWiggleInt" is resolved to the above dictionary.

As for the second question, and your answer, I am a bit puzzled, maybe you can clarify?

...
Regarding the names you can use qualified (module + occ) names for exported ids. (see: Var.isExportedId).

For non exported Id's you can rely on unique values.

You say that for exported names I can rely soley on the stable name (package+module+ident), and for internal IDs I can use the Unique?

I believe that's what I'm reproducing, here. The following name isn't resolved:

Id { idStableName = "base:GHC.Base.$fApplicativeIO" , idUnique = Unique 8214565720323784738 , idCategory = ValCat }

If I list all bindings available, I find this name, which has a different Unique:

Id { idStableName = "base:GHC.Base.$fApplicativeIO" , idUnique = Unique 8214565720323793553 <--- differing , idCategory = ValCat }

So if I were to apply your approach and do exported lookups by idStableName, and local-module lookups (beta substitution) by Unique, it should work out.

The question is whether "base:GHC.Base.$fApplicativeIO" is actually exported -- I presume all instance dictionaries are exported.

I will give it a try and report back to you!

Cheers!

Ben Gamari

9:58 p.m.

Christopher Done writes:

...

Hi all,

I'm attempting to make a simple evaluator for GHC core, but I'm not clear on how to reliably looking up names. I'm compiling each of ghc-prim, integer-simple and base with a patched version of GHC which performs an extra output step with the Core AST to a file for each module.

...

Two questions:

1) How do I recognize class methods when I see one, like the "main:Main:foo" above?

Maybe this? isClassOpId_maybe :: Id -> Maybe Class

Is an "op" what GHC calls type-class methods?

Yes, I believe this will do what you are looking for.

...

2) If I compile e.g. ghc-prim and that generates a binding Name with ID 123, and then I compile base -- will the ID 123 be re-used by base for something else, or will any reference to 123 in the compiled Names for base refer ONLY to that one in ghc-prim? In other words, when GHC loads the iface for ghc-prim, does it generate a fresh set of names for everything in ghc-prim, or does it load them from file?

Perhaps I am misunderstanding Csaba's point, but I don't believe you can rely on uniques here. Except in the case of known key things (which is certainly the minority of things), uniques are generated afresh with every GHC compilation. While it's possible that the same compiler run will happen to produce the same Name/Unique correspondence, this is not guaranteed and may be broken by GHC `-j`, different recompilation-checker conditions, etc. The only part of the name that we guarantee will be stable across compiler sessions is the OccName of Names coming from interface files. The uniqueness of these names is ensured when we create the interface file by TidyPgm.chooseExternalIds. In principle you could do something similar to the entire Core program before you dump it. Cheers, - Ben

Csaba Hruska

1 Dec 1 Dec

12:34 a.m.

Hi Ben, I thought that it is possible to rely on unique values *in case of non exported Ids* because they are local to a specific module and can not appear in expressions in other modules because they are not exported. Do I miss something? Cheers, Csaba On Fri, Nov 30, 2018 at 2:59 PM Ben Gamari wrote:

...

Christopher Done writes:

...
Hi all,

I'm attempting to make a simple evaluator for GHC core, but I'm not clear on how to reliably looking up names. I'm compiling each of ghc-prim, integer-simple and base with a patched version of GHC which performs an extra output step with the Core AST to a file for each module.

...

...
Two questions:

1) How do I recognize class methods when I see one, like the "main:Main:foo" above?

Maybe this? isClassOpId_maybe :: Id -> Maybe Class

Is an "op" what GHC calls type-class methods?

Yes, I believe this will do what you are looking for.

...
2) If I compile e.g. ghc-prim and that generates a binding Name with ID 123, and then I compile base -- will the ID 123 be re-used by base for something else, or will any reference to 123 in the compiled Names for base refer ONLY to that one in ghc-prim? In other words, when GHC loads the iface for ghc-prim, does it generate a fresh set of names for everything in ghc-prim, or does it load them from file?

Perhaps I am misunderstanding Csaba's point, but I don't believe you can rely on uniques here. Except in the case of known key things (which is certainly the minority of things), uniques are generated afresh with every GHC compilation. While it's possible that the same compiler run will happen to produce the same Name/Unique correspondence, this is not guaranteed and may be broken by GHC `-j`, different recompilation-checker conditions, etc.

The only part of the name that we guarantee will be stable across compiler sessions is the OccName of Names coming from interface files. The uniqueness of these names is ensured when we create the interface file by TidyPgm.chooseExternalIds. In principle you could do something similar to the entire Core program before you dump it.

Cheers,

- Ben _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Ben Gamari

2:20 a.m.

Csaba Hruska writes:

...

Hi Ben,

I thought that it is possible to rely on unique values *in case of non exported Ids* because they are local to a specific module and can not appear in expressions in other modules because they are not exported. Do I miss something?

Uniques should be treated as being non-reproducible across compiler sessions. To make this more concrete: if GHC compiles the same module twice it will not necessarily assign the same uniques to the module's Names. Uniques are derived from local UniqSupplies conjured up at a variety of points in the compilation pipeline (search from mkSplitUniqSupply). These supplies are themselves derived from an impure global counter (see compiler/cbits/genSym.c). The state of this counter (and consequently the uniques derived from it) should be treated as being entirely unpredictable. Cheers, - Ben

Csaba Hruska

2:28 a.m.

I'm aware that unique names are unique in a GHC invocation. But my statements sill holds. On Fri, Nov 30, 2018 at 7:20 PM Ben Gamari wrote:

...

Csaba Hruska writes:

...
Hi Ben,

I thought that it is possible to rely on unique values *in case of non exported Ids* because they are local to a specific module and can not appear in expressions in other modules because they are not exported. Do I miss something?

Uniques should be treated as being non-reproducible across compiler sessions.

To make this more concrete: if GHC compiles the same module twice it will not necessarily assign the same uniques to the module's Names. Uniques are derived from local UniqSupplies conjured up at a variety of points in the compilation pipeline (search from mkSplitUniqSupply).

These supplies are themselves derived from an impure global counter (see compiler/cbits/genSym.c). The state of this counter (and consequently the uniques derived from it) should be treated as being entirely unpredictable.

Cheers,

- Ben

Christopher Done

8:23 p.m.

I think what Csaba means is that we can have e.g. * GHC invocation 1 * ghc-prim: * MyModule.foo has Unique 123 * OtherModule.bar has Unique 124 * GHC invocation 2 * base: * MyMod.mu has Unique 123 * OtherMod.zot has Unique 124 For a unique reference then, we just need: * ghc-prim:MyMobile.foo+123 * ghc-prim:OtherModule.bar+124 * base:MyMod.mu+123 * base:OtherMod.zot+124 For any local lookup this is reliable. If the lookup fails, a lookup without the Unique works for cross-module lookups.

Csaba Hruska

8:42 p.m.

The package name + module name is always unique in every Haskell program, module names can not be duplicated in a package and package names are unique also. There are 2 kinds of identifiers, local and exported. You can construct a whole program unique name for: - exported identifier with combining the package name + module name + occurence name (without the unique value) - local identifier with combining the package name + module name + occurence name + unique (what is unique per invocation) It is safe because only the exported names can appear in an external expression and those do not contain the GHC's unique value. Just think about how the object code linker deals with GHC symbols. Cheers, Csaba On Sat, Dec 1, 2018 at 1:23 PM Christopher Done wrote:

...

I think what Csaba means is that we can have e.g.

* GHC invocation 1 * ghc-prim: * MyModule.foo has Unique 123 * OtherModule.bar has Unique 124 * GHC invocation 2 * base: * MyMod.mu has Unique 123 * OtherMod.zot has Unique 124

For a unique reference then, we just need:

* ghc-prim:MyMobile.foo+123 * ghc-prim:OtherModule.bar+124 * base:MyMod.mu+123 * base:OtherMod.zot+124

For any local lookup this is reliable. If the lookup fails, a lookup without the Unique works for cross-module lookups.

Christopher Done

11:03 p.m.

Regarding classes, Csaba, did you have to deal with dictionaries at the STG level? I'm finding that at the Core level, methods don't generate code, so I have to generate them myself. But then I have to know more about classes than I might ideally not want to. For example, I've noticed that if a class has only one method, then the dictionary for an instance doesn't actually seem to construct a record, but makes a special case and just refers to the single method. So my evaluator failed on this. If I add one more method to the class, then things work properly. Example: module Demo (demo) where class Person a where person :: a -> Int wibble :: a -> Int instance Person X where person unusedarg = 5 wibble _ = 9 data X = X demo = person X This produces chris@precision:~/Work/chrisdone/prana$ sh scripts/compiledemo.sh [1 of 1] Compiling Demo ( Demo.hs, Demo.o ) Writing main_Demo.prana Eval: ((main:Demo.person main:Demo.$fPersonX) main:Demo.X) Eval: (main:Demo.person main:Demo.$fPersonX) Eval: main:Demo.person Done: main:Demo.person[Method]0 Eval: main:Demo.$fPersonX <---- here is the dictionary, and you can see what it refers to below: Eval: ((main:Demo.C:Person main:Demo.$cperson) main:Demo.$cwibble) <-- this is the two methods Eval: (main:Demo.C:Person main:Demo.$cperson) Eval: main:Demo.C:Person Done: (main:Demo.C:Person[Con] ) Done: (main:Demo.C:Person[Con] main:Demo.$cperson) Done: (main:Demo.C:Person[Con] main:Demo.$cpersonmain:Demo.$cwibble) Done: (main:Demo.C:Person[Con] main:Demo.$cpersonmain:Demo.$cwibble) Eval: main:Demo.$cperson Eval: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Eval: (ghc-prim:GHC.Types.I# 5) Eval: ghc-prim:GHC.Types.I# Done: (ghc-prim:GHC.Types.I#[Con] ) Done: (ghc-prim:GHC.Types.I#[Con] 5) Done: (ghc-prim:GHC.Types.I#[Con] 5) ConWHNF (Id {idStableName = "ghc-prim:GHC.Types.I#", idUnique = Unique 3891110078048108563, idCategory = DataCat}) [LitE (Int 5)] That's great. But if I delete the wibble method: Eval: ((main:Demo.person main:Demo.$fPersonX) main:Demo.X) Eval: (main:Demo.person main:Demo.$fPersonX) Eval: main:Demo.person Done: main:Demo.person[Method]0 Eval: main:Demo.$fPersonX <- the dictionary Eval: main:Demo.$cperson <-- evaluates to simply the person method, instead of a data constructor Eval: main:Demo.$cperson Eval: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Which results in a runtime type error: prana: TypeError (NotAnInstanceDictionary (Id {idStableName = "main:Demo.person", idUnique = Unique 8214565720323785170, idCategory = ValCat}) (LamWHNF (Id {idStableName = "main:Demo.unusedarg", idUnique = Unique 6989586621679011036, idCategory = ValCat}) (AppE (VarE (Id {idStableName = "ghc-prim:GHC.Types.I#", idUnique = Unique 3891110078048108563, idCategory = DataCat})) (LitE (Int 5))))) I could ignore the fact that I got a function instead of a dictionary, and then evaluation proceeds OK: Eval: ((main:Demo.person main:Demo.$fPersonX) main:Demo.X) Eval: (main:Demo.person main:Demo.$fPersonX) Eval: main:Demo.person Done: main:Demo.person[Method]0 Eval: main:Demo.$fPersonX Eval: main:Demo.$cperson Eval: main:Demo.$cperson Eval: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Eval: (ghc-prim:GHC.Types.I# 5) Eval: ghc-prim:GHC.Types.I# Done: (ghc-prim:GHC.Types.I#[Con] ) Done: (ghc-prim:GHC.Types.I#[Con] 5) Done: (ghc-prim:GHC.Types.I#[Con] 5) ConWHNF (Id {idStableName = "ghc-prim:GHC.Types.I#", idUnique = Unique 3891110078048108563, idCategory = DataCat}) [LitE (Int 5)] But this feels a bit less structured. I want, for example, to be able to update definitions at runtime, so runtime type-errors will be a thing sometimes. I feel like this kind of thing would make some confusing runtime type errors. And what other class-specific oddities should I have to handle? I haven't gotten to dictionaries with superclasses yet, that'll require more handling so I'll probably need more knowledge of classes anyway. So I'm wondering whether at the STG phase all classes have been abstracted away and we really do only deal with lambdas, lets and cases? I didn't see any class-specific code in your project. I only chose Core because I wanted to stay as close to the original Haskell source as possible, and have a very simple evaluation model, but perhaps STG is the easier choice. Cheers On Sat, 1 Dec 2018 at 12:42, Csaba Hruska wrote:

...

The package name + module name is always unique in every Haskell program, module names can not be duplicated in a package and package names are unique also. There are 2 kinds of identifiers, local and exported. You can construct a whole program unique name for:

exported identifier with combining the package name + module name + occurence name (without the unique value) local identifier with combining the package name + module name + occurence name + unique (what is unique per invocation)

It is safe because only the exported names can appear in an external expression and those do not contain the GHC's unique value. Just think about how the object code linker deals with GHC symbols.

Cheers, Csaba

On Sat, Dec 1, 2018 at 1:23 PM Christopher Done wrote:

...
I think what Csaba means is that we can have e.g.

* GHC invocation 1 * ghc-prim: * MyModule.foo has Unique 123 * OtherModule.bar has Unique 124 * GHC invocation 2 * base: * MyMod.mu has Unique 123 * OtherMod.zot has Unique 124

For a unique reference then, we just need:

* ghc-prim:MyMobile.foo+123 * ghc-prim:OtherModule.bar+124 * base:MyMod.mu+123 * base:OtherMod.zot+124

For any local lookup this is reliable. If the lookup fails, a lookup without the Unique works for cross-module lookups.

Csaba Hruska

11:25 p.m.

That's right there are some missing bindings at the core level, those will be generated by *corePrepPgm* function. STG is a low level version of core and it contains all code that is required for execution. Classes are represented as nodes (dictionary) at STG level. E.g. here is my custom STG data type which I export: https://github.com/grin-tech/ghc-grin/blob/master/ghc-dump-core/GhcDump_StgA... In my opinion GHC core has lots of internal coding convention. Cheers On Sat, Dec 1, 2018 at 4:02 PM Christopher Done wrote:

...

Regarding classes,

Csaba, did you have to deal with dictionaries at the STG level? I'm finding that at the Core level, methods don't generate code, so I have to generate them myself. But then I have to know more about classes than I might ideally not want to.

For example, I've noticed that if a class has only one method, then the dictionary for an instance doesn't actually seem to construct a record, but makes a special case and just refers to the single method. So my evaluator failed on this. If I add one more method to the class, then things work properly. Example:

module Demo (demo) where class Person a where person :: a -> Int wibble :: a -> Int instance Person X where person unusedarg = 5 wibble _ = 9 data X = X demo = person X

This produces

chris@precision:~/Work/chrisdone/prana$ sh scripts/compiledemo.sh [1 of 1] Compiling Demo ( Demo.hs, Demo.o ) Writing main_Demo.prana Eval: ((main:Demo.person main:Demo.$fPersonX) main:Demo.X) Eval: (main:Demo.person main:Demo.$fPersonX) Eval: main:Demo.person Done: main:Demo.person[Method]0 Eval: main:Demo.$fPersonX <---- here is the dictionary, and you can see what it refers to below: Eval: ((main:Demo.C:Person main:Demo.$cperson) main:Demo.$cwibble) <-- this is the two methods Eval: (main:Demo.C:Person main:Demo.$cperson) Eval: main:Demo.C:Person Done: (main:Demo.C:Person[Con] ) Done: (main:Demo.C:Person[Con] main:Demo.$cperson) Done: (main:Demo.C:Person[Con] main:Demo.$cpersonmain:Demo.$cwibble) Done: (main:Demo.C:Person[Con] main:Demo.$cpersonmain:Demo.$cwibble) Eval: main:Demo.$cperson Eval: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Eval: (ghc-prim:GHC.Types.I# 5) Eval: ghc-prim:GHC.Types.I# Done: (ghc-prim:GHC.Types.I#[Con] ) Done: (ghc-prim:GHC.Types.I#[Con] 5) Done: (ghc-prim:GHC.Types.I#[Con] 5) ConWHNF (Id {idStableName = "ghc-prim:GHC.Types.I#", idUnique = Unique 3891110078048108563, idCategory = DataCat}) [LitE (Int 5)]

That's great. But if I delete the wibble method:

Eval: ((main:Demo.person main:Demo.$fPersonX) main:Demo.X) Eval: (main:Demo.person main:Demo.$fPersonX) Eval: main:Demo.person Done: main:Demo.person[Method]0 Eval: main:Demo.$fPersonX <- the dictionary Eval: main:Demo.$cperson <-- evaluates to simply the person method, instead of a data constructor Eval: main:Demo.$cperson Eval: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5))

Which results in a runtime type error:

prana: TypeError (NotAnInstanceDictionary (Id {idStableName = "main:Demo.person", idUnique = Unique 8214565720323785170, idCategory = ValCat}) (LamWHNF (Id {idStableName = "main:Demo.unusedarg", idUnique = Unique 6989586621679011036, idCategory = ValCat}) (AppE (VarE (Id {idStableName = "ghc-prim:GHC.Types.I#", idUnique = Unique 3891110078048108563, idCategory = DataCat})) (LitE (Int 5)))))

I could ignore the fact that I got a function instead of a dictionary, and then evaluation proceeds OK:

Eval: ((main:Demo.person main:Demo.$fPersonX) main:Demo.X) Eval: (main:Demo.person main:Demo.$fPersonX) Eval: main:Demo.person Done: main:Demo.person[Method]0 Eval: main:Demo.$fPersonX Eval: main:Demo.$cperson Eval: main:Demo.$cperson Eval: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Done: (\main:Demo.unusedarg -> (ghc-prim:GHC.Types.I# 5)) Eval: (ghc-prim:GHC.Types.I# 5) Eval: ghc-prim:GHC.Types.I# Done: (ghc-prim:GHC.Types.I#[Con] ) Done: (ghc-prim:GHC.Types.I#[Con] 5) Done: (ghc-prim:GHC.Types.I#[Con] 5) ConWHNF (Id {idStableName = "ghc-prim:GHC.Types.I#", idUnique = Unique 3891110078048108563, idCategory = DataCat}) [LitE (Int 5)]

But this feels a bit less structured. I want, for example, to be able to update definitions at runtime, so runtime type-errors will be a thing sometimes. I feel like this kind of thing would make some confusing runtime type errors. And what other class-specific oddities should I have to handle?

I haven't gotten to dictionaries with superclasses yet, that'll require more handling so I'll probably need more knowledge of classes anyway.

So I'm wondering whether at the STG phase all classes have been abstracted away and we really do only deal with lambdas, lets and cases? I didn't see any class-specific code in your project.

I only chose Core because I wanted to stay as close to the original Haskell source as possible, and have a very simple evaluation model, but perhaps STG is the easier choice.

Cheers On Sat, 1 Dec 2018 at 12:42, Csaba Hruska wrote:

...
The package name + module name is always unique in every Haskell

program, module names can not be duplicated in a package and package names are unique also.

...
There are 2 kinds of identifiers, local and exported. You can construct a whole program unique name for:

exported identifier with combining the package name + module name + occurence name (without the unique value) local identifier with combining the package name + module name + occurence name + unique (what is unique per invocation)

It is safe because only the exported names can appear in an external expression and those do not contain the GHC's unique value. Just think about how the object code linker deals with GHC symbols.

Cheers, Csaba

On Sat, Dec 1, 2018 at 1:23 PM Christopher Done wrote:

...
I think what Csaba means is that we can have e.g.

* GHC invocation 1 * ghc-prim: * MyModule.foo has Unique 123 * OtherModule.bar has Unique 124 * GHC invocation 2 * base: * MyMod.mu has Unique 123 * OtherMod.zot has Unique 124

For a unique reference then, we just need:

* ghc-prim:MyMobile.foo+123 * ghc-prim:OtherModule.bar+124 * base:MyMod.mu+123 * base:OtherMod.zot+124

For any local lookup this is reliable. If the lookup fails, a lookup without the Unique works for cross-module lookups.

2551

Age (days ago)

2555

Last active (days ago)

List overview

Download

11 comments

4 participants

participants (4)

Ben Gamari
Ben Gamari
Christopher Done
Csaba Hruska

Writing a simple Core evaluator, having trouble with name lookups

tags

participants (4)