Show instances for GHC internals

Currently the only way to debug and inspect GHC internals is by adding some carefully placed print statements. (I'd love to be proven wrong on this; the cost of debugging this way is huge, given how long it takes to rebuild GHC.)
We have Outputable instances for most data types, and `Outputable.pprTrace` etc. help with debugging and inspecting pure functions this way.
However, Outputable instances hide some details, so they're sometimes not useful for debugging and inspecting internals. This is why I implemented the CoreDump package (http://hackage.haskell.org/package/CoreDump): the Outputable instance of CoreSyn is simply not useful for some things. Similarly, just today I had to add a show function for `HscTypes.TargetId` because its `Outputable` instance was hiding the `Maybe Phase` field.
Since the only way to debug or inspect GHC internals (except maybe the RTS) is by printing things, I think we should provide Show instances for... basically everything. Otherwise I just can't see a way of debugging things, inspecting internals, tracing code etc. for learning purposes.
I was wondering what the cost of adding Show instances would be. Would that mean significantly increased compile times? Or significantly bigger GHC binaries? If that's the case, could we guard the Show instances with some flag so that we can enable or disable them by modifying mk/build.mk?
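To make the `TargetId` complaint concrete, here is a minimal sketch of how a hand-written pretty-printer can drop a field that a derived Show instance keeps. The types below are simplified stand-ins written for this illustration, not GHC's real definitions, and `pprTargetId` is a hypothetical function that only imitates the style of an Outputable instance.

```haskell
-- Simplified stand-ins for illustration only; GHC's real types differ.
data Phase = Cpp | Hsc | As
  deriving (Show, Eq)

data TargetId
  = TargetModule String
  | TargetFile FilePath (Maybe Phase)
  deriving (Show, Eq)

-- Hypothetical pretty-printer in the style of an Outputable instance:
-- the output is readable, but the Maybe Phase field is silently dropped.
pprTargetId :: TargetId -> String
pprTargetId (TargetModule m) = m
pprTargetId (TargetFile f _) = f

-- The same value rendered both ways.
examplePretty, exampleShown :: String
examplePretty = pprTargetId (TargetFile "Main.hs" (Just Hsc))
exampleShown  = show (TargetFile "Main.hs" (Just Hsc))
```

Here `examplePretty` contains only the file name, while `exampleShown` also contains the `Just Hsc` that the pretty-printer hides.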

This is a utility I would love to see inside the GHC source tree for examining the AST: https://github.com/edsko/ghc-dump-tree
Alan

On Oct 19, 2015, at 5:18 PM, Ömer Sinan Ağacan
cost of debugging this way is huge, given how long it's taking to rebuild GHC)
There are more interesting parts of your post, but I can respond to this: It shouldn't take that much time. Once you have ghc-stage2 built, you should be able to say `make 2` in the ./ghc subdirectory and get a new binary in a few seconds. Using `make 1` in the ./compiler subdirectory works similarly for the stage1 compiler. But only once it's built the first time.
Richard

Excerpts from Ömer Sinan Ağacan's message of 2015-10-19 14:18:41 -0700:
I was wondering what would be the cost of adding Show instances. Would that mean significantly increased compile times? Or significantly bigger GHC binaries? If that's the case, could we enable Show instances with some arguments so that we can enable/disable it by modifying mk/build.mk?
One difficulty is that many of the core data types, e.g. TyThing, (1) form a large mutually recursive graph, and (2) contain unsafeInterleaveIO thunks which would trigger IO actions when forced. So a naive Show instance would give infinite output and have lots of side effects. There are many data types which could usefully have Show added, but also many for which it would be very difficult.
Edward
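As a rough illustration of point (1), the sketch below builds a tiny mutually recursive pair of types and prints it with a depth budget. A derived Show instance on such a cyclic value would never finish printing. The `TyCon` and `DataCon` records here are hypothetical miniatures invented for this example (GHC's real types are far richer), and the depth-limited printer is just one possible workaround, not something GHC provides.

```haskell
import Data.List (intercalate)

-- Hypothetical miniature types: a type constructor lists its data
-- constructors, and each data constructor points back to its parent.
data TyCon   = TyCon   { tcName :: String, tcDataCons :: [DataCon] }
data DataCon = DataCon { dcName :: String, dcTyCon :: TyCon }

-- A cyclic value, tied together through laziness.
boolTyCon :: TyCon
boolTyCon = TyCon "Bool" [falseDC, trueDC]

falseDC, trueDC :: DataCon
falseDC = DataCon "False" boolTyCon
trueDC  = DataCon "True"  boolTyCon

-- Depth-limited printers: once the budget runs out, print "..." instead
-- of following the cycle any further.
showTyCon :: Int -> TyCon -> String
showTyCon d (TyCon n dcs)
  | d <= 0    = "TyCon " ++ show n ++ " ..."
  | otherwise = "TyCon " ++ show n ++ " ["
                ++ intercalate ", " (map (showDataCon (d - 1)) dcs) ++ "]"

showDataCon :: Int -> DataCon -> String
showDataCon d (DataCon n tc)
  | d <= 0    = "DataCon " ++ show n ++ " ..."
  | otherwise = "DataCon " ++ show n ++ " (" ++ showTyCon (d - 1) tc ++ ")"
```

For example, `showTyCon 1 boolTyCon` stops after one level and prints the data constructor names without descending back into `Bool`.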

One difficulty is that many of the core type data types, e.g. TyThing, are (1) a large mutually recursive graph, and (2) have unsafeInterleaveIO thunks which would induce IO action. So a naive Show instance would give infinite output and have lots of side effects. There are many data types which could usefully have Show added but also many which would be very difficult to.
Ah, yes, this is a very annoying problem. I discovered it when I first wrote CoreDump: https://github.com/osa1/CoreDump/issues/2
I don't have a solution to this yet.

There are more interesting parts of your post, but I can respond to this: It shouldn't take that much time. Once you have ghc-stage2 built, you should be able to say `make 2` in the ./ghc subdirectory and get a new binary in a few seconds.
Using `make 1` in the ./compiler subdirectory works similarly for the stage1 compiler. But only once it's built the first time.
I replied to this in the other thread. I think it works, but I'll make sure next time I do a `make clean`. Thanks.
Another problem is this: hiding fields of types is great for safety reasons, but not so great for debugging. In CoreDump I'm having these problems:
- Sometimes GHC can't derive a Show instance because record fields are hidden, even though every field is actually exposed in a read-only way through some manually defined functions. This is super annoying. It'd be really awesome if we could export record fields as "read-only". (Very half-baked idea.)
- Sometimes fields are hidden and no accessors are provided. This is even worse, because then there's really no way to get a Show instance, either with `deriving` or by hand.
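The first case above can be sketched as follows. Everything here is hypothetical: `Target`, `mkTarget`, `targetName` and `targetPhase` are made-up names, and the "library side" is kept in the same file only so the example is self-contained. The assumption is that a real library would export the type abstractly along with the accessor functions, so a client cannot derive Show but can still write the instance by hand.

```haskell
-- "Library side" (imagine an export list of: Target, mkTarget,
-- targetName, targetPhase; the MkTarget constructor is NOT exported).
data Target = MkTarget String (Maybe Int)

mkTarget :: String -> Maybe Int -> Target
mkTarget = MkTarget

-- Read-only accessors defined by hand.
targetName :: Target -> String
targetName (MkTarget n _) = n

targetPhase :: Target -> Maybe Int
targetPhase (MkTarget _ p) = p

-- "Client side": deriving needs the constructor in scope, so the
-- instance is written by hand using only the exported accessors.
instance Show Target where
  showsPrec d t = showParen (d > 10) $
      showString "Target {name = " . shows (targetName t)
    . showString ", phase = "      . shows (targetPhase t)
    . showString "}"
```

In the second case, where no accessors exist at all, this approach has nothing to work with, which is the harder problem described above.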
(2) have unsafeInterleaveIO thunks which would induce IO action
Edward, do you remember any examples of such code?

Excerpts from Ömer Sinan Ağacan's message of 2015-10-20 06:37:44 -0700:
Edward, do you remember any examples of such code?
The big kahuna is interface loading. Every TyThing from loadDecls is loaded (unsafely) lazily. In fact, it must be, because TyThings are a mutually recursive structure.
Edward
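To show why lazily loaded values interact badly with Show, here is a small sketch that defers a "load" with `unsafeInterleaveIO` and counts how many loads have actually run. The `Decl` type and `loadDecl` function are hypothetical and only mimic the general shape of lazy loading; they are not GHC's interface-loading code.

```haskell
import Data.IORef (IORef, newIORef, readIORef, modifyIORef')
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical stand-in for a lazily loaded declaration.
data Decl = Decl { declName :: String, declBody :: String }
  deriving Show

-- The body is not loaded here; the IO action is deferred until the
-- declBody field is first forced. The counter records each real load.
loadDecl :: IORef Int -> String -> IO Decl
loadDecl counter name = do
  body <- unsafeInterleaveIO $ do
    modifyIORef' counter (+ 1)
    return ("<body of " ++ name ++ ">")
  return (Decl name body)

-- Returns the number of loads performed before and after showing.
demo :: IO (Int, Int)
demo = do
  counter <- newIORef 0
  d       <- loadDecl counter "foo"
  before  <- readIORef counter
  -- Fully forcing the Show output forces declBody, which runs the
  -- deferred IO action as a side effect of printing.
  _       <- return $! length (show d)
  after   <- readIORef counter
  return (before, after)
```

Running `demo` shows that no load has happened until the value is shown, and that showing it performs one, so merely printing a value for debugging can change what the program has loaded.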
participants (4)
- Alan & Kim Zimmerman
- Edward Z. Yang
- Richard Eisenberg
- Ömer Sinan Ağacan