my experience with ghci debugger extensions

Hi users of ghci debugger,

This post is going to be a bit longer. Here are a few cookies to motivate you to go on:
 * you will probably like it more than printf debugging in a lot of cases
 * it can provide a way of implementing Claus Reinke's idea of breakpoints with a list of identifiers which should be available at the breakpoint location, and doing it without recompilation; here is the link to Claus' message:
   http://permalink.gmane.org/gmane.comp.lang.haskell.glasgow.user/15900
 * it gives the ghc team some idea about the importance of the ghci debugger related tickets (and whether to implement them just the way they were proposed)

A note to ghc developers: Stepping a program in the ghci debugger sometimes purges top level bindings and sometimes does not. I am not sure whether this is a bug or a feature (whether there is some logic in it). I do not have a simple example where it purges the bindings, but I did not really look for one.

I would probably have posted later, but some ghci bugs and missing features are badly limiting my progress. There is not much more I can think of to investigate. Maybe somebody will have better ideas how to get around the obstacles I'm hitting. I'm also posting in the hope that people will find this interesting and that the ghc team will fix some of the critical bugs and add the most critical features, especially if nobody has better debugging tips.

You can get my extensions here:
http://www.hck.sk/users/peter/pub/ghciext-0.1.tar.gz
The extensions are no longer in a single .ghci file. The new .ghci file requires a library to be installed. The reason is that without the library it is not manageable any more.

And here are arch linux packaging instructions for my custom ghc (if you are an arch linux user just download and run makepkg :) ):
http://www.hck.sk/users/peter/pub/ghc-custom-6.10.1.tar.gz
The custom ghc is the same as ghc 6.10.1 with two more patches: t2740.patch and loglocal.patch. The first one fixes ticket 2740 and you will get it in 6.10.2.
The second one adds the :loglocal command to ghci. You can extract the patches from the tar file.

If you have already read the ghci scripting tutorial from Claus Reinke then you will know how to customize ghciext (that is, if you feel like doing so). The tutorial is here:
http://www.haskell.org/pipermail/haskell-cafe/2007-September/032260.html

Here is the list of custom commands in the ghciext package:

  :defs                      -- list user-defined commands
  :. <file>                  -- source commands from <file>
  :redir <var> <cmd>...      -- execute <cmd> redirecting stdout to <var>
  :grep <pat> <cmd>...       -- filter lines matching <pat> from <cmd> output
  :* <count> <cmd>...        -- run <cmd> <count> times
  :x <cmd>...                -- run <cmd> with stdout suppressed
  :bp <bpArgs>               -- put breakpoint at location <bpArgs> (adds hit count)
  :inject <cc> <c> <sc> <b>  -- at location <b> execute <c> if <cc>, and stop if <sc>
  :monitor ["<c>"] <vs> <b>  -- show comma separated variables at location <b> if <c>
  :watch <cond> <bpArgs>     -- break at location <bpArgs> when <cond> is True
  :count (_|<N>) [<bpArgs>]  -- count/query/stop execution at location <bpArgs>
  :find <var> <cmd>...       -- step with <cmd> until <var> is found
  :findex <str> <cmd>...     -- step with <cmd> until <str> is found
  :next [lazy]               -- step over; lazy version forces only the top constructor
  :xout                      -- redirect ghci stdout to /dev/null
  :out                       -- redirect ghci stdout back to console

:defs, :., :redir, and :grep are the same as the commands in Claus' tutorial. The only differences I recall now are:
 * the grep pattern can be in quotation marks (easier search for spaces)
 * grep merges :browse output more nicely
 * redir can accept :step, :steplocal etc., i.e. also the commands which sometimes remove top level bindings
 * the commands do not pollute top level bindings so much

The rest will describe my custom commands and how they relate to tickets in the ghc trac.
If you want to check the tickets mentioned here, the easiest way is probably to select them from this list:
http://hackage.haskell.org/trac/ghc/search?q=phercek&noquickjump=1&ticket=on

The initial big problem was how to limit the amount of information the ghci debugger spits at you when breakpoints with custom scripts are used. This is also mentioned in the fourth point of the "unknown" section of ticket #1377:

  /We can disable a breakpoint with ":set stop N :continue", but this still prints out the breakpoint info when we stop. Should we print the info only if there were no commands?/

So I say: yes, do it! Just disable any ghci output related to a breakpoint when the breakpoint has custom code attached. We can disable the output ourselves, but then we disable all the output (including the output of the debugged program). People who are debugging console applications are out of luck (especially if the applications are interactive). This is not an issue for me since my application hardly uses the console at all.

I'm solving the problem by prefixing commands like /:main/ and /:continue/ with my command /:x/. This disables output by default, and ghciExt enables it just before breakpoint code is run. If the breakpoint continues execution, it disables the output again. If the debugged function finishes, the output is enabled by /:x/ itself. A small problem happens when you forget to use /:x/, e.g. with your /:main/: then you do not get a prompt when the program finishes. That is the time to notice it and use /:out/ to switch the output back on. This is the only minor disadvantage for debugging a gui application. /:xout/ is not that useful (it is mostly used internally by /:x/). It could be a hidden command (i.e. not registered in defs).

Now that we have ghci "muted" we can get to the real goodies: /:monitor/, /:watch/, and /:count/. If you like printf debugging, then /:monitor/ can help you.
It can monitor only free variables in the selected expression (the one on which a breakpoint is hit), but that was always enough for me during the debug sessions I had. Write a script file (let's say it is named *ghciInit*; I'll call it that later in this text too) like this example:

  :monitor var1,var2 ModuleName 23
  :monitor var3 ModuleName 40
  :x :main mainArgument

Then open ghci with your program and run *:. ghciInit*, and you will get a nice log like this:

  (0): var1 = <value01>
  var2 = <value02>
  (1): var3 = <value03>
  (0): var1 = <value11>
  var2 = <value12>
  (1): var3 = <value13>
  ... etc

Moreover, /:monitor/ allows a condition in quotation marks to be specified as the first argument. If you do that, the variable values will be printed only when the condition is True. Of course the condition can contain only free variables of the selected expression. Mostly this is not a problem.

/:watch/ is a conditional breakpoint which stops when the specified condition is True.

/:count/ has 3 forms:

  :count _ ModuleName 23
      This never breaks; it just counts the number of times execution reached position ModuleName 23.
  :count 5 ModuleName 23
      This breaks when we reach position ModuleName 23 for the fifth time.
  :count 0
      This tells how many times breakpoint number 0 was hit (it can report the number of hits for any breakpoint created with /:bp/, /:inject/, /:monitor/, /:watch/, and /:count/).

The first form of /:count/ is interesting when you want to stop before something bad happens so that you can see why the bad thing happened. Put the first form of /:count/ at the start of the function with the bug, and then put a break at the place which is hit when the bad thing happens, maybe with /:watch/. When you stop at the bug place, check the hit count at the start of the function. Put that hit count into the /:count/ breakpoint in your *ghciInit* file, restart, and you can use /:steplocal/ or /:loglocal/ to find out what went wrong.

/:loglocal/ is implemented directly in ghc source code.
It does exactly the same as /:steplocal/ but makes sure that the trace history does not contain anything outside of the function we are stepping through. The problem is that /:steplocal/ behaves as if the code were being traced while it executes. Mostly the result is that your trace history is loaded with crap from outside the scope you are interested in. I'll return to /:loglocal/ later.

/:inject/ is there for when you need something special (/:monitor/, /:watch/, and /:count/ are implemented with something very like /:inject/), e.g. when you want to monitor a value but do not want the associated breakpoint number printed.

/:find/ and /:findex/ are there primarily to search the trace history:

  :find var1 :back
      finds the variable var1 in your trace history by stepping back through it
  :find var1 :step
      single steps forward until variable var1 is in the list of free variables
  :findex BL/Logic.hs:23 :loglocal
      fills your trace history with all the local breakpoints until location BL/Logic.hs:23 is hit

Having the trace history filled in with the right stuff is useful for checking out why you got bad results later.

Now let's return to Claus' idea of breakpoints with a list of identifiers which should be available at the breakpoint location. You can make sure the identifiers are available with the first form of /:count/. It never stops, but it puts records into the trace history. So the trace history will contain the free variables at the locations where you put /:count/. This will be even more useful when automatic search of the trace history is built in (see ticket #2737). So a weaker form of Claus' idea can be implemented by carefully selecting what should get into the trace history. Why only a weaker form? Well, in some cases the variable instances in the trace log may not be the expected ones (they may be from a different lexical scope). Experience with my code indicates this should be rare.

/:next/ is an idea how to implement a kind of step over.
That is, if by step over you mean something other than steplocal. The non-lazy form of /:next/ forces _result and does a /:step/. The lazy form forces only the top level constructor of _result before the step. Hey, I even had a case when it worked just like I expected. But typically it does not work because of bug #1531: _result is not correctly bound to the result of the selected expression in most practical cases. This bug is also critical for all forms of conditional breakpoints. It would be cool if we could specify the condition based on _result or some part of it. The implementation of ghciExt conditional breakpoints would need to be extended to support conditions on _result (in particular the breakpoint would need to be disabled while the condition executes), but that is easy to do. An even more worrying thing about bug #1531 is that its milestone is set to _|_.

It is easy to add /:enable/ and /:disable/ to support enabling and disabling breakpoints. I just did not need it yet. Here is what a GhciExt breakpoint looks like:

  *Main> :show breaks
  [0] Main a.hs:4:2-8 ":cmd return$GhciExt.getStopCode 0 (True) "putStr \"(0): \"\n:force x" "False""
  *Main>

Just replace getStopCode with getDisabledStopCode and you have it disabled. Change it back to enable it. Yeah, and implement getDisabledStopCode, which would just continue.

I added /:loglocal/ mostly to simulate how /:tracelocal/ in ticket #2737 would help. I was also checking how much full tracing helps. In both cases the answer is: full tracing almost never helps. /:tracelocal/ from ticket #2737 as originally proposed would rarely help. The problem is that the trace log gets overwhelmed with crap when we cannot control what can be saved in it and what cannot. My idea is that the user should be able to specify what can go into it and also what should not go into it. Here is an alternative solution to the ones I proposed in tickets #2737 and #2946. I think this one would be best.
The command to control the tracing should look like:

  -- should everything be traced?
  :set trace (True|False)
  -- scopes which should be traced (or should not be traced when ! is present)
  :set trace ( (!)? scopeid )*
  -- add/remove individual scopeids to/from the trace specification
  :set trace (+|-) (!)? scopeid
  where scopeid = ( conid . )* ( varid . )* varid

Notice how scopeid looks. It can have a sequence of varids at the end. The reason is so that the user can leave out the scope of a function which is defined in a where clause. The scope specification is similar to the proposal in ticket #3000. E.g. for this code:

  fn s = 'a' : add s
    where add = (++"z")

it could look like

  :set trace Main.fn !Main.fn.add

meaning trace the whole scope of *fn* but not the stuff in the scope of *add*. Order should not be important; requests for not tracing should take precedence over requests to trace. The scopes which we typically want to exclude are the ones which contain loops. The loop content often fills the trace log, forcing the interesting stuff out of it. It is better to investigate functions with loops separately in a nested context.

Notice that there is a slight difference between this proposal for controlling trace content and the one in ticket #2737. #2737 assumes usage of breakpoint arguments to specify a scope. The breakpoint arguments give the ability to define scopes at a finer level, but there is no option to define excluded scopes, which I find important now.

The summary is: the trace log is only as useful as your ability to control what can get into it. The :trace command looks to me like an error. It is better to control tracing by allowing/disallowing scopes.

I also changed my opinion a bit about ticket #2945. :mergetrace would be better than a global trace history. Being able to investigate something separately in a nested context is useful.
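To make the precedence rule concrete, here is a minimal Haskell sketch of how such a specification could be evaluated. Nothing like this exists in ghci; the names Scope, ScopeSpec, parseSpec and shouldTrace are hypothetical, and a scope is simply modelled as the list of dotted components of a scopeid. Matching is by prefix, so Main.fn covers everything nested inside fn, and any matching exclusion wins regardless of the order of the words.

```haskell
import Data.List (isPrefixOf)

-- | A scope path such as ["Main","fn","add"] for the scopeid Main.fn.add.
-- (Hypothetical model; ghci has no such type.)
type Scope = [String]

-- | One word of a hypothetical ":set trace" specification.
data ScopeSpec = Include Scope | Exclude Scope
  deriving (Eq, Show)

-- | Split a dotted scopeid into its components.
splitDots :: String -> Scope
splitDots s = case break (== '.') s of
  (part, [])       -> [part]
  (part, _ : rest) -> part : splitDots rest

-- | Parse one specification word; a leading '!' marks an exclusion.
parseSpec :: String -> ScopeSpec
parseSpec ('!' : rest) = Exclude (splitDots rest)
parseSpec word         = Include (splitDots word)

-- | Decide whether a location inside the given scope should be traced.
-- A scope is traced when some inclusion is a prefix of it and no
-- exclusion is; exclusions win independent of their position.
shouldTrace :: [ScopeSpec] -> Scope -> Bool
shouldTrace specs scope = included && not excluded
  where
    included = or [p `isPrefixOf` scope | Include p <- specs]
    excluded = or [p `isPrefixOf` scope | Exclude p <- specs]
```

With the specification from the example, shouldTrace (map parseSpec (words "Main.fn !Main.fn.add")) ["Main","fn"] is True, while the same call for ["Main","fn","add"] is False.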

If I were to order the ghci debugger related tickets, the order would be (most important first):

  #1531 (_result can get bound to the wrong value in a breakpoint)
  #2737 and #2946 (add :tracelocal to ghci debugger... and add command :mergetrace...)
  #3000 (:break command should recognize also nonexported top level symbols...)
  #2803 (bring full top level of a module in scope when a breakpoint is hit in the ...)
  #1377 (the task: "We should print breakpoint related info only if breakpoint has no commands set") but people debugging interactive console applications would like to have this one at the very top; *IIRC* this may be easy to do, it looks like all the printing is done in one function (something like afterCmd???); also #2950 looked trivial to do (like 15 mins without the compile time???)

And the last thing: my first experience of hacking /:loglocal/ into ghc. I cannot tell much, since I spent only one long Sunday afternoon on it, but here are my two points:

 * I needed to extend the ghc interface. The type of the function GHC.resume changed from:
     resume :: GhcMonad m => SingleStep -> m RunResult
   to:
     resume :: GhcMonad m => (SrcSpan->Bool) -> SingleStep -> m RunResult
   ... plus the corresponding implementation change. The added argument is a filtering function which limits the source spans that can be recorded in the trace history.
 * It would be cool if ghci had its own dir in the source tree where only the ghci source files are. It would encourage people to hack on it more, since it would be easier to maintain private patches and merge from upstream. It would also be easier to make sure one modifies only ghci source code, so that it works with other ghc releases.

Hopefully this helps somebody,
Peter.
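As an illustration of the kind of filter the extra argument of resume could take, here is a small self-contained sketch. It does not use the GHC API: Span is a simplified stand-in for SrcSpan, and encloses, localOnly and recordAll are hypothetical names, not the functions used in loglocal.patch. The assumption is that a :loglocal-style filter accepts only spans lying inside the span of the function being stepped.

```haskell
-- | Simplified stand-in for GHC's SrcSpan: a file plus (line, column)
-- start and end positions. (Assumption made to keep the sketch
-- independent of the GHC API.)
data Span = Span
  { spanFile  :: FilePath
  , spanStart :: (Int, Int)
  , spanEnd   :: (Int, Int)
  } deriving (Eq, Show)

-- | True when the second span lies completely inside the first one.
-- Positions are compared lexicographically: line first, then column.
encloses :: Span -> Span -> Bool
encloses outer inner =
  spanFile outer == spanFile inner
    && spanStart outer <= spanStart inner
    && spanEnd inner <= spanEnd outer

-- | Filter in the spirit of :loglocal: record only spans inside the
-- function that is being stepped.
localOnly :: Span -> (Span -> Bool)
localOnly = encloses

-- | Filter that records every span, i.e. unrestricted tracing.
recordAll :: Span -> Bool
recordAll = const True
```

Passing a predicate rather than a fixed mode keeps the stepping code unaware of how the scope was chosen; a filter built from a list of allowed and denied scopes would have the same type.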

If somebody has managed to download it already, there is a newer version. The breakpoint counter inside break expressions was one less than it should be. Sorry for the inconvenience. It is still very new. I am not sure there would be enough interest to put it on hackage or something like that. Let me know if you want it.
http://www.hck.sk/users/peter/pub/ghciext-0.2.tar.gz
Peter.

Hello Peter,
Your efforts are simply outstanding. Thanks a lot for sharing your
experiences. I want to add a few comments:
- Regarding your :logLocal, you should rename it to :stepLocal, open a
ticket, and attach your patch. We should really try to get this into
6.10.2.
:stepLocal is broken right now, as you say it traces a lot of
garbage. I implemented it and shortly later
noticed the problem you mention, but never got around to fixing it.
Thanks for shouting.
(unless you think it makes sense to keep two versions of this
command, :logLocal and :stepLocal)
- Your idea of forcing _result to implement step over is very nice.
A while ago I tried and failed at implementing :next. The conclusion
was it cannot be done without a lexical call stack.
Your idea could be useful as a pseudo :next, as long as the user of
the debugger is ok with changing the evaluation semantics of the
program. However, I don't know if there is a fix for the _result bug
at hand's reach.
I look forward to playing with this.
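The difference between the two forms of :next, and the way forcing can change what the program does, can be seen in a few lines of plain Haskell. This is only a sketch of the evaluation behaviour, not of how ghciExt implements :next; forceTop, forceList and partial are made-up names. forceTop evaluates a value to its outermost constructor, as the lazy form does with _result, while forceList also evaluates every list element, which is closer to what :force does.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- | Force only the outermost constructor (weak head normal form).
forceTop :: a -> IO (Either ErrorCall a)
forceTop x = try (evaluate x)

-- | Force every element of a list to weak head normal form and then
-- return the list; any 'error' hidden in an element surfaces here.
forceList :: [a] -> IO (Either ErrorCall [a])
forceList xs = try (evaluate (foldr seq () xs `seq` xs))

-- | A list whose first cons cell is fine but whose second element
-- fails when evaluated.
partial :: [Int]
partial = 1 : error "boom" : []
```

forceTop partial returns a Right value because only the first cons cell is evaluated, whereas forceList partial returns a Left carrying the "boom" error. A program that never looks at the second element would run fine, so forcing the whole result in the debugger can raise an error (or loop) that the undebugged program would never hit.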
Your custom ghc tar.gz file contains an additional patch,
network.patch, but there are no instructions about it. I assume it is
ok to ignore it.
I applied the two other patches successfully and hopefully in a few
minutes will have a working build.
Evidently this is a suboptimal way to distribute your changes. If GHCi
lived in its own package as a real GHC api client, it would certainly
be much easier to distribute modifications like these. Although it
would not have helped in this occasion: the changes you needed
required modifying the API.
Finally, please do not forget to add a link to this in the GHCi
Debugger wiki page at
http://haskell.org/haskellwiki/GHC/GHCi_debugger
and/or at the debugging page at
http://haskell.org/haskellwiki/Debugging
Thanks,
pepe
2009/2/5 Peter Hercek

Pepe Iborra wrote:
Hello Peter,
Your efforts are simply outstanding. Thanks a lot for sharing your experiences. I want to add a few comments:
- Regarding your :logLocal, you should rename it to :stepLocal, open a ticket, and attach your patch. We should really try to get this into 6.10.2. :stepLocal is broken right now, as you say it traces a lot of garbage. I implemented it and shortly later noticed the problem you mention, but never got around to fixing it. Thanks for shouting. (unless you think it makes sense to keep two versions of this command, :logLocal and :stepLocal)
I do not think it makes sense to keep two versions (both /:steplocal/ and /:loglocal/). I'm aware the name loglocal is bad. I named it that way since it was a way for me to fill in the trace log with the right stuff using /:findex/. I did not really intend to release this, but I noticed that there was some activity on my ghci debugger related tickets and wanted to get it out before anybody starts to code. Yeah, and I'm getting limited by bugs and missing features too. Back to /:loglocal/ and /:steplocal/. I think there should be only /:steplocal/, but I think the interface for tracing is bad. There should not be a /:trace/ command at all. Instead (if the proposal with /:set trace/ is accepted (also mentioned in ticket #2946)) when /:set trace/ is set to True then /:steplocal/ would behave like now; if it is set to False it would behave like /:loglocal/. Of course setting trace to True is just a special case; in my opinion almost useless. The useful cases are when you can set it to scopes, either denying them or allowing them. The idea is that the user should specify a filter for what is traced. With the filter switched off nothing is traced and this case should not have any significant performance impact. When the tracing filter is set up as something more complicated it may have some performance impact, but in that case we are not after speed.
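The proposed /:set trace/ behaviour can be modelled as a small data type. This is only a sketch of the idea in the paragraph above: TraceMode, Scope and shouldTrace are hypothetical names, and scopes are represented as plain strings purely for illustration.

```haskell
-- A scope is named by a plain string here (assumption for illustration).
type Scope = String

-- Hypothetical settings for the trace filter.
data TraceMode
  = TraceNothing        -- filter switched off: nothing is recorded
  | TraceEverything     -- every scope is recorded
  | AllowOnly [Scope]   -- only the listed scopes are recorded
  | DenyOnly [Scope]    -- everything except the listed scopes is recorded
  deriving (Eq, Show)

-- Decide whether a step inside the given scope goes into the trace history.
shouldTrace :: TraceMode -> Scope -> Bool
shouldTrace TraceNothing    _ = False
shouldTrace TraceEverything _ = True
shouldTrace (AllowOnly ss)  s = s `elem` ss
shouldTrace (DenyOnly ss)   s = s `notElem` ss
```

In this model the TraceNothing case never inspects the scope at all, which matches the expectation that a switched-off filter should cost next to nothing.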
- Your idea of forcing _result to implement step over is very nice. A while ago I tried and failed at implementing :next. The conclusion was it cannot be done without a lexical call stack. Your idea could be useful as a pseudo :next, as long as the user of the debugger is ok with changing the evaluation semantics of the program. However, I don't know if there is a fix for the _result bug at hand's reach.
As far as the /:next/ semantics go, I actually do not know what they should be in a lazy language. The trick with _result is close sometimes. I could really use it, especially in my /if/ conditions. They are often simple in my code, just some simple arithmetic and record accessors. Stepping over them individually is a pain. It would also be good to step over a function containing a loop when otherwise using /:loglocal/ (so that the loop does not pollute the trace log).
I look forward to playing with this. Your custom ghc tar.gz file contains an additional patch, network.patch, but there are no instructions about it. I assume it is ok to ignore it.
It is a patch from arch linux upstream. I have no idea why it is there. It has whatever license arch linux has. You can probably find out more on the arch-haskell mailing list.
Finally, please do not forget to add a link to this in the GHCi Debugger wiki page at
http://haskell.org/haskellwiki/GHC/GHCi_debugger
and/or at the debugging page at
Hmmm, I do not have wiki account and cannot create it. "Create an account or log in" page has a note: "new account creation has been disabled as an anti-spam measure". So I guess it is up to somebody who has a wiki account :-D Peter.

| Hello Peter,
|
| Your efforts are simply outstanding. Thanks a lot for sharing your
| experiences.
Seconded! Very useful stuff. That said, Simon M and I are not really focused on the debugger right now, so I'm hoping that someone (Pepe or others helping him) can follow up your suggestions.
Simon

Pepe Iborra wrote:
- Regarding your :logLocal, you should rename it to :stepLocal, open a ticket, and attach your patch. We should really try to get this into 6.10.2.
Ach, I missed I'm supposed to do this first time I read the message. I'll get to it at worst during this weekend.
Finally, please do not forget to add a link to this in the GHCi Debugger wiki page at
http://haskell.org/haskellwiki/GHC/GHCi_debugger
and/or at the debugging page at
Ok, I found a note in HWN that Ashley Yakeley can create a wiki account. He kindly did it for me so I updated the second page. Also there does not seem to be a demand for ghciext package so I'm not going to advertise it any more but I'll keep the latest version here (if anybody would be interested): http://www.hck.sk/users/peter/pub/ Peter.

Peter Hercek wrote:
Also there does not seem to be a demand for ghciext package so I'm not
Hi Peter.. just to note that I haven't had the need/time yet to try it, but I'm very thankful for the work you and Pepe are doing to make ghci more powerful. It's a very useful tool for learning about Haskell and figuring out perplexing behaviours in real-world programs.

Pepe Iborra wrote:
- Regarding your :logLocal, you should rename it to :stepLocal, open a ticket, and attach your patch. We should really try to get this into 6.10.2.
Peter Hercek wrote:
Ach, I missed I'm supposed to do this first time I read the message. I'll get to it at worst during this weekend.
http://hackage.haskell.org/trac/ghc/ticket/3035 It is there with a patch attached. Maybe you can still validate and add it to 6.10.2. Sorry for not noticing that you wanted me to do it. For me it is not important, I already run a custom ghc anyway because of other things so one more patch does not change much. Peter.

Hi Peter, Thanks very much for all the suggestions here. As Simon mentioned, we're not actively working on the debugger, and speaking for myself I don't plan to invest significant effort in it in the near future (too many things to do!). If you felt like working on this yourself, possibly with Pepe, then we'd be happy to support in any way we can. Peter Hercek wrote:
/:next/ is an idea how to implement a kind of step over. That is if by step over you mean something else than steplocal. The non-lazy form of /:next/ forces _result and does a /:step/. The lazy form forces only the top level constructor of _result before the step. Hey, I even had a case when it worked just like I expected. But typically it does not work because of bug #1531. _result is not correctly bound to the result of selected expression in most of the practical cases. This bug is also critical for all the forms of conditional breakpoints. It would be cool if we could specify the condition based on _result or some part of it. The implementation of ghciExt conditional breakpoints would need to be extended to support conditions on _result (in particular the breakpoint would need to be disabled during the condition execution) but that is easy to do. Even more worrying thing about bug #1531 is that it has the milestone set to _|_.
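The difference between the lazy and the non-lazy form of /:next/ comes down to how far _result is evaluated before the step. A minimal sketch of the two degrees of forcing, written as ordinary Haskell rather than as debugger code; forceTop and forceList are hypothetical helper names, and the non-lazy variant is specialised to lists only to keep the example self-contained.

```haskell
-- Lazy form: force only the top-level constructor (weak head normal form).
forceTop :: a -> ()
forceTop x = x `seq` ()

-- Non-lazy form, specialised to lists for illustration: walk the whole
-- spine and force every element to weak head normal form.
forceList :: [a] -> ()
forceList []       = ()
forceList (x : xs) = x `seq` forceList xs
```

With these definitions forceTop (1 : undefined) returns () because only the outermost (:) is demanded, while forceList on the same value would hit the undefined tail. This is the change in evaluation semantics that a forcing-based /:next/ imposes on the debugged program.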
So #1531 is tricky to fix, unfortunately. The implementation of _result is a bit of a hack in the first place. The fundamental problem is that a tick expression looks like this
case tick<n> of _ -> e
where 'e' is not necessarily exactly the same as the expression that was originally inside the tick. We are careful to maintain the property that the tick is evaluated iff the original expression is evaluated, but that's all. _result is bound to e, which may or may not be what you wanted. One way to fix it would be to add extra constraints on what the simplifier can do with tick expressions. I don't like the sound of that because (a) I don't know exactly what restrictions we'd have to add and (b) this amounts to changing the semantics of Core (i.e. changing which transformations are valid). Maybe there's another way to fix it, but I can't think of one right now.
Cheers, Simon

Hi Simon, Simon Marlow wrote:
If you felt like working on this yourself, possibly with Pepe, then we'd be happy to support in any way we can.
Thanks. It may happen though it is not probable. I do not know the code so anything non-trivial is a significant effort and my free weekends and evenings are sparse :-( If I would do anything, should it be posted here, sent to Pepe, or attached to the ticket? Is it a habit to indicate in the ticket that somebody started coding it actually (especially if it takes longer to implement)?
So #1531 is tricky to fix, unfortunately. The implementation of _result is a bit of a hack in the first place. The fundamental problem is that a tick expression looks like this
case tick<n> of _ -> e
where 'e' is not necessarily exactly the same as the expression that was originally inside the tick. We are careful to maintain the property that the tick is evaluated iff the original expression is evaluated, but that's all. _result is bound to e, which may or may not be what you wanted.
One way to fix it would be to add extra constraints on what the simplifier can do with tick expressions. I don't like the sound of that because (a) I don't know exactly what restrictions we'd have to add and (b) this amounts to changing the semantics of Core (i.e. changing which transformations are valid).
Ok, I did not understand this part at all until I skimmed over http://www.haskell.org/~simonmar/papers/ghci-debug.pdf Maybe that paper should be mentioned on the wiki pages about the debugger. Something like: "If you do not understand why ghci debugger is limited in such a strange way read this."
A breakpoint condition on _result: My guess is that in about half of the cases I can just put them on a free variable on some other location just as comfortably. In other cases I'm out of luck :) As far as the /:next/ command goes: Like Pepe indicated, I have no idea how to do it without a working _result and without a dynamic stack. Though a dynamic stack should not be that hard, since how otherwise could the profiler count ticks for cost centers. And a dynamic stack would be great. It would create new options where to store lists of free variables of selected expressions :)
Maybe there's another way to fix it, but I can't think of one right now.
If by simplifier you did not mean straight translation to core, then I assume you wanted to try to just skip over all the optimizations (simplifications?). Was it hard to do it or was the performance impact so bad that it was not worth the addition of a command line switch?
Thanks for reading the post about debugging, now there is at least a chance that it will be better once. Peter.

Hi,
Simon Marlow wrote:
If you felt like working on this yourself, possibly with Pepe, then we'd be happy to support in any way we can. Thanks. It may happen though it is not probable. I do not know the code so anything non-trivial is a significant effort and my free weekends and evenings are sparse :-( If I would do anything, should it be posted here, sent to Pepe, or attached to the ticket? Is it a habit to indicate in the ticket that somebody started coding it actually (especially if it takes longer to implement)?
Peter, it is best if you attach everything to the ticket. If you want to signal that you started coding on a ticket, just take ownership of it.
As far as the /:next/ command goes: Like Pepe indicated, I have no idea how to do it without a working _result and without a dynamic stack. Though a dynamic stack should not be that hard, since how otherwise could the profiler count ticks for cost centers. And a dynamic stack would be great. It would create new options where to store lists of free variables of selected expressions :)
Having (a kind of messy approximation of) a dynamic stack is possible with a variant of the cost center stacks mechanism used for profiling. But the downside is that code and libraries would need to be compiled for debugging. Nevertheless, I believe that having a true dynamic stack would make debugging so much simpler.
Ok, I did not understand this part at all until I skimmed over http://www.haskell.org/~simonmar/papers/ghci-debug.pdf Maybe that paper should be mentioned on the wiki pages about the debugger. Something like: "If you do not understand why ghci debugger is limited in such a strange way read this."
Debugging for lazy functional languages is a hard problem. The GHCi debugger is no panacea. But you are right in that the current state of things can be improved in several ways. However, the Simons have already enough things in their hands; it is up to us to step forward and help. Unfortunately, my time is also very limited, as I am trying to get a degree here. I am happy to support Peter and anyone else who wants to hack on the debugger, and I will continue maintaining the code around :print. But right now I don't think I can find the time to work on the tickets brought up in this discussion. Cheers, pepe

pepe wrote:
Having (a kind of messy approximation of) a dynamic stack is possible with a variant of the cost center stacks mechanism used for profiling. But the downside is that code and libraries would need to be compiled for debugging.
Is there any info somewhere why the approximation of the dynamic stack needs libraries to be recompiled for debugging? I thought about it but I could not figure out why it would be needed. Here is what I thought is the way it works:
* the ticks only inform about the approximate start of the selected expression; this is acceptable provided it makes it much easier to implement
* the number of items (continuations) on the return stack from the beginning of /case tick<n> of {_->e}/ to the moment when we can check the count of items in the return stack inside /tick<n>/ is constant and known for a given runtime version of ghc
Provided the above is true then we can find out the number of items on the return stack which was valid just before /case tick<n> of {_->e}/ was entered. Let's mark this number as tick_stack_size<n>. Then the only thing we need to build the approximation of the dynamic stack is to get a callback from the runtime just before the number of items in the return stack gets below tick_stack_size<n> for the corresponding /case tick<n> of {_->e}/ expression. That is the moment of "step out" from the selected expression and that is the moment when we can pop an item from our dynamic stack approximation. (Each entering of /tick<n>/ pushes one item to the dynamic stack approximation.)
All the requirements to implement the above way seem to be easy to do and they do not look like having too bad speed consequences. Just one indirect memory access and a conditional jump more for each pop of a continuation address from the return stack. And the most important thing is that it does not matter whether a library you use is "strobed" with ticks or not. If a library is not "strobed" it would just look like one item in the approximation of the dynamic stack. If a library is not interpreted (it is not being debugged) we do not want to be bugged with its stack frames anyway ... probably. It looks to me better this way without any experience with it yet.
Some of the conditional jumps would happen and would result in more work (maintaining the approximation of the dynamic stack), but all non-tagged value accesses would not, as well as all expressions which are not annotated with ticks (like e.g. list creation). Anyway, since the libs would need to be compiled for debugging, something in the above is wrong. I would like to know what is wrong, or to get a pointer to some web page or paper which describes how the approximation of the dynamic stack works for the profiler. I cannot think of another way the profiler dynamic stack approximation would work :-/ Thanks, Peter.
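The push/pop bookkeeping proposed in this message can be simulated in a few lines. This is only a sketch under the message's own assumption that the return stack size just before entering a tick (tick_stack_size<n>) is known; the Event type, the callback shape and the names step and run are hypothetical and do not correspond to anything in the GHC runtime.

```haskell
type TickId = Int

-- Notifications the runtime would have to deliver (hypothetical).
data Event
  = EnterTick TickId Int  -- tick entered; the Int is tick_stack_size<n>,
                          -- the return stack size just before the tick
  | StackSize Int         -- the return stack now holds this many items
  deriving (Eq, Show)

-- One frame of the approximate dynamic stack: (tick, tick_stack_size).
type Frame = (TickId, Int)

-- Entering a tick pushes a frame. When the return stack shrinks below a
-- frame's recorded size, that expression has been "stepped out" of, so
-- the frame (and any such frames above it) is popped.
step :: [Frame] -> Event -> [Frame]
step frames (EnterTick n size) = (n, size) : frames
step frames (StackSize d)      = dropWhile (\(_, size) -> d < size) frames

-- Replay a sequence of events, innermost frame first in the result.
run :: [Event] -> [Frame]
run = foldl step []
```

For example, run [EnterTick 1 0, StackSize 2, EnterTick 2 2, StackSize 3, StackSize 1] leaves only the frame for tick 1, because the final shrink to size 1 falls below the size recorded for tick 2.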

Peter Hercek wrote:
pepe wrote:
Having (a kind of messy approximation of) a dynamic stack is possible with a variant of the cost center stacks mechanism used for profiling. But the downside is that code and libraries would need to be compiled for debugging.
Is there any info somewhere why the approximation of the dynamic stack needs libraries to be recompiled for debugging? I thought about it but I could not figure out why it would be needed. Here is what I thought is the way it works:
I have the feeling that pepe is talking about *lexical* call stacks, rather than *dynamic* call stacks. Cost-centre-stacks try to give you the lexical call stack (but sadly don't always work properly, and as I've said before we don't fully understand how to do it, or indeed whether it can be done at all...). It probably *would* require recompiling the libraries, though. Perhaps you're already aware of this wiki page, but I'll post the link anyway: http://hackage.haskell.org/trac/ghc/wiki/ExplicitCallStack The dynamic call stack is already present, in the form of the runtime execution stack. For debugging you might want to track more information than we store on this stack, however. You seem to have a plan for maintaining a dynamic stack for debugging, perhaps you could flesh out the details in a wiki page, mainly to ensure that we're discussing the same thing? Cheers, Simon

On 17/02/2009, at 9:46, Simon Marlow wrote:
Peter Hercek wrote:
pepe wrote:
Having (a kind of messy approximation of) a dynamic stack is possible with a variant of the cost center stacks mechanism used for profiling. But the downside is that code and libraries would need to be compiled for debugging. Is there any info somewhere why the approximation of the dynamic stack needs libraries to be recompiled for debugging? I thought about it but I could not figure out why it would be needed. Here is what I thought is the way it works:
I have the feeling that pepe is talking about *lexical* call stacks, rather than *dynamic* call stacks. Cost-centre-stacks try to give you the lexical call stack (but sadly don't always work properly, and as I've said before we don't fully understand how to do it, or indeed whether it can be done at all...). It probably *would* require recompiling the libraries, though.
Yes, I was meaning lexical call stacks, as Simon suggests. Apologies for the confusion, Peter. pepe

Simon Marlow wrote:
Perhaps you're already aware of this wiki page, but I'll post the link anyway:
I was writing about a way how to maintain the stack as described in point 6 of the page (provided that point is about dynamic stack). The point only says it would be fine to have stack without hints how to do it.
The dynamic call stack is already present, in the form of the runtime execution stack. For debugging you might want to track more information than we store on this stack, however.
Does GHC have the same stack for both return addresses and arguments or are they separated? I assumed separated but I'm in doubt now. Do you have enough (debug) information there already to at least match arguments to function calls? My point is that having an exact stack is probably better if it is not too hard to do. On the other hand, if there is not enough debug information already present, it may be easier to maintain an approximate debugging stack because most of the information needed for it is already in there. As I already said in other emails, I would rather choose the dynamic stack over the lexical one if I was forced to choose only one of them. Actually, I almost do not care about the lexical stack and still do not understand why people want it. Even for profiling it looks fishy because at least in some cases it behaves like a dynamic stack (time is attributed where the expression is forced, not where the expression looks to be in the lexical stack).
You seem to have a plan for maintaining a dynamic stack for debugging, perhaps you could flesh out the details in a wiki page, mainly to ensure that we're discussing the same thing?
Sure, but the plan to maintain an approximate debugging dynamic stack depends on one thing: The number of items (continuations) on the return stack from the beginning of /case tick<n> of {_->e}/ to the moment when we can check the count of items in the return stack inside /tick<n>/ is constant and known for a given runtime version of ghc. Or variable but known for each call individually. This is important to find out the number of return addresses on the return stack just before the execution of /case tick<n> of {_->e}/. This looks achievable to me, but maybe it is not. Do you think the condition can be satisfied without too much work? If yes, I'll go on to write the page. If not it would be waste of time. Thanks, Peter.

Peter Hercek wrote:
I was writing about a way how to maintain the stack as described in point 6 of the page (provided that point is about dynamic stack).
The whole page (including point 6) is about explicitly maintaining a (simulated) lexical call stack, not the dynamic one.
As I already said in other emails, I would rather choose dynamic stack over lexical one if I was forced to choose only one of them. Actually, I almost do not care about lexical stack and still do not understand why people want it.
In a lazy language, the dynamic stack rarely tells you anything of interest for debugging. For the value at the top of the stack, you get one of many possible _demand_ chains, rather than the creation chain. The demanding location is pretty-much guaranteed not to be the site of the bug. But you can think of the lexical call stack as what _would_ have been the dynamic call stack, if only the language were completely strict rather than lazy. Most people find the latter notion more intuitive for the purposes of finding program errors.
Sure, but the plan to maintain an approximate debugging dynamic stack depends on one thing:
There is no need to approximate the dynamic stack. It is directly available to the RTS, in full detail. Regards, Malcolm

Malcolm Wallace wrote:
In a lazy language, the dynamic stack rarely tells you anything of interest for debugging. For the value at the top of the stack, you get one of many possible _demand_ chains, rather than the creation chain. The demanding location is pretty-much guaranteed not to be the site of the bug.
But you can think of the lexical call stack as what _would_ have been the dynamic call stack, if only the language were completely strict rather than lazy. Most people find the latter notion more intuitive for the purposes of finding program errors.
OK, maybe I understand it. If the lexical stack would give me access to local variables for all its frames it would be probably better. In the current situation where I have only access to the free vars in the current expression it is not that useful. I mean for my code I know what is the creation chain. This may be different if I would debug somebody else's code. But when debugging my code I sometimes lose track what demand chain I'm in or why the hell I'm at the given location at all. Dynamic stack would help here a lot and it would help me to better understand lazy behavior of my code. The creation behavior is rather clear to me because it is explicit in the code. The lazy behavior may be more tough because it is implicit.
Sure, but the plan to maintain an approximate debugging dynamic stack depends on one thing:
There is no need to approximate the dynamic stack. It is directly available to the RTS, in full detail.
Well, but this would be the exact stack. It would be great to see how ghci works but I'm not sure how helpful it would be for debugging. I'm afraid it would have the same problem as the _result binding (bug #1531). In my code _result is mostly wrong. I'm not even checking it out any more. Thanks, Peter.

Peter Hercek wrote:
Simon Marlow wrote:
You seem to have a plan for maintaining a dynamic stack for debugging, perhaps you could flesh out the details in a wiki page, mainly to ensure that we're discussing the same thing?
Sure, but the plan to maintain an approximate debugging dynamic stack depends on one thing: The number of items (continuations) on the return stack from the beginning of /case tick<n> of {_->e}/ to the moment when we can check the count of items in the return stack inside /tick<n>/ is constant and known for a given runtime version of ghc. Or variable but known for each call individually. This is important to find out the number of return addresses on the return stack just before the execution of /case tick<n> of {_->e}/.
I don't fully understand what it is you mean. e.g. I don't know what "from the beginning of /case tick<n> of {_->e}/" means.
Let me try to explain a couple of things that might (or might not!) help clarify. We don't normally see
case tick<n> of { _ -> e }
because the byte-code generator turns this into
let z = case tick<n> of { _ -> e } in z
the debugger paper explains why we do this. Anyway, the byte code for the closure for z does this:
- if the breakpoint at <n> is enabled then stop,
- otherwise, evaluate e
i.e. it doesn't push any stack frames.
Does that help frame your question?
Cheers, Simon
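The two-way behaviour of the closure for z described in this message can be written down as a tiny model. It is a sketch only: Outcome, BreakId and enterTicked are hypothetical names, the set of enabled breakpoints is a plain list, and nothing here reflects how the byte-code interpreter is actually implemented.

```haskell
type BreakId = Int

-- What happens when the closure for z is entered (hypothetical model).
data Outcome a
  = Stopped BreakId  -- the breakpoint is enabled, so execution stops here
  | Value a          -- otherwise the wrapped expression e is evaluated
  deriving (Eq, Show)

-- Entering the ticked closure: check the breakpoint, then fall through
-- to e. No extra stack frame is modelled, matching the description.
enterTicked :: [BreakId] -> BreakId -> a -> Outcome a
enterTicked enabled n e
  | n `elem` enabled = Stopped n
  | otherwise        = Value e
```

Because e is only returned and never inspected in the Stopped case, laziness means the wrapped expression is not evaluated when the breakpoint fires.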

Simon Marlow wrote:
Peter Hercek wrote:
Sure, but the plan to maintain an approximate debugging dynamic stack depends on one thing: The number of items (continuations) on the return stack from the beginning of /case tick<n> of {_->e}/ to the moment when we can check the count of items in the return stack inside /tick<n>/ is constant and known for a given runtime version of ghc. Or variable but known for each call individually. This is important to find out the number of return addresses on the return stack just before the execution of /case tick<n> of {_->e}/.
I don't fully understand what it is you mean. e.g. I don't know what "from the beginning of /case tick<n> of {_->e}/" means.
Let me try to explain a couple of things that might (or might not!) help clarify. We don't normally see
case tick<n> of { _ -> e }
because the byte-code generator turns this into
let z = case tick<n> of { _ -> e } in z
the debugger paper explains why we do this. Anyway, the byte code for the closure for z does this:
- if the breakpoint at <n> is enabled then stop, - otherwise, evaluate e
i.e. it doesn't push any stack frames.
Does that help frame your question?
I reread the paper and together with this it actually answered my question. I thought that tick<n> represents a call to the debugger. But it is only a byte code which is checked by the interpreter, and if the debugging location is enabled then the interpreter breaks. Also I have found out that what I originally intended would not work because the interpreter can break only at the beginning of a BCO. But maybe there are other almost as good solutions. I'm aiming at these features:
* :next command with the meaning: Break at the next source span which has the same or smaller number of addresses on the return stack and which is not a subset of the current source span.
* :stepout command with the meaning: Break at the next source span which has smaller number of addresses on the return stack.
* build dynamic stack (more or less entirely in GHCi so if it is not used it should not slow down GHC code interpretation; well doing it in GHCi would mean it would be painfully slow because of thread switches but if it proves useful it can be moved to GHC)
* ... and maybe also something like the last one (or the last few) frames of lexical stack for the current source span; this one may not be even that interesting if options to filter trace log are fine enough; the problem with trace log is anything which is executed in a loop (so e.g. even some stupid lambda in a map call)
I'm not aiming at full lexical stack. This is not a promise I'll implement the above things. I would like just some more information:
* what would be a good way (if any) to implement a new kind of break point: Break at any tick (source span) if number of addresses on the return stack is less than given number. Actually the ability to count number of return addresses (although useful) is not that important. It is important to find out whether the current return stack has more, less, or the same number of return addresses than it had in a given moment in past. Any good paper / web page where I could find how the return stack is actually implemented?
* any good paper / web page where I can find how GHC profiler derives lexical call stack approximation?
Thanks, Peter.
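The break conditions proposed for :next and :stepout can be stated as two predicates. This is a minimal sketch, assuming a source span is a pair of character offsets within one file and that the return stack depth is observable at every tick; Span, Location, shouldBreakNext and shouldBreakStepOut are hypothetical names, not GHCi API.

```haskell
-- A source span as start/end offsets in a single file (assumption).
data Span = Span { spanStart :: Int, spanEnd :: Int }
  deriving (Eq, Show)

-- True when the first span lies entirely inside the second.
isSubsetOf :: Span -> Span -> Bool
a `isSubsetOf` b = spanStart a >= spanStart b && spanEnd a <= spanEnd b

-- A tick location together with the return stack depth observed there.
data Location = Location { locSpan :: Span, locDepth :: Int }
  deriving (Eq, Show)

-- :next  Break at a span with the same or smaller return stack depth
--        that is not a subset of the current span.
shouldBreakNext :: Location -> Location -> Bool
shouldBreakNext cur cand =
  locDepth cand <= locDepth cur
    && not (locSpan cand `isSubsetOf` locSpan cur)

-- :stepout  Break at a span with a strictly smaller return stack depth.
shouldBreakStepOut :: Location -> Location -> Bool
shouldBreakStepOut cur cand = locDepth cand < locDepth cur
```

In both predicates the first argument is the location where the command was issued and the second is the candidate tick being considered. Only a comparison of depths is needed, not an absolute count, which fits the remark that knowing whether the stack has grown or shrunk matters more than counting return addresses.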
participants (7): Malcolm Wallace, pepe, Pepe Iborra, Peter Hercek, Simon Marlow, Simon Michael, Simon Peyton-Jones