Hi users of ghci debugger,

This post is going to be rather long. Here are a few cookies to motivate you to read on:
* you will probably like it more than printf debugging in a lot of cases
* it can provide a way of implementing Claus Reinke's idea of breakpoints with a list of identifiers which should be available at the breakpoint location, and of doing it without recompilation; here is the link to Claus' message: http://permalink.gmane.org/gmane.comp.lang.haskell.glasgow.user/15900
* it gives the ghc team some idea about the importance of ghci debugger related tickets (and whether to implement them just the way they were proposed)

A note to ghc developers: Stepping a program in the ghci debugger sometimes purges top level bindings and sometimes does not. I'm not sure whether this is a bug or a feature (whether there is some logic in it). I do not have a simple example where it purges the bindings, but I did not really look for one.

I would probably have posted later, but some ghci bugs and missing features are badly limiting my progress, and there is not much more I can think of to investigate. Maybe somebody will have better ideas on how to get around the obstacles I'm hitting. I'm also posting in the hope that people will find this interesting and that the ghc team will fix some of the critical bugs and add the most critical features, especially if nobody has better debugging tips.

You can get my extensions here:
http://www.hck.sk/users/peter/pub/ghciext-0.1.tar.gz
The extensions are no longer in a single .ghci file. The new .ghci file needs a library to be installed, because without the library the code was not manageable any more.

And here are Arch Linux packaging instructions for my custom ghc (if you are an Arch Linux user, just download and run makepkg :) ):
http://www.hck.sk/users/peter/pub/ghc-custom-6.10.1.tar.gz
The custom ghc is the same as ghc 6.10.1 with two more patches: t2740.patch and loglocal.patch. The first one fixes ticket #2740 and you will get it in 6.10.2. The second one adds the :loglocal command to ghci. You can extract the patches from the tar file.

If you have already read the ghci scripting tutorial from Claus Reinke then you will know how to customize ghciext (that is, if you feel like doing so). The tutorial is here:
http://www.haskell.org/pipermail/haskell-cafe/2007-September/032260.html

Here is the list of custom commands in ghciext package:
:defs                     -- list user-defined commands
:. <file>                 -- source commands from <file>
:redir <var> <cmd>...     -- execute <cmd> redirecting stdout to <var>
:grep <pat> <cmd>...      -- filter lines matching <pat> from <cmd> output
:* <count> <cmd>...       -- run <cmd> <count> times
:x <cmd>...               -- run <cmd> with stdout suppressed
:bp <bpArgs>              -- put breakpoint at location <bpArgs> (adds hit count)
:inject <cc> <c> <sc> <b> -- at location <b> execute <c> if <cc>, and stop if <sc>
:monitor ["<c>"] <vs> <b> -- show comma separated variables at location <b> if <c>
:watch <cond> <bpArgs>    -- break at location <bpArgs> when <cond> is True
:count (_|<N>) [<bpArgs>] -- count/query/stop execution at location <bpArgs>
:find <var> <cmd>...      -- step with <cmd> until <var> is found
:findex <str> <cmd>...    -- step with <cmd> until <str> is found
:next [lazy]              -- step over; lazy version forces only the top constructor
:xout                     -- redirect ghci stdout to /dev/null
:out                      -- redirect ghci stdout back to console

:defs, :., :redir, and :grep are the same as the commands in Claus' tutorial. The only differences I recall now are:
* grep pattern can be in quotation marks (easier search for spaces)
* grep merges :browse output more nicely
* redir can accept :step, :steplocal etc; i.e. also the commands which sometimes remove top level bindings
* the commands do not pollute top level bindings so much

The rest describes my custom commands and how they relate to tickets in the ghc trac. If you want to check the tickets mentioned here, the easiest way is probably to select them from this list:
http://hackage.haskell.org/trac/ghc/search?q=phercek&noquickjump=1&ticket=on

The initial big problem was how to limit the amount of information the ghci debugger spits at you when breakpoints with custom scripts are used. This is also mentioned in the fourth point of the "unknown" section of ticket #1377:
We can disable a breakpoint with ":set stop N :continue", but this still prints out the breakpoint info when we stop. Should we print the info only if there were no commands?
So I say: yes, do it! Just disable any ghci output related to a breakpoint when the breakpoint has custom code attached. We can disable the output ourselves, but then we disable all the output (including the output of the debugged program). People who are debugging console applications are out of luck (especially if the applications are interactive). This is not an issue for me since my application hardly uses the console at all. I'm solving the problem by prefixing commands like :main and :continue with my command :x. This makes output disabled by default, and ghciExt enables it just before breakpoint code is run. If the breakpoint continues execution, it disables the output again. If the debugged function finishes, the output is enabled by :x itself. A small problem happens when you forget to use :x, e.g. with your :main. Then you do not get a prompt when the program finishes. When you notice that, use :out to switch the output back on. This is the only minor disadvantage for debugging a gui application. :xout is not that useful (it is mostly used internally by :x). It could be a hidden command (i.e. not registered in :defs).

Now that we have ghci "muted" we can get to the real goodies: :monitor, :watch, and :count.
If you like printf debugging, then :monitor can help you. It can monitor only free variables in the selected expression (the one on which the breakpoint is hit), but that was always enough for me in the debug sessions I had. Write a script file (let's say it is named ghciInit; I'll also call it that later in this text) which looks like this example:
:monitor var1,var2 ModuleName 23
:monitor var3 ModuleName 40
:x :main mainArgument

Then open ghci with your program and run :. ghciInit
And you will get a nice log like this:
(0): var1 = <value01>
var2 = <value02>
(1): var3 = <value03>
(0): var1 = <value11>
var2 = <value12>
(1): var3 = <value13>
... etc

Moreover, :monitor allows a condition in quotation marks to be specified as the first argument. If you do so, the variable values will be printed only when the condition is True. Of course the condition can contain only free variables of the selected expression. Mostly that is not a problem.
:watch is a conditional breakpoint which stops when the specified condition is True.
:count has 3 forms:
:count _ ModuleName 23
   This never breaks; it just counts the number of times execution reached position ModuleName 23
:count 5 ModuleName 23
   This breaks when we reach position ModuleName 23 for the fifth time
:count 0
   This tells how many times breakpoint number 0 was hit (it can report the number of hits for any breakpoint created with :bp, :inject, :monitor, :watch, and :count).
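
The stopping behaviour of the first two forms can be modelled in a few lines of Haskell. This is only a sketch to illustrate the semantics described above, not the ghciext code; the names CountMode, registerHit and firstStop are made up for the illustration, and it assumes that the second form stops on exactly the N-th hit and on no other.

```haskell
-- Hypothetical model of the first two :count forms; not ghciext's code.
data CountMode
  = CountOnly      -- ":count _ <bpArgs>": never stop, only count
  | StopAtHit Int  -- ":count <N> <bpArgs>": stop on the N-th hit
  deriving (Eq, Show)

-- Register one more hit. Returns the new hit count and whether to stop.
registerHit :: CountMode -> Int -> (Int, Bool)
registerHit mode hitsSoFar = (hits, shouldStop)
  where
    hits = hitsSoFar + 1
    shouldStop = case mode of
      CountOnly   -> False
      StopAtHit n -> hits == n

-- Let execution reach the location totalHits times and report the hit
-- number at which it stopped, if it stopped at all.
firstStop :: CountMode -> Int -> Maybe Int
firstStop mode totalHits = go 0
  where
    go hits
      | hits >= totalHits = Nothing
      | otherwise =
          let (hits', stop) = registerHit mode hits
          in if stop then Just hits' else go hits'
```

In this model the third form, :count <breakpoint number>, would simply read back the stored hit count.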

The first form of :count is interesting when you want to stop before something bad happens so that you can see why the bad thing happened. Put the first form of :count at the start of the function with the bug, and then put a breakpoint (maybe with :watch) in the function at the place which is hit when the bad thing happens. When you stop at the bug location, check the hit count at the start of the function. Put that hit count into a :count breakpoint in your ghciInit file, restart, and you can use :steplocal or :loglocal to find out what went wrong.

:loglocal is implemented directly in the ghc source code. It does exactly the same as :steplocal but makes sure that the trace history does not contain anything outside of the function we are stepping through. The problem is that :steplocal works as if the code were being traced while it executes. Mostly the result is that your trace history is loaded with crap from outside of the scope you are interested in. I'll return to :loglocal later.

:inject is there when you need something special (:monitor, :watch, and :count are implemented with something very like :inject). E.g. when you want to do monitoring of a value but do not want the associated breakpoint number printed.

:find and :findex are there primarily to search the trace history.
:find var1 :back
   will find the variable var1 in your trace history by stepping back through it
:find var1 :step
   will single step forward until variable var1 is in the list of free variables
:findex BL/Logic.hs:23 :loglocal
   will fill your trace history with all the local breakpoints until location BL/Logic.hs:23 is hit. Having the trace history filled with the right stuff is useful for checking why you got bad results later.

Now let's return to Claus' idea of breakpoints with a list of identifiers which should be available at the breakpoint location. You can make sure the identifiers are available with the first form of :count. It never stops but it puts records into the trace history. So the trace history will contain the free variables at the locations where you put :count. This will be even more useful when automatic search of the trace history is built in (see ticket #2737). So a weaker form of Claus' idea can be implemented by carefully selecting what should get into the trace history. Why only a weaker form? Well, in some cases the variable instances in the trace log may not be the expected ones (they may be from a different lexical scope). Experience with my code indicates this should be rare.

:next is an idea of how to implement a kind of step over, that is, if by step over you mean something other than :steplocal. The non-lazy form of :next forces _result and does a :step. The lazy form forces only the top level constructor of _result before the step. Hey, I even had a case where it worked just as I expected. But typically it does not work because of bug #1531: _result is not correctly bound to the result of the selected expression in most practical cases. This bug is also critical for all forms of conditional breakpoints. It would be cool if we could specify the condition based on _result or some part of it. The implementation of ghciExt conditional breakpoints would need to be extended to support conditions on _result (in particular, the breakpoint would need to be disabled while the condition is evaluated) but that is easy to do. An even more worrying thing about bug #1531 is that it has its milestone set to _|_.
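
The difference between the two forms of :next comes down to how much of a value gets evaluated. Here is a minimal sketch in plain Haskell, independent of ghci and of ghciext (the names partial, succeeds, forceTop and forceAll are invented for this illustration): forcing only the top constructor of a value can succeed where forcing the whole value fails.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- A list whose first cons cell is fine but whose tail is a bottom.
partial :: [Int]
partial = 1 : error "tail is not available"

-- Run an action and report whether it finished without an exception.
succeeds :: IO a -> IO Bool
succeeds act = fmap check (try act)
  where
    check :: Either SomeException b -> Bool
    check = either (const False) (const True)

-- Force only the outermost constructor (weak head normal form),
-- which is roughly what the lazy form is described as doing.
forceTop :: [Int] -> IO Bool
forceTop xs = succeeds (evaluate xs)

-- Force the whole spine and every element, as a stand-in for the
-- non-lazy form.
forceAll :: [Int] -> IO Bool
forceAll xs = succeeds (evaluate (foldr seq () xs))
```

With these definitions forceTop partial yields True while forceAll partial yields False, because only the second one reaches the failing tail.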

It would be easy to add :enable and :disable to support enabling and disabling breakpoints. I just have not needed them yet. Here is what a GhciExt breakpoint looks like:
*Main> :show breaks
[0] Main a.hs:4:2-8 ":cmd return$GhciExt.getStopCode 0 (True) "putStr \"(0): \"\n:force x" "False""
*Main>
Just replace getStopCode with getDisabledStopCode and you have it disabled. Change it back to enable it again. Oh, and you would also need to implement getDisabledStopCode, which would just continue.

I added :loglocal mostly to simulate how :tracelocal from ticket #2737 would help. I was also checking how much full tracing helps. In both cases the answer is: full tracing almost never helps. :tracelocal from ticket #2737 as originally proposed would rarely help. The problem is that the trace log gets overwhelmed with crap when we cannot control what is saved in it and what is not. My idea is that the user should be able to specify what can go into it and also what should not. Here is an alternative to the solutions I proposed in tickets #2737 and #2946. I think this one would be best. The command to control the tracing should look like:
-- should everything be traced?
:set trace (True|False)
-- scopes which should be traced (or should not be traced when ! is present)
:set trace ( (!)? scopeid )*
-- add/remove individual scopeids to/from the trace specification
:set trace (+|-) (!)? scopeid
where scopeid = ( conid . )* ( varid . )* varid
Notice how scopeid looks. It can have a sequence of varids at the end. The reason is so that the user can leave out the scope of a function which is defined in a where clause. The scope specification is similar to the proposal in ticket #3000. E.g. for this code:
fn s = 'a' : add s
  where add = (++"z")
it could look like
:set trace Main.fn !Main.fn.add
meaning trace whole scope of fn but not the stuff in the scope of add.
Order should not be important; requests for not tracing should take precedence over requests to trace.
The scopes which we typically want to exclude are the ones which contain loops. The loop content often fills the trace log, forcing the interesting stuff out of it. It is better to investigate functions with loops separately in a nested context.
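
To make the intended semantics concrete, here is a small sketch of how such a specification could be interpreted. It is only a model of the proposal, not anything that exists in ghci. The names ScopeId, TraceSpec, parseSpec and shouldTrace are made up, and the sketch assumes that a scope covers everything nested inside it, which it models as a prefix match on the dot-separated components.

```haskell
import Data.List (isPrefixOf)

-- A scope id such as Main.fn.add, split on dots: ["Main","fn","add"].
type ScopeId = [String]

data TraceSpec = TraceSpec
  { included :: [ScopeId]  -- scopes requested for tracing
  , excluded :: [ScopeId]  -- scopes marked with '!'
  }

-- Split "Main.fn.add" into its components.
parseScopeId :: String -> ScopeId
parseScopeId s = case break (== '.') s of
  (part, [])       -> [part]
  (part, _ : rest) -> part : parseScopeId rest

-- Build a spec from arguments such as ["Main.fn", "!Main.fn.add"].
parseSpec :: [String] -> TraceSpec
parseSpec args = TraceSpec
  { included = [parseScopeId a | a <- args, not (isExcl a)]
  , excluded = [parseScopeId (drop 1 a) | a <- args, isExcl a]
  }
  where
    isExcl a = take 1 a == "!"

-- A location is traced when some included scope encloses it and no
-- excluded scope does. Exclusions win regardless of argument order.
shouldTrace :: TraceSpec -> ScopeId -> Bool
shouldTrace spec loc =
  any (`isPrefixOf` loc) (included spec)
    && not (any (`isPrefixOf` loc) (excluded spec))
```

For the example above, parseSpec ["Main.fn", "!Main.fn.add"] accepts a location in Main.fn and rejects one in Main.fn.add, and swapping the two arguments gives the same answers.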

Notice that there is a slight difference between this proposal for controlling trace content and the one in ticket #2737. #2737 assumes usage of breakpoint arguments to specify a scope. The breakpoint arguments give the ability to define scopes at a finer level, but there is no option to define excluded scopes, which I now find important.

The summary is: the trace log is only as useful as your control over what gets into it. The :trace command looks like a mistake to me. It is better to control tracing by allowing/disallowing scopes.

I also changed my opinion a bit about ticket #2945. :mergetrace would be better than global trace history. Being able to investigate something separately in a nested context is useful.

If I had to order the ghci debugger related tickets then the order would be (more important first):
#1531 (_result can get bound to the wrong value in a breakpoint)
#2737 and #2946 (add :tracelocal to ghci debugger... and add command :mergetrace...)
#3000 (:break command should recognize also nonexported top level symbols...)
#2803 (bring full top level of a module in scope when a breakpoint is hit in the ...)
#1377 (the task: "We should print breakpoint related info only if breakpoint has no commands set") but people debugging interactive console applications would like to have this one at the very top; IIRC this may be easy to do, it looks like all the printing is done in one function (something like afterCmd???); also #2950 looked trivial to do (like 15 mins without the compile time???)

And the last thing: my first experiences of hacking :loglocal into ghc. I cannot tell much since I spent only one long Sunday afternoon on it, but here are my two points:
* I needed to extend the ghc interface. The type of the function GHC.resume changed from:
    resume :: GhcMonad m => SingleStep -> m RunResult
    to:
    resume :: GhcMonad m => (SrcSpan->Bool) -> SingleStep -> m RunResult
    ... plus the corresponding implementation change. The added argument is a filtering function to limit the source spans which can be recorded in the trace history.
* It would be cool if ghci had its own dir in the source tree containing only the ghci source files. It would encourage people to hack on it more since it would be easier to maintain private patches and merge upstream changes. It would also be easier to make sure one modifies only ghci source code so that the patch works with other ghc releases.
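
To illustrate the filtering argument from the first point: the sketch below models the idea with a simplified stand-in for SrcSpan, so it does not depend on the ghc API. The type Span and the functions within, localOnly and record are hypothetical and are not taken from the actual patch.

```haskell
-- Simplified stand-in for GHC's SrcSpan, limited to what the sketch needs.
data Span = Span
  { spanFile  :: FilePath
  , startLine :: Int
  , endLine   :: Int
  } deriving (Eq, Show)

-- True when the first span lies inside the second one.
within :: Span -> Span -> Bool
inner `within` outer =
  spanFile inner == spanFile outer
    && startLine inner >= startLine outer
    && endLine inner <= endLine outer

-- A filter accepting only spans inside the function being stepped,
-- i.e. the kind of predicate a :loglocal-like command could pass.
localOnly :: Span -> (Span -> Bool)
localOnly enclosing = (`within` enclosing)

-- Model of recording: only accepted spans reach the trace history.
record :: (Span -> Bool) -> [Span] -> [Span]
record = filter
```

Passing const True as the filter keeps every span, which corresponds to the unfiltered behaviour.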

Hopefully this helps somebody,
Peter.