Hi users of ghci debugger,
This post is going to be a bit longer. Here are a few cookies to motivate
you to go on:
* you will probably like it more than printf debugging for a lot of
cases
* it can provide a way of implementing Claus Reinke's idea of
breakpoints with a list of identifiers which should be available at
breakpoint location and doing it without recompilation; here is the
link to Claus' message:
http://permalink.gmane.org/gmane.comp.lang.haskell.glasgow.user/15900
* it gives the ghc team some idea of the importance of the ghci debugger
related tickets (and whether to implement them just the way they were
proposed)
A note to ghc developers: Stepping a program in the ghci debugger
sometimes purges top level bindings and sometimes does not. I am not
sure whether this is a bug or a feature (whether there is some logic in
it). I do not have a simple example where it purges the bindings, but I
did not really look for one.
I would probably have posted later, but some ghci bugs and missing
features are badly limiting my progress. There is not much more I can
think of to investigate. Maybe somebody will have better ideas how to
solve the obstacles I'm hitting. I'm also posting in the hope that
people will find this interesting and that the ghc team will fix some of
the critical bugs and add the most critical features, especially if
nobody has better debugging tips.
You can get my extensions here:
http://www.hck.sk/users/peter/pub/ghciext-0.1.tar.gz
The extensions are no longer in a single .ghci file. The new .ghci file
needs to install a library. The reason is that without the library it is
not manageable any more.
And here are arch linux packaging instructions for my custom ghc (if you
are an arch linux user just download and run makepkg :) ):
http://www.hck.sk/users/peter/pub/ghc-custom-6.10.1.tar.gz
The custom ghc is just the same one as ghc 6.10.1 with two more
patches: t2740.patch and loglocal.patch. The first one fixes ticket
2740 and you will get it in 6.10.2. The second one adds :loglocal
command to ghci. You can extract the patches from the tar file.
If you have already read the ghci scripting tutorial from Claus Reinke
then you will know how to customize ghciext (that is, if you feel like
doing so). The tutorial is here:
http://www.haskell.org/pipermail/haskell-cafe/2007-September/032260.html
Here is the list of custom commands in ghciext package:
:defs                      -- list user-defined commands
:. <file>                  -- source commands from <file>
:redir <var> <cmd>...      -- execute <cmd> redirecting stdout to <var>
:grep <pat> <cmd>...       -- filter lines matching <pat> from <cmd> output
:* <count> <cmd>...        -- run <cmd> <count> times
:x <cmd>...                -- run <cmd> with stdout suppressed
:bp <bpArgs>               -- put breakpoint at location <bpArgs> (adds hit count)
:inject <cc> <c> <sc> <b>  -- at location <b> execute <c> if <cc>, and stop if <sc>
:monitor ["<c>"] <vs> <b>  -- show comma separated variables at location <b> if <c>
:watch <cond> <bpArgs>     -- break at location <bpArgs> when <cond> is True
:count (_|<N>) [<bpArgs>]  -- count/query/stop execution at location <bpArgs>
:find <var> <cmd>...       -- step with <cmd> until <var> is found
:findex <str> <cmd>...     -- step with <cmd> until <str> is found
:next [lazy]               -- step over; lazy version forces only the top constructor
:xout                      -- redirect ghci stdout to /dev/null
:out                       -- redirect ghci stdout back to console
:defs, :., :redir, :grep, are the same as the commands in Claus'
tutorial. The only differences I recall now are:
* grep pattern can be in quotation marks (easier search for spaces)
* grep merges :browse output more nicely
* redir can accept :step, :steplocal etc; i.e. also the commands which
sometimes remove top level bindings
* the commands do not pollute top level bindings so much
The rest will describe my custom commands and how they relate to
tickets in the ghc trac. If you want to check the tickets mentioned here
then the easiest way is probably selecting them from this list:
http://hackage.haskell.org/trac/ghc/search?q=phercek&noquickjump=1&ticket=on
The initial big problem was how to limit the amount of information the
ghci debugger spits at you when breakpoints with custom scripts are
used. This is also mentioned in the fourth point of the "unknown"
section of ticket #1377:
We can disable a breakpoint with ":set stop N :continue", but this
still prints out the breakpoint info when we stop.
Should we print the info only if there were no commands?
So I say: yes, do it! Just disable any ghci output related to a
breakpoint when the breakpoint has custom code attached. We can
disable the output ourselves, but then we disable all the output
(including the output of the debugged program). People who are
debugging console applications are out of luck (especially if the
applications are interactive). This is not an issue for me since my
application hardly uses the console at all. I'm solving the problem
by prefixing commands like :main and :continue with my command :x.
This disables output by default, and ghciExt enables it just before
breakpoint code is run. If the breakpoint continues execution, it
disables the output again. If the debugged function finishes, the
output is re-enabled by :x itself. A small problem happens when you
forget to use :x, e.g. with your :main. Then you do not get a prompt
when the program finishes. That is the time to notice it and use :out
to switch the output back on. This is the only minor disadvantage for
debugging a gui application. :xout is not that useful (it is mostly
used internally by :x). It could be a hidden command (i.e. not
registered in :defs).
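To make the muting idea concrete, here is a minimal sketch of running an
action with stdout pointed at /dev/null and restoring it afterwards. It is
not the ghciext code; withStdoutMuted is a hypothetical name, and the sketch
assumes a Unix-like system where /dev/null exists.

```haskell
import Control.Exception (bracket)
import GHC.IO.Handle (hDuplicate, hDuplicateTo)
import System.IO (Handle, IOMode (WriteMode), hClose, hFlush, openFile, stdout)

-- Hypothetical helper (not part of ghciext): run an action with stdout
-- pointed at /dev/null, then restore the original stdout, even if the
-- action throws.
withStdoutMuted :: IO a -> IO a
withStdoutMuted action = bracket mute unmute (const action)
  where
    mute :: IO Handle
    mute = do
      hFlush stdout                       -- emit pending console output first
      saved <- hDuplicate stdout          -- keep a copy of the console handle
      devNull <- openFile "/dev/null" WriteMode
      hDuplicateTo devNull stdout         -- stdout now writes to /dev/null
      hClose devNull
      return saved

    unmute :: Handle -> IO ()
    unmute saved = do
      hFlush stdout                       -- flush muted output to /dev/null
      hDuplicateTo saved stdout           -- stdout writes to the console again
      hClose saved
```

Something like withStdoutMuted (putStrLn "hidden") prints nothing, while a
putStrLn issued after it shows up on the console as usual.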
Now when we have ghci "muted" we can get to real goodies: :monitor,
:watch, and :count.
If you like printf debugging, then :monitor can help you. It
can monitor only free variables in the selected expression (on which a
breakpoint is hit) but that was always enough for me during debug
sessions I had. Write a script file (let's say it is named ghciInit;
I'll also call it that later on in this text) like this example:
:monitor var1,var2 ModuleName 23
:monitor var3 ModuleName 40
:x :main mainArgument
Then open ghci with your program and run :. ghciInit
and you will get a nice log like this:
(0): var1 = <value01>
var2 = <value02>
(1): var3 = <value03>
(0): var1 = <value11>
var2 = <value12>
(1): var3 = <value13>
... etc
Moreover, :monitor allows a condition in quotation marks to be
specified as the first argument. If you do that, the variable values
will be printed only when the condition is True. Of course the
condition can contain only free variables of the selected expression.
Mostly this is not a problem.
:watch is a conditional breakpoint which stops when the
specified condition is True.
:count has 3 forms:
:count _ ModuleName 23
This never breaks; it just counts the number of times execution reached
position ModuleName 23
:count 5 ModuleName 23
This breaks when we reach position ModuleName 23 for the fifth time
:count 0
This tells how many times breakpoint number 0 was hit (it can report
the number of hits for any breakpoint created with :bp, :inject,
:monitor, :watch, and :count).
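The behaviour of the counting and stopping forms can be modelled in a few
lines. This is only a sketch of the semantics described above, not the
ghciext implementation; CountSpec, onHit and firstStop are hypothetical names.

```haskell
-- Hypothetical model of the :count forms; not ghciext's actual code.
data CountSpec
  = CountOnly      -- ":count _ <bpArgs>": count hits, never stop
  | StopAt Int     -- ":count <N> <bpArgs>": stop on the N-th hit
  deriving (Eq, Show)

-- Given the spec and the number of hits so far, return the new hit
-- count and whether execution should stop at this hit.
onHit :: CountSpec -> Int -> (Int, Bool)
onHit spec hitsSoFar = (hits, shouldStop)
  where
    hits = hitsSoFar + 1
    shouldStop = case spec of
      CountOnly -> False
      StopAt n  -> hits == n

-- Replay a number of hits and report at which hit (if any) we stop.
firstStop :: CountSpec -> Int -> Maybe Int
firstStop spec totalHits = go 0
  where
    go hitsSoFar
      | hitsSoFar >= totalHits = Nothing
      | otherwise =
          case onHit spec hitsSoFar of
            (hits, True)  -> Just hits
            (hits, False) -> go hits
```

In ghci, firstStop (StopAt 5) 10 gives Just 5, while firstStop CountOnly 10
gives Nothing because the first form only counts.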
The first form of :count is interesting when you want to stop
before something bad happens so that you can see why the bad
thing happened. Put the first form of :count at the start of
the function with the bug and then put a break in the function which is
hit when the bad thing happens, maybe with :watch. When you stop
at the bug place, check the hit count at the start of the function. Add
the hit count to your :count breakpoint in your ghciInit
file, restart, and you can use :steplocal or :loglocal to
find out what went wrong.
:loglocal is implemented directly in the ghc source code. It does
exactly the same as :steplocal but makes sure that the trace
history does not contain anything outside of the function we are
stepping through. The problem with :steplocal is that it works as if
the code were traced while it is executing. Mostly the result
is that your trace history is loaded with crap from outside of the scope
you are interested in. I'll return to :loglocal again later.
:inject is there for when you need something special (:monitor,
:watch, and :count are implemented with something very
much like :inject), e.g. when you want to monitor a value
but do not want the associated breakpoint number printed.
:find and :findex are there primarily to search the trace
history.
:find var1 :back
will find the variable var1 in your trace history by back stepping it
:find var1 :step
will single step forward till variable var1 is in the list of free
variables
:findex BL/Logic.hs:23 :loglocal
will fill your trace history with all the local breakpoints until
location BL/Logic.hs:23 is hit. Having the trace history filled in with
the right stuff is useful for checking out why you got bad results
later.
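The searches above can be pictured as a walk over history entries. The sketch
below is a simplified model, not how ghciext does it (ghciext drives ghci
commands such as :back and :step). TraceEntry, findBack and findLocation are
hypothetical names, and matching <str> against the location is an assumption
made for illustration.

```haskell
import Data.List (find, isInfixOf)

-- Hypothetical record of one trace history entry: a location plus the
-- names of the free variables visible there.
data TraceEntry = TraceEntry
  { entryLocation :: String
  , entryVars     :: [String]
  } deriving (Eq, Show)

-- Walk the history (newest entry first) and return the first entry in
-- which the variable is visible, roughly where ":find <var> :back" stops.
findBack :: String -> [TraceEntry] -> Maybe TraceEntry
findBack var = find ((var `elem`) . entryVars)

-- Assumed analogue of :findex: the first entry whose location text
-- contains the given string.
findLocation :: String -> [TraceEntry] -> Maybe TraceEntry
findLocation str = find ((str `isInfixOf`) . entryLocation)
```

The entries would come from whatever the debugger recorded; here they are
plain values so the search itself can be tried in isolation.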
Now let's return to Claus' idea of breakpoints with a list of
identifiers which should be available at the breakpoint location. You
can make sure the identifiers are available with the first form of
:count. It never stops but it puts records into the trace history. So
the trace history will contain the free variables at the locations where
you put :count. This will be even more useful when automatic search of
the trace history is built in (see ticket #2737). So a weaker form of
Claus' idea can be implemented by carefully selecting what should get
into the trace history. Why only a weaker form? Well, in some cases, the
variable instances in the trace log may not be the expected ones (they
may be from a different lexical scope). Experience with my code
indicates this should be rare.
:next is an idea of how to implement a kind of step over. That is,
if by step over you mean something other than :steplocal. The
non-lazy form of :next forces _result and does a :step. The lazy form
forces only the top level constructor of _result before the step. Hey,
I even had a case when it worked just like I expected. But typically it
does not work because of bug #1531. _result is not correctly bound to
the result of the selected expression in most practical cases. This
bug is also critical for all the forms of conditional breakpoints. It
would be cool if we could specify the condition based on _result or
some part of it. The implementation of ghciExt conditional breakpoints
would need to be extended to support conditions on _result (in
particular the breakpoint would need to be disabled during the
condition execution) but that is easy to do. An even more worrying thing
about bug #1531 is that it has the milestone set to _|_.
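The difference between forcing only the top level constructor and forcing a
whole value can be shown in plain Haskell. This is an illustration of the two
evaluation depths only, not of how :next is implemented; forceTop, forceAll,
throws and partial are hypothetical names.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Force only the top-level constructor (weak head normal form).
forceTop :: [Int] -> ()
forceTop xs = xs `seq` ()

-- Force the whole list: the spine and every element.
forceAll :: [Int] -> ()
forceAll = foldr seq ()

-- True if evaluating the value raised an exception.
throws :: () -> IO Bool
throws v = do
  r <- try (evaluate v) :: IO (Either SomeException ())
  return (either (const True) (const False) r)

-- A list whose first cell is fine but whose tail is undefined.
partial :: [Int]
partial = 1 : undefined
```

Here throws (forceTop partial) returns False because only the first cons cell
is inspected, while throws (forceAll partial) returns True because the
undefined tail gets evaluated.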
It is easy to add :enable and :disable to support
enabling and disabling breakpoints. I just have not needed them yet.
Here is what a GhciExt breakpoint looks like:
*Main> :show breaks
[0] Main a.hs:4:2-8 ":cmd return$GhciExt.getStopCode 0 (True) "putStr
\"(0): \"\n:force x" "False""
*Main>
Just replace getStopCode with getDisabledStopCode and you have it
disabled. Change it back to enable it again. Yeah, and implement
getDisabledStopCode, which will just continue.
I added :loglocal mostly to simulate how :tracelocal in
ticket #2737 would help. I was also testing how much full tracing helps.
In both cases the answer is: full tracing almost never helps.
:tracelocal from ticket #2737 as originally proposed would rarely help.
The problem is that the trace log gets overwhelmed with crap when we
cannot control what can be saved in it and what cannot. My idea is that
the user should be able to specify what can go in it and also what
should not go in it. Here is an alternative solution to the ones I
proposed in tickets #2737 and #2946. I think this one would be best. The
command to control the tracing should look like:
-- should everything be traced?
:set trace (True|False)
-- scopes which should be traced (or should not be traced when ! is present)
:set trace ( (!)? scopeid )*
-- add/remove individual scopeids to/from the trace specification
:set trace (+|-) (!)? scopeid
where scopeid = ( conid . )* ( varid . )* varid
Notice how scopeid looks. It can have a sequence of varids at the end.
The reason is so that the user can leave out the scope of a function
which is defined in a where clause. The scope specification is similar
to the proposal in ticket #3000. E.g. for this code:
fn s = 'a' : add s
where add = (++"z")
it could look like
:set trace Main.fn !Main.fn.add
meaning trace whole scope of fn but not the stuff in the scope
of add.
Order should not be important, requests for not tracing should have
precedence before requests to trace.
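The precedence rule can be sketched as a small predicate. This is a model of
the proposal only; nothing like it exists in ghci. ScopeRule, parseRule and
isTraced are hypothetical names, and treating a rule as covering every nested
scope (prefix matching) is an assumption. Operator names containing dots are
not handled.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical model of the proposed ":set trace" scope list.
type ScopeId = [String]            -- "Main.fn.add" is ["Main","fn","add"]

data ScopeRule = Include ScopeId | Exclude ScopeId
  deriving (Eq, Show)

-- Split "Main.fn.add" on dots.
splitDots :: String -> [String]
splitDots s = case break (== '.') s of
  (part, [])       -> [part]
  (part, _ : rest) -> part : splitDots rest

-- Parse one word of the specification; a leading '!' means "do not trace".
parseRule :: String -> ScopeRule
parseRule ('!' : scope) = Exclude (splitDots scope)
parseRule scope         = Include (splitDots scope)

-- A scope is traced when some Include rule covers it and no Exclude rule
-- does. Exclusion wins regardless of the order of the rules.
isTraced :: [ScopeRule] -> ScopeId -> Bool
isTraced rules scope = included && not excluded
  where
    included = or [r `isPrefixOf` scope | Include r <- rules]
    excluded = or [r `isPrefixOf` scope | Exclude r <- rules]
```

With rules = map parseRule (words "Main.fn !Main.fn.add"), the scope
["Main","fn"] is traced and ["Main","fn","add"] is not, and reversing the
order of the two words gives the same answers.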
The scopes which we typically want to exclude are the ones which contain
loops. The loop content often fills the trace log, forcing
the interesting stuff out of it. It is better to investigate functions
with loops separately in a nested context.
Notice that there is a slight difference between this proposal for
controlling trace content and the one in ticket #2737. #2737 assumes
usage of breakpoint arguments to specify a scope. The breakpoint
arguments give the ability to define scopes at a finer level but there
is no option to define excluded scopes, which I now find important.
The summary is: The trace log is only as useful as your control over
what can get into it. The :trace command looks like a mistake to me. It
is better to control tracing by allowing/disallowing scopes.
I also changed my opinion a bit about ticket #2945. :mergetrace would
be better than global trace history. Being able to investigate
something separately in a nested context is useful.
If I were to order the ghci debugger related tickets, the order would be
as follows (more important first):
#1531 (_result can get bound to the wrong value in a breakpoint)
#2737 and #2946 (add :tracelocal to ghci debugger... and add
command :mergetrace...)
#3000 (:break command should recognize also nonexported top level
symbols...)
#2803 (bring full top level of a module in scope when a breakpoint is
hit in the ...)
#1377 (the task: "We should print breakpoint related info only if
breakpoint has no commands set"); people debugging interactive
console applications would like to have this one at the very top; IIRC
this may be easy to do, it looks like all the printing is done in one
function (something like afterCmd???); also #2950 looked trivial
to do (like 15 mins without the compile time???)
And the last thing: my first experiences hacking :loglocal into ghc.
I cannot tell much, I spent only one long Sunday afternoon on it,
but here are my two points:
* I needed to extend ghc interface. The type of function GHC.resume
changed from:
resume :: GhcMonad m => SingleStep -> m RunResult
to:
resume :: GhcMonad m => (SrcSpan->Bool) -> SingleStep -> m RunResult
... plus the corresponding implementation change.
The added argument is a filtering function to limit the source spans
which can be recorded in the trace history.
* It would be cool if ghci had its own dir in the source tree containing
only the ghci source files. It would encourage people to hack on it more
since it would be easier to maintain private patches and to merge
upstream. It would also be easier to make sure one modifies only ghci
source code so that it works with other ghc releases.
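To illustrate the filtering argument added to resume in the first point, here
is a sketch of what such a predicate could look like. It does not use the real
GHC API: Span is a simplified stand-in for SrcSpan, and insideOf and record
are hypothetical names, not functions from the patch.

```haskell
-- Simplified stand-in for GHC's SrcSpan; the real type is richer.
data Span = Span
  { spanFile      :: FilePath
  , spanStartLine :: Int
  , spanEndLine   :: Int
  } deriving (Eq, Show)

-- Build a filter that accepts only spans lying inside the enclosing span,
-- e.g. the span of the function being stepped through.
insideOf :: Span -> (Span -> Bool)
insideOf outer inner =
  spanFile inner == spanFile outer
    && spanStartLine inner >= spanStartLine outer
    && spanEndLine inner <= spanEndLine outer

-- Record a span in the history only when the filter accepts it.
record :: (Span -> Bool) -> Span -> [Span] -> [Span]
record keep s history
  | keep s    = s : history
  | otherwise = history
```

A filter built with insideOf for the current function keeps spans from other
files or from outside the function out of the history, which is the effect
:loglocal is after.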
Hopefully this helps somebody,
Peter.