Debugging partial functions by the rules

So all this talk of locating head [] and fromJust failures got me thinking: Couldn't we just use rewrite rules to rewrite *transparently* all uses of fromJust to safeFromJust, tagging the call site with a location? To work this requires a few things to go right: * a rewrite rule * assertions * and rewrite rules firing before assertions are expanded Let's try this. Consider the program: 1 import qualified Data.Map as M 2 import Data.Maybe 3 4 main = print f 5 6 f = let m = M.fromList 7 [(1,"1") 8 ,(2,"2") 9 ,(3,"3")] 10 s = M.lookup 4 m 11 in fromJust s When we run it we get the not so useful error: $ ./A A: Maybe.fromJust: Nothing Ok, so we have a few tricks for locating this, using LocH (http://www.cse.unsw.edu.au/~dons/loch.html), we can catch an assertion failure, but we have to insert the assertion by hand: 1 import Debug.Trace.Location 2 import qualified Data.Map as M 3 import Data.Maybe 4 5 main = do print f 6 7 f = let m = M.fromList 8 [(1,"1") 9 ,(2,"2") 10 ,(3,"3")] 11 s = M.lookup 4 m 12 in safeFromJust assert s 13 14 safeFromJust a = check a . fromJust Which correctly identifies the call site: $ ./A A: A.hs:12:20-25: Maybe.fromJust: Nothing Now, this approach is a little fragile. 'assert' is only respected by GHC if -O is *not* on, so if we happened to try this trick with -O, we'd get: $ ./A A: Debug.Trace.Location.failure So lesson one: you have to do the bug hunting with -Onot. Currently there's -fignore-asserts for turning off assertions, but no flag for turning them on with -O, Simon, could this be fixed? Could we get a -frespect-asserts that works even with -O ? Ok, assuming this assert trick is used, can we get the compiler to insert the asserts for us? If so, this would be a great advantage, you'd just be able to switch on a flag, or import a debugging module, and your fromJusts would be transparently rewritten. With rewrite rules we do just this! So, to our initial unsafe use of fromJust, we add a rewrite rule: -- -- rewrite fromJust to a located version, and hope that GHC expands -- 'assert' after the rule fires.. -- {-# RULES "located fromJust" fromJust = check assert . myFromJust #-} This just tells the compiler to replace every occurence of fromJust with a assertion-throwing fromJust, should it fail. We have to use myFromJust here, to avoid rule recursion. -- -- Inlined to avoid recursion in the rule: -- myFromJust :: Maybe a -> a myFromJust Nothing = error "Maybe.fromJust: Nothing" -- yuck myFromJust (Just x) = x Ok, so can we get ghc to rewrite fromJust to the safe fromJust magicaly? $ ghc --make -Onot A.hs -fglasgow-exts -ddump-simpl-stats [1 of 1] Compiling Main ( A.hs, A.o ) 1 RuleFired 1 located fromJust Linking A ... Yes, the rule fired! GHC *did* rewrite our fromJust to a more useful fromJust. Running it: $ ./A A: A.hs:19:36-41: Maybe.fromJust: Nothing Looks good! But that is deceiving: the assert was expanded before the rule fired, and refers to the rewrite rule source line (line 19), not the fromJust call site (line 12). Now if we could just have the 'assert' token inserted into the AST before it was expanded, we'd be home and dry. Could this be done with TH? Or could we arrange for asserts in rewrite rules not to be expanded till later? Note that this is still a useful technique, we can rewrite head/fromJust/... to some other possibly more useful message. And if we can constrain the rule to fire in only particular modules, we may be able to narrow down the bug, just by turning on a rule. For example, adding: {-# RULES "located fromJust" fromJust = safeFromJust #-} safeFromJust s = case s of Nothing -> "safeFromJust: failed with Nothing. Ouch" Just x -> x will produce: $ ./A "safeFromJust: failed with Nothing. Ouch" So rewrite rules can be used to transparently alter uses of partial functions like head and fromJust. So, further work: * have 'assert' respected when -O is on * think up a technique for splicing in 'assert' via rewrite rules (or TH ...) such that the src locations are expanded after the rewrite, and correctly reflect the location of the splice point. Any ideas? -- Don

On 11/14/06, Donald Bruce Stewart
So, further work:
* have 'assert' respected when -O is on
* think up a technique for splicing in 'assert' via rewrite rules (or TH ...) such that the src locations are expanded after the rewrite, and correctly reflect the location of the splice point.
Any ideas?
Overall I like what you've said. But when I read it, I wondered, "Why do I have to fix every function which calls error? Why can't we just fix error?" We discussed this a bit in #haskell and it was pointed out that, that would only give us the line number of where error is defined. Someone mentoined stack traces aren't so great in haskell, but Cale sugested we try to get a "cost-center trace" plus line numbers? Even though __FILE__ and __LINE__ are a bit simplistic in C they are quite handy for things like this. $0.02, Jason

Ok, so I took the rule rewriting idea and added a preprocessor instead, that inserts 'assert's for you, currently just for head,tail and fromJust. This program, for example: module Main where import qualified Data.Map as M import Data.Maybe main = do print f f = let m = M.fromList [(1,"1") ,(2,"2") ,(3,"3")] s = M.lookup 4 m :: Maybe String in head ([] :: [()]) Fails with: $ ghc A.hs --make [1 of 1] Compiling Main ( A.hs, A.o ) Linking A ... $ ./A A: Maybe.fromJust: Nothing Add one import statement: import Debug.Trace.Location And recompile with the preprocessor: $ ghc A.hs --make -pgmF loch -F -no-recomp [1 of 1] Compiling Main ( A.hs, A.o ) Linking A ... And our fromJust is rewritten to a located fromJust: $ ./A A: A.hs:14:14-19: Maybe.fromJust: Nothing Currently this only works for fromJust, head and tail. Adding user-specified located functions should be trivial. The 'loch' preprocessor is in the darcs repo, now. http://www.cse.unsw.edu.au/~dons/code/loch/ Comments, suggestions? -- Don

On Wed, Nov 15, 2006 at 03:54:31PM +1100, Donald Bruce Stewart wrote:
To: glasgow-haskell-users@haskell.org Cc: haskell-cafe@haskell.org From: Donald Bruce Stewart
Date: Wed, 15 Nov 2006 15:54:31 +1100 Subject: [Haskell-cafe] Debugging partial functions by the rules So all this talk of locating head [] and fromJust failures got me thinking:
Couldn't we just use rewrite rules to rewrite *transparently* all uses of fromJust to safeFromJust, tagging the call site with a location?
The problem with tagging the call site is that there has to be an explicit location somewhere in the code that is maximally meaningful for the user. I don't think that's the whole story. I have always been wondering why error shouldn't always return a call stack from the occurrance of fromJust up to the function I typed into ghci (or up to main, for that matter). I guess this is because the call tree is rewritten extensively before being evaluated, and the output wouldn't be so easy to match to the source code? But shouldn't the mapping of source tree to executable tree be bijective, at least in theory? How hard would it be to hack call stack output into ghc? -matthias

Hi
I have always been wondering why error shouldn't always return a call stack from the occurrance of fromJust up to the function I typed into ghci (or up to main, for that matter).
yhi-stack (as yet unreleased) does exactly this, without requiring you to recompile your code even. hat-stack also does this. Thanks Neil

| I have always been wondering why error shouldn't always return a call | stack from the occurrance of fromJust up to the function I typed into | ghci (or up to main, for that matter). | | I guess this is because the call tree is rewritten extensively before | being evaluated, and the output wouldn't be so easy to match to the | source code? But shouldn't the mapping of source tree to executable | tree be bijective, at least in theory? Simon and I have been brainstorming on just such a thing. I've jotted down our current notes (well some of them) here http://hackage.haskell.org/trac/ghc/wiki/ExplicitCallStack Simon

Couldn't we just use rewrite rules to rewrite *transparently* all uses of fromJust to safeFromJust, tagging the call site with a location? .. Looks good! But that is deceiving: the assert was expanded before the rule fired, and refers to the rewrite rule source line (line 19), not the fromJust call site (line 12). Now if we could just have the 'assert' token inserted into the AST before it was expanded, we'd be home and dry. Could this be done with TH? Or could we arrange for asserts in rewrite rules not to be expanded till later? .. Any ideas?
http://www.haskell.org/pipermail/glasgow-haskell-users/2006-November/011545.... claus

| Looks good! But that is deceiving: the assert was expanded before the rule | fired, and refers to the rewrite rule source line (line 19), not the fromJust | call site (line 12). Now if we could just have the 'assert' token inserted | into the AST before it was expanded, we'd be home and dry. Could this be done | with TH? Or could we arrange for asserts in rewrite rules not to be expanded | till later? That's difficult. Trouble is, the assert expansion happens right at the front, before any desugaring or program transformation. But rewrite rules fire much, much later, in the simplifier. It is, however, possible (or could be made possible) to answer the question "When this rule fires, what module am I compiling", and make that available in the RHS of the rule, by some magic incantation. Harder would be "when this rule fires, what top-level function's RHS is being simplified?". But even that is tricky, because it might be "lvl_4532", a function created by the simplifier itself. The only solid way to connect to programmer-comprehensible stuff is by transforming the original, unadulterated source code, as Hat does, and as Jhc does, and as Simon and I are musing about doing. Your idea is to do something cheap and cheerful, which is always a good plan. But I don't see how to make it fly. (The -O thing, and/or providing $currentLocation rather than just 'assert', seem easier.) Simon

This good long thread prompted Don and I to take a look at the darcs source (this is current darcs-unstable). You can get the darcs source at: darcs get http://abridgegame.org/repos/darcs-unstable Here are some data points (without interpretation): In impossible.h we have the definitions: import Bug ( bug ) #define impossible (bug $ "Impossible case at "++__FILE__++":"++show (__LINE__ :: Int)++" compiled "++__TIME__++" "++__DATE__) #define fromJust (\m_fromJust_funny_name -> case m_fromJust_funny_name of {Nothing -> bug ("fromJust error at "++__FILE__++":"++show (__LINE__ :: Int)++" compiled "++__TIME__++" "++__DATE__); Just x -> x}) The second definition gives bug reports like the following: fromJust error at Push.lhs:136 compiled 11:51:58 Jun 16 2006 Please report this to bugs@darcs.net, If possible include the output of 'darcs --exact-version'. To see at a glance the various bug reports about fromJust you can search the bug database: http://bugs.darcs.net/issue?@columns=id%2Cactivity%2Ctitle%2Ccreator%2Cassignedto%2Cstatus&@sort=activity&@group=priority&@search_text=fromJust I count 7 bugs. Some other datapoints: $ grep fromJust *.lhs *.hs | wc -l 101 $ grep fromMaybe *.lhs *.hs | wc -l 7 $ grep maybe *.lhs *.hs | wc -l 120 I would be interested to see the results of static analysis tools (Catch?) or applying Oleg's strategy. Any volunteers? thanks! Jason

Hi
To see at a glance the various bug reports about fromJust you can search the bug database: http://bugs.darcs.net/issue?@columns=id%2Cactivity%2Ctitle%2Ccreator%2Cassignedto%2Cstatus&@sort=activity&@group=priority&@search_text=fromJust
I count 7 bugs.
I would be interested to see the results of static analysis tools (Catch?) or applying Oleg's strategy. Any volunteers?
Unfortunately darcs is too big, and too unhaskell-98 to go through Catch as it currently stands. In reality the best strategy would probably be to use Catch on darcs, then where Catch is unable to automatically verify the program use Oleg's techniques and other rewritings until Catch can verify everything. Just taking a random example (the first fromJust I stumbled upon): http://abridgegame.org/repos/darcs-unstable/Population.lhs, cleanPop The requirement here is that the modifiedHowI field of the 2nd field of the Pop at the top must be not a removal. Figuring out in an existing code-base whether that is a general invariant for Pop, true in this specific case, or any other random combination is quite a hard problem! Thanks Neil

ndmitchell:
Hi
To see at a glance the various bug reports about fromJust you can search the bug database: http://bugs.darcs.net/issue?@columns=id%2Cactivity%2Ctitle%2Ccreator%2Cassignedto%2Cstatus&@sort=activity&@group=priority&@search_text=fromJust
I count 7 bugs.
I would be interested to see the results of static analysis tools (Catch?) or applying Oleg's strategy. Any volunteers?
Unfortunately darcs is too big, and too unhaskell-98 to go through Catch as it currently stands. In reality the best strategy would probably be to use Catch on darcs, then where Catch is unable to automatically verify the program use Oleg's techniques and other rewritings until Catch can verify everything.
Just taking a random example (the first fromJust I stumbled upon):
http://abridgegame.org/repos/darcs-unstable/Population.lhs, cleanPop
The requirement here is that the modifiedHowI field of the 2nd field of the Pop at the top must be not a removal. Figuring out in an existing code-base whether that is a general invariant for Pop, true in this specific case, or any other random combination is quite a hard problem!
This would be an argument for deprecating fromJust then, to discourage its use. The darcs case illustrates how the fromJust style encourages the (unintentional) embedding of uncheckable isJust invariants into Haskell code. The relatively high bug report rate in darcs due to uncaught fromJusts only emphasises the problems associated with this style. -- Don
participants (6)
-
Claus Reinke
-
dons@cse.unsw.edu.au
-
Jason Dagit
-
Matthias Fischmann
-
Neil Mitchell
-
Simon Peyton-Jones