
This isn't strictly Haskell related, but anyway. Next year I will be doing my honours in mathematics. One possible topic for my thesis that I've thought of - and my supervisor is quite enthused about - is to use graph theory to analyse various textual sources, starting with source code but leaving the framework open enough to extend it to other sources (e.g. email address books).

How I envisage it working: a parser finds all "functions" in the given code, these become the nodes of a graph, and directed edges indicate which functions call which others. The resulting graph can then be analysed in ways suited to the context: e.g. discovering that a library module can be split in two because the graph contains two completely separate components that don't interact at all, or that a function only ever called by one other function can be subsumed into its caller.

So, here is the question I ask of all of you: is this feasible? Do you know if anything like this has ever been attempted before? I know there are other applications of graph theory to source code (e.g. McCabe's complexity metric [1]), but I couldn't find anything related to what I'm proposing.

I intend to code this up in Haskell (possibly using FGL: I know of it, but haven't really looked at it) and to use Haskell as my primary target for analysis, so in a sense the resulting graph could be seen as a Haskell equivalent of UML.

[1] http://en.wikipedia.org/wiki/Cyclomatic_complexity

--
Ivan Lazar Miljenovic
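P.S. To make the idea a bit more concrete, here is a rough sketch of the kind of analysis I have in mind, written against FGL's Data.Graph.Inductive modules. The graph here is a hand-made toy standing in for real parser output, and since I haven't really used FGL yet, the details should be taken with a grain of salt:

import Data.Graph.Inductive.Graph (Node, mkGraph, labNodes, indeg)
import Data.Graph.Inductive.PatriciaTree (Gr)
import Data.Graph.Inductive.Query.DFS (components)

-- A call graph: nodes are labelled with function names, and an
-- edge f -> g means "f calls g".
type CallGraph = Gr String ()

-- Toy stand-in for what the parser would produce.
sample :: CallGraph
sample = mkGraph [ (1,"main"), (2,"parse"), (3,"render")
                 , (4,"helper"), (5,"log"), (6,"logAux") ]
                 [ (1,2,()), (1,3,()), (2,4,()), (5,6,()) ]

-- Completely separate pieces of the graph, i.e. evidence that a
-- module could be split: the (weakly) connected components.
moduleSplits :: CallGraph -> [[Node]]
moduleSplits = components

-- Functions with exactly one caller: candidates for being
-- subsumed into that caller.
inlinable :: CallGraph -> [String]
inlinable g = [ name | (n, name) <- labNodes g, indeg g n == 1 ]

main :: IO ()
main = do print (moduleSplits sample)  -- e.g. [[1,2,3,4],[5,6]]
          print (inlinable sample)     -- ["parse","render","helper","logAux"]

The interesting part is presumably the parser that produces the graph; the analyses themselves seem to fall out of existing graph algorithms fairly directly.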