
be tentatively stored as a directed graph with the nodes being functions and the edges being function calls, i.e. if f calls g, there is a directed edge from f to g):
Will the graph be completely computed before the analysis begins? Or will you try to build the graph lazily as and when you require more information? And
* root finding -> find nodes with no incoming edges (in a program, you'd ideally want only main to be such a node)
* cycle detection -> find a possibly recursive cycle in your code: if the cycle is too big, then you may wish to consider refactoring
* depth analysis -> find leaves that have a depth from the root node that is extremely large compared to the others (if a function is 50 calls down from main compared to an average of 15, you may wish to refactor)
* chain detection -> find connected functions that have only one incoming and one outgoing edge, e.g. : f -> g -> h : if g isn't used anywhere else, you may wish to combine it inside either f or g
* node popularity -> get a count of how many functions use a particular function and how many other functions it calls (related to chain detection above)
* clique detection -> probably not as relevant to source code, but if you have a large number of functions that are all pairwise co-recursive than you may wish to refactor
Can anyone think of any other kind of functions that would be useful in this kind of source code analysis?
Are you looking for Haskell functions that can be used to solve the above problems? I guess once you come up with the algorithms, translating it into Haskell shouldnt be much of a problem. -- Vimal