Simon C Shaw: Program comprehension

Program comprehension:

Note - the interested reader should visit our software testing homepage.

Program comprehension is concerned with the understanding of existing software systems. It is a major factor in the maintenance and evolution of software. Without an understanding of the structure and inter-relationships of a piece of software, programmers cannot carry out changes required to improve or increase functionality, or to correct detected errors, that are a necessary part of the continued maintenance and evolution of systems. The aim is to both understand the software through visualisations of models of it, and to identify areas of the software which may be remodularised into separate modules as a means of aiding the maintenance process by localising the impacts of change. Further, these identified areas are potentially reusable and as such we refer to them as potential reuse candidates.

Summary of current work and future progressions:
Converted the call graph into a directed graphical model by introducing a generalised conditional independence property. Used the conditional independence property to examine the structure of the dominance tree, thus strengthening the interpretation. This leads to an improved dominance tree, the moral dominance tree. The moral dominance tree has the same shape as the dominance tree, we merely alter the shapes and shadings of the nodes. Stephen Rank has recently provided software for drawing the moral dominance tree from a given call graph. Future work is to run case studies on the moral dominance tree.
The dominance relation may be viewed as a graph separation property, but it only looks for a single separator. It struggles with areas of the call graph which are heavily dependent, in particular cliques. We are currently exploring ways of trying to improve this by modifying the call graph to merge nodes in a clique together prior to a dominance analysis.
The work outlined above centres around improving analysis provided by the dominance relation. A better solution may be to introduce an alternative representation of the call graph as a tree, namely the moral tree. The moral tree is derived from the conditional independence structure of the call graph. Unlike the dominance tree, the call graph can always be recovered from the moral tree. Future work is to develop the moral tree further and also explore its properties in a more general statistical framework.

The calling structure of a piece of software provides a high level description of the flow of the program. This structure describes the procedural units and the relationships between them and may be visualised as a directed acyclic graph called the call graph. A further abstraction may be made by converting the call graph into a directed tree. This may be achieved by using the dominance relation; the directed tree is termed the dominance tree.

Dominance trees have be used as a means for reengineering legacy systems into potential reuse candidates. The dominance relation shapes the form of the reuse candidates which are identified as the strongly directly dominated subtrees. The paper :

Moral dominance relations for program comprehension
Simon C Shaw (1), Michael Goldstein (1), Malcolm Munro (2), Elizabeth Burd (2)
(1) Department of Mathematical Sciences, University of Durham
(2) Research Institute in Software Evolution, Department of Computer Science, University of Durham
Status: in submission
Downloads: postscript or pdf
Questions/comments: please e-mail me

reviews the approach, illustrating how the dominance tree fails to show the relationship of the strongly directly dominated nodes to the directly dominated nodes. We propose introducing a relation of generalised conditional independence which strengthens the argument for the adoption of the potential reuse candidates suggested by the dominance tree by explaining their relationship with the directly dominated nodes. This leads to an improved dominance tree, the moral dominance tree, which helps aid program comprehension available from the tree. We also argue that the generalised conditional independence relation identifies potential reuse candidates that are missed by the dominance relation.

The moral dominance has the identical shape to the dominance tree, the only difference is in the shapes and shadings of the nodes (and the interpretation of the tree). The construction of the tree is still dependent upon the dominance relation and so does not address the weaknesses of this relation.
One area of weakness is that the dominance tree is an abstraction of the call graph, but made over the same collection of nodes. One immediate consequence of this is that, in general, the call graph cannot be reclaimed from the dominance (and moral dominance) tree. One consideration is to attempt to merge nodes together on the dominance tree. The dominance relation is easily seen to be a graph separation property, but one that only seeks a single separator rather than a set. An improvement may be to seek separator sets, but how can one do this in an automated way? On approach that we are currently exploring is to modify the call graph to merge nodes that are in cliques on the call graph and then perform a dominance analysis on the modified call graph.

An alternative approach is not to use the dominance relation but to work solely with the generalised conditional independence relation generated over the call graph. We consider conditional independencies around a root node and a graphical representation of the separations derived.
By removing nodes and arcs from the call graph, it is possible to treat each node on the call graph as a root node of a subgraph of the call graph and build a graphical representation of the call graph as a tree, which we term the moral tree. The ideas behind this are explored in the working document:

Moralising the call graph as a means of program comprehension
Simon C Shaw (1), Michael Goldstein (1), Malcolm Munro (2), Elizabeth Burd (2)
(1) Department of Mathematical Sciences, University of Durham
(2) Research Institute in Software Evolution, Department of Computer Science, University of Durham
Status: working document
Downloads: postscript or pdf
Questions/comments: please e-mail me

The moral tree represents conditional independences on the call graph and is constructed from the moral graph of subgraphs of the call graph. It has the property that the call graph may be reclaimed from the moral tree, whilst it is easy to explore separations on the moral tree and hence identify potential reuse candidates. The moral tree merges nodes together and will always have at least one less node than the call graph.
Simon Shaw

Last revision:
14/03/02 University home Statistics home Statistics people Top of page