Reverse Engineering of the UML Class Diagram from C++ Code in Presence of Weakly Typed Containers

Abstract

UML diagrams, and in particular the class diagram, the most frequently used of them, remain a valuable source of information even after the system has been delivered and enters the maintenance phase. Several tools provide a reverse engineering engine to recover the class diagram from the code.

This paper proposes an algorithm that improves the accuracy of the UML class diagram extracted from the code. Specifically, important information about inter-class relations may be missing from a reverse engineered class diagram when weakly typed containers are employed, i.e., containers declared to collect objects whose type is the top of the inheritance hierarchy. In that case, the actual class of the contained objects is not directly known, and therefore no relation with it is apparent from the container declaration.
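
The problem can be illustrated with a minimal C++ sketch. The class names (Object, Book, Library) are hypothetical and do not come from the analyzed components, and the sketch shows only the problem, not the proposed algorithm. The declaration of items_ mentions only Object, so a tool that reads declarations alone would report no relation between Library and Book; the actual element type is visible only where the container is used, here at the insertion in addBook.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical root of the inheritance hierarchy (illustrative name).
class Object {
public:
    virtual ~Object() = default;
};

// Hypothetical concrete class that is actually stored in the container.
class Book : public Object {
public:
    explicit Book(std::string title) : title_(std::move(title)) {}
    const std::string& title() const { return title_; }

private:
    std::string title_;
};

class Library {
public:
    // The real element type (Book) appears only here, at the insertion.
    // Pointers are non-owning: the caller keeps the books alive.
    void addBook(Book* book) { items_.push_back(book); }

    // Counts the stored elements whose dynamic type is Book.
    std::size_t countBooks() const {
        std::size_t n = 0;
        for (const Object* item : items_) {
            if (dynamic_cast<const Book*>(item) != nullptr) {
                ++n;
            }
        }
        return n;
    }

private:
    // Weakly typed container: the declaration names only Object,
    // so the Library-to-Book relation is not apparent from it.
    std::vector<Object*> items_;
};
```

A diagram recovered from the declarations alone would contain a relation from Library to Object only. Refining the container type to Book, assuming that is what the usage evidence supports, would add the otherwise missing relation between Library and Book.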

The proposed approach was applied to several software components developed at CERN. Experimental results show that a substantial improvement is achieved when the container type information is refined with the inferred data. The number of relations that would otherwise be missed is significant, and the connectivity of the associated class diagrams is radically different when containers are considered.
