Nomen Est Omen: Analyzing the Language of Function Identifiers

Abstract

The identifiers chosen by programmers as function names contain valuable information. They are often the starting point for program understanding activities, especially when high-level views, such as the call graph, are available.

This paper analyzes the lexical, syntactic, and semantic structure of function identifiers by means of a segmentation technique, a regular language, and a conceptual classification. Applying these analyses to a database of procedural programs suggests some potential uses of the results, ranging from support for program understanding to evolution toward a standard and more maintainable form.
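To make the idea of segmenting a function identifier concrete, the following is a minimal sketch in Python. It is not the segmentation technique of the paper, which the abstract does not describe. It assumes that words are delimited by underscores, digits, and case changes, and the function name `segment_identifier` is hypothetical.

```python
import re
from typing import List


def segment_identifier(name: str) -> List[str]:
    """Split a function identifier into lowercase word segments.

    Assumption (not from the paper): words are delimited by
    underscores, digits, or upper/lower case boundaries.
    """
    segments: List[str] = []
    # Underscores and digits are treated as explicit separators.
    for chunk in re.split(r"[_\d]+", name):
        # Inside a chunk, take either a run of capitals that is not
        # followed by a lowercase letter (an acronym such as "XML"),
        # or an optionally capitalized lowercase word.
        segments += re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", chunk)
    return [s.lower() for s in segments]


print(segment_identifier("getNextToken"))   # ['get', 'next', 'token']
print(segment_identifier("parse_XMLFile"))  # ['parse', 'xml', 'file']
```

Segments obtained this way could then be matched against a pattern over word categories or mapped to concepts, but both of those steps depend on definitions that only the full paper provides.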
