Using Clustering to Support the Migration from Static to Dynamic Web Pages

Abstract

First-generation Web sites typically consist of a set of purely static Web pages. Content and presentation are not separated, and the same page structure is replicated every time a similar organization of the information is needed. This practice poses several problems for the evolution of these sites. Content is not easy to update, and each time the HTML structure is modified, the same changes have to be propagated to all replicas.

In this paper, an approach is proposed for identifying the Web pages that are most amenable to migration to a dynamic version, in that they share a similar structure filled in with content organized according to a common scheme. Clustering is used for this purpose: a common template is extracted from the pages in the same cluster, and the variable information of the pages matching the template is migrated to a database. A server-side program extracts the requested information from the database and dynamically generates the HTML pages to be displayed in the browser.
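As a rough illustration of the clustering step, the sketch below groups pages whose HTML tag sequences are nearly identical. It is not the method of the paper: the similarity measure (the ratio computed by Python's difflib), the 0.9 threshold, the single-linkage grouping, and the names tag_sequence, similarity, and cluster_pages are all assumptions made for this example.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records opening and closing tags in order, ignoring text content."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_endtag(self, tag):
        self.tags.append("/" + tag)


def tag_sequence(html):
    """Return the structural skeleton of a page as a list of tag names."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


def similarity(a, b):
    """Structural similarity in [0, 1]; an assumed stand-in for the paper's measure."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def cluster_pages(pages, threshold=0.9):
    """Group page names whose tag sequences are similar (single linkage)."""
    seqs = {name: tag_sequence(html) for name, html in pages.items()}
    clusters = []
    for name in pages:
        # Clusters containing at least one page similar enough to this one.
        matching = [
            c for c in clusters
            if any(similarity(seqs[name], seqs[m]) >= threshold for m in c)
        ]
        # Merge those clusters and add the current page to the result.
        merged = [m for c in matching for m in c] + [name]
        clusters = [c for c in clusters if c not in matching] + [merged]
    return clusters


# Hypothetical pages: two share a structure, one does not.
pages = {
    "alice.html": "<html><body><h1>Alice</h1><p>Researcher</p></body></html>",
    "bob.html": "<html><body><h1>Bob</h1><p>Professor</p></body></html>",
    "index.html": "<html><body><ul><li>a</li><li>b</li><li>c</li></ul></body></html>",
}
print(cluster_pages(pages))
```

In this example the two personal pages share the same tag skeleton and fall into one cluster, making them candidates for a common template, while the index page stays on its own. The text that differs between pages in a cluster (here the names and roles) is the kind of variable information that would be moved to a database.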

PostScript version of the paper.