4th International Workshop on
Web Site Evolution

October 2, 2002; Montréal, Canada

WSE 2002
Migrating to Web Services

Weifeng Zhang, Baowen Xu, and Hongji Yang,
Learning Users' Interest for Web Pre-Fetching


The WWW is popular for its multimedia content and friendly interactivity. Although network speed has improved considerably in recent years, the rapid growth in Internet use, the inherent delay of the network, and the request/response working mode of the WWW still make Internet traffic slow and give no guarantee of Quality of Service. Because HTTP is stateless, the web server cannot know users' demands and cannot predict their requests.

Caching exploits the temporal locality of WWW access: the browser keeps previously accessed documents on the local machine. For documents in the local cache, the browser does not need to send requests to the remote server or receive the whole responses from it. Pre-fetching exploits the spatial locality of access. First, the user's upcoming requests are predicted from the current request. Second, the expected pages are fetched into the local cache while the user is browsing the current page. Finally, the user accesses these pages from the local cache, which reduces the access delay to some degree. Pre-fetching is thus a kind of active caching that caches pages the user has not yet requested. Applying pre-fetching to the web can greatly reduce the time users wait after sending their requests.

This paper proposes an intelligent web pre-fetching technique that speeds up the fetching of web pages. We use a simplified WWW data model to represent the data in the web browser's cache and mine association rules from it. We store these rules in a knowledge base in order to predict the user's actions. To reduce the storage space of the knowledge base, rough set theory is used: interest is represented at the level of equivalence classes, which greatly reduces the storage space of the interest association repository.
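
As a rough illustration of how rules stored per equivalence class can be smaller than rules stored per page, the following Python sketch groups pages that are indiscernible on a chosen attribute and mines "class A is followed by class B" rules from consecutive accesses. It is not the authors' algorithm: the `topic` attribute, the sample URLs, the thresholds, and the function names `equivalence_classes` and `mine_rules` are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def equivalence_classes(pages, attributes):
    """Partition pages by indiscernibility: pages with identical values
    on the chosen attributes fall into the same class."""
    classes = defaultdict(set)
    for url, attrs in pages.items():
        key = tuple(attrs[a] for a in attributes)
        classes[key].add(url)
    return dict(classes)


def mine_rules(sessions, page_class, min_support=2, min_confidence=0.5):
    """Mine rules 'class A -> class B' from consecutive accesses.

    Support is the number of times B directly followed A; confidence is
    that count divided by the number of times A was followed by anything.
    """
    antecedent_count = Counter()
    pair_count = Counter()
    for session in sessions:
        labels = [page_class[url] for url in session]
        for current, following in zip(labels, labels[1:]):
            antecedent_count[current] += 1
            pair_count[(current, following)] += 1
    rules = {}
    for (a, b), n in pair_count.items():
        confidence = n / antecedent_count[a]
        if n >= min_support and confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


# Hypothetical cached pages, each described by one assumed attribute.
pages = {
    "/news/a": {"topic": "news"},
    "/news/b": {"topic": "news"},
    "/sport/a": {"topic": "sport"},
    "/sport/b": {"topic": "sport"},
    "/shop/a": {"topic": "shop"},
}
classes = equivalence_classes(pages, ["topic"])
page_class = {url: key for key, urls in classes.items() for url in urls}

# Hypothetical browsing sessions reconstructed from the cache history.
sessions = [
    ["/news/a", "/sport/a"],
    ["/news/b", "/sport/b"],
    ["/news/a", "/shop/a"],
]
rules = mine_rules(sessions, page_class)
```

In this toy data the two "news then sport" sessions involve different pages, so no page-level rule would reach the support threshold, while the single class-level rule does. That is the kind of saving the equivalence-class representation is meant to provide.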
On the client side, agents are responsible for mining the users' interests and pre-fetching web pages based on the interest association repository. The speed-up in browsing is therefore transparent to users.
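
The predict, pre-fetch, and serve-from-cache cycle of a client-side agent might be sketched as below. This is a minimal sketch under stated assumptions, not the paper's implementation: `PrefetchAgent`, `fake_fetch`, the per-URL rule table, and the confidence threshold are hypothetical, and the pre-fetch runs synchronously here, whereas a real agent would fetch in the background while the user reads the current page.

```python
class PrefetchAgent:
    """Toy client-side agent: serves pages from a local cache and
    pre-fetches pages that the rule table predicts will follow."""

    def __init__(self, fetch, rules, min_confidence=0.5):
        self.fetch = fetch  # callable url -> content (stand-in for an HTTP GET)
        self.rules = rules  # {current_url: [(next_url, confidence), ...]}
        self.min_confidence = min_confidence
        self.cache = {}

    def _load(self, url):
        # Contact the remote server only when the page is not cached yet.
        if url not in self.cache:
            self.cache[url] = self.fetch(url)

    def request(self, url):
        """Return (content, cache_hit) and pre-fetch the predicted pages."""
        hit = url in self.cache
        self._load(url)
        content = self.cache[url]
        for next_url, confidence in self.rules.get(url, []):
            if confidence >= self.min_confidence:
                self._load(next_url)
        return content, hit


fetched = []  # records every simulated remote request


def fake_fetch(url):
    # Hypothetical remote fetch; a real agent would issue an HTTP request.
    fetched.append(url)
    return f"<html>{url}</html>"


# Assumed rule table: after /index, /news is likely and /about is not.
rules = {"/index": [("/news", 0.8), ("/about", 0.2)]}
agent = PrefetchAgent(fake_fetch, rules)

_, first_hit = agent.request("/index")  # miss; also pre-fetches /news
_, second_hit = agent.request("/news")  # served from the local cache
```

The user's second request causes no remote traffic because the page was already fetched during the first one, which is why the speed-up needs no action from the user.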