Weifeng Zhang, Baowen Xu, and Hongji Yang,
Learning Users' Interest for Web Pre-Fetching
Abstract
The WWW is popular for its multimedia content and friendly
interactivity. Although network speeds have improved considerably in
recent years, the rapid growth in Internet use, the inherent delay of
the network, and the request/response working mode of the WWW still
make Internet traffic slow and provide no guarantee of Quality of
Service. Because HTTP is stateless, the web server cannot know the
users' demands or predict their requests. Taking advantage of a cache
mechanism and the temporal locality of WWW access, the browser can keep
previously accessed documents on the local machine. For documents in
the local cache, the browser then does not need to send requests to the
remote server or receive the full responses from it. Pre-fetching
exploits the spatial locality of access. First,
the user's next requests are predicted from the current request.
Second, the predicted pages are fetched into the local cache while the
user is browsing the current page. Finally, the user accesses these
pre-fetched pages from the local cache, which reduces the access delay
to some degree. Pre-fetching is one kind
of active caching: it caches pages that the user has not yet
requested. Applying pre-fetching to the web can greatly reduce the time
users wait after sending their requests. This paper proposes an
intelligent web pre-fetching technology that speeds up the fetching of
web pages. In this
technology, we use a simplified WWW data model to represent the data in
the web browser's cache and mine association rules from it. We store
these rules in a knowledge base and use them to predict the user's
actions. To
reduce the storage space of the knowledge base, rough set theory is
used. With rough sets, interest is represented at the level of
equivalence classes, so the storage space of the interest association
repository is greatly reduced. On the client side, agents are
responsible for mining the users' interests and pre-fetching web pages
based on the interest association repository. The speed-up in browsing
is therefore transparent to the users.
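
The ideas summarized above (mining association rules from browsing
history, storing them per equivalence class, and using them to choose
pages to pre-fetch) can be illustrated with a minimal Python sketch. It
is not the authors' algorithm or data model. The function names, the
choice of the first URL path segment as the attribute that makes pages
indiscernible (topic_of), the session data, and the support and
confidence thresholds are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations


def topic_of(path):
    # Hypothetical indiscernibility attribute: the first path segment,
    # e.g. "/news/a.html" -> "news". Pages sharing it form one class.
    return path.split("/")[1]


def equivalence_classes(pages, key):
    """Partition pages into classes of pages that share the same key."""
    classes = defaultdict(set)
    for page in pages:
        classes[key(page)].add(page)
    return dict(classes)


def mine_rules(sessions, key, min_support=2, min_confidence=0.5):
    """Mine rules 'class A -> class B' from sessions (lists of pages).

    Support is the number of sessions containing both classes;
    confidence is that count divided by the sessions containing A.
    Rules are kept per class, not per page, to keep the store small.
    """
    single = Counter()
    pair = Counter()
    for session in sessions:
        seen = {key(page) for page in session}
        single.update(seen)
        pair.update(permutations(seen, 2))
    rules = defaultdict(dict)
    for (a, b), count in pair.items():
        confidence = count / single[a]
        if count >= min_support and confidence >= min_confidence:
            rules[a][b] = confidence
    return dict(rules)


def predict_prefetch(current_page, rules, classes, key, cached=()):
    """List pages to pre-fetch while the user reads current_page."""
    targets = rules.get(key(current_page), {})
    # Most confident target classes first; ties broken by class name.
    ranked = sorted(targets, key=lambda c: (-targets[c], c))
    return [
        page
        for c in ranked
        for page in sorted(classes.get(c, ()))
        if page != current_page and page not in cached
    ]


# Illustrative browsing history (made-up data).
sessions = [
    ["/news/a.html", "/sports/x.html"],
    ["/news/b.html", "/sports/y.html"],
    ["/news/a.html", "/tech/t.html"],
]
pages = {page for session in sessions for page in session}
classes = equivalence_classes(pages, topic_of)
rules = mine_rules(sessions, topic_of)
print(predict_prefetch("/news/a.html", rules, classes, topic_of))
# prints ['/sports/x.html', '/sports/y.html']
```

In this sketch the rule store holds two class-level rules (news ->
sports and sports -> news) instead of one rule per pair of pages, which
is the kind of saving that grouping by equivalence class is meant to
give. A client-side agent would call predict_prefetch for the page
being viewed and fetch the returned pages into the local cache.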