The World Wide Web, due to its sheer size and dynamics, has turned into one
of the most fascinating and important data sources for large-scale analysis and
investigation, ranging from content-based information location and the dynamics
of change to community analysis. Yet most projects so far rely on special-purpose
tools optimized for a given task, providing only limited flexibility.
In this paper, we propose a data-warehouse-based approach to analyzing the World
Wide Web. Information contained in the web pages, metadata on the documents, as
well as information acquired from additional sources such as the WHOIS database,
are integrated into a multidimensional view of the Web. The resulting system
allows for flexible analysis of the various characteristics of the Web. Results
from a prototypical study of the Austrian national Web space as part of the AOLA
project demonstrate the potential of the presented approach.
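
As a rough illustration of what a multidimensional view of the Web can look like, the following minimal sketch builds a tiny star schema in an in-memory SQLite database and rolls page facts up along two dimensions. It is not the AOLA schema: the table names (fact_page, dim_domain, dim_format), the column names, the sample rows, and the rollup() helper are all assumptions made for this example.

```python
import sqlite3

# Hypothetical star schema: one fact table of crawled pages plus two
# dimension tables. All names and sample rows are illustrative only.
SCHEMA = """
CREATE TABLE dim_domain (
    domain_id  INTEGER PRIMARY KEY,
    name       TEXT,
    whois_city TEXT            -- attribute assumed to come from WHOIS
);
CREATE TABLE dim_format (
    format_id  INTEGER PRIMARY KEY,
    mime_type  TEXT            -- document metadata
);
CREATE TABLE fact_page (
    url        TEXT PRIMARY KEY,
    domain_id  INTEGER REFERENCES dim_domain(domain_id),
    format_id  INTEGER REFERENCES dim_format(format_id),
    size_bytes INTEGER
);
"""

# Whitelist of dimensions a caller may group by, so no user-supplied
# text is ever interpolated into the SQL statement.
ROLLUPS = {
    "city": "d.whois_city",
    "mime_type": "f.mime_type",
}


def build_warehouse():
    """Create the in-memory warehouse and load a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO dim_domain VALUES (?, ?, ?)",
        [(1, "example.at", "Vienna"), (2, "example.co.at", "Graz")],
    )
    conn.executemany(
        "INSERT INTO dim_format VALUES (?, ?)",
        [(1, "text/html"), (2, "application/pdf")],
    )
    conn.executemany(
        "INSERT INTO fact_page VALUES (?, ?, ?, ?)",
        [
            ("http://example.at/", 1, 1, 1000),
            ("http://example.at/a.pdf", 1, 2, 5000),
            ("http://example.co.at/", 2, 1, 2000),
        ],
    )
    return conn


def rollup(conn, dimension):
    """Return (dimension value, page count, total bytes) per group."""
    column = ROLLUPS[dimension]
    sql = f"""
        SELECT {column}, COUNT(*), SUM(p.size_bytes)
        FROM fact_page p
        JOIN dim_domain d ON d.domain_id = p.domain_id
        JOIN dim_format f ON f.format_id = p.format_id
        GROUP BY {column}
        ORDER BY {column}
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    warehouse = build_warehouse()
    for dim in ROLLUPS:
        print(dim, rollup(warehouse, dim))
```

The same fact rows answer both questions (pages per registrant city, pages per file format) without a task-specific tool, which is the kind of flexibility the approach aims for; a real warehouse would add further dimensions such as time or link structure.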
First International Workshop on Very Large Data Warehouses (VLDWH
2002).
In 13th International Workshop on Database and Expert Systems
Applications (DEXA'02), IEEE Computer Society Press, Aix-en-Provence,
France, September 2002.