Using Portfolio Theory for Automatically Processing Information about Data Quality in Data Warehouse Environments

Abstract:

Data warehouses are generally characterized by heterogeneous data sources that provide information at different levels of quality. In such environments, many data quality approaches stress the importance of defining the term “data quality” through a set of dimensions and providing corresponding metrics. The benefit is that additional quality information becomes available during the analytical processing of the data.
In this paper, we present a data quality model for data warehouse environments that is an adaptation of Markowitz’s portfolio theory. This adaptation allows the introduction of a new kind of analytical processing that uses “uncertainty” about data quality as a steering factor in the analysis. We further enhance the model by integrating prognosis data within a conventional data warehouse to provide risk management for new predictions.

Authors:

Robert M. Bruckner
Institute of Software Technology, Vienna University of Technology, Austria.

Josef Schiefer
Institute of Software Technology, Vienna University of Technology, Austria.

Publishing Information:

In Proceedings of the International Conference on Advances in Information Systems (ADVIS 2000), Springer-Verlag, LNCS 1909, pp. 34-43, Izmir, Turkey, October 2000.