Ph.D. thesis. 248 Pages. Department of Computer Science, Vienna University of Technology, November 2002.
Data warehousing is a powerful concept for organizations to analyze their business. It generates benefits for the business as it transforms the intelligence contained in the data into better decision-making, which results in more effective action. The most successful data warehouse implementations deliver business value on an iterative and continuous basis. Therefore, we propose a six-stage data warehouse evolution model in order to meet the need for minimized latency in certain data propagation and decision-making processes.
The zero-latency data warehouse is our vision of a data warehouse system that aims to decrease the time it takes to make a business decision. In fact, there should be almost zero latency between the cause and effect of a business decision. This doctoral thesis proposes a technical architecture for a zero-latency data warehouse and investigates its core components:
Time Consistency. We distinguish between two different temporal characterizations of the information appearing in a data warehouse: one is the classical description of the time instant when a given fact has occurred; the other represents the instant when the information is actually intelligible to the system. This distinction, implicit and usually not critical in on-line transaction processing applications, is of particular importance for zero-latency data warehouses. There it can be most useful (or even vital) to determine and analyze what the situation was in the past, with only the information available at a given point in time.
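The distinction between the two time instants can be illustrated with a minimal Python sketch. It is not taken from the thesis: the names `Fact`, `occurred_at`, `known_at`, and `snapshot` are hypothetical, and the sample records are invented purely for illustration. Each fact carries both the instant it occurred and the instant it became available to the warehouse, so a query can reconstruct the situation using only what was known at a given point in time.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Fact:
    key: str
    value: float
    occurred_at: datetime  # instant when the fact occurred (hypothetical name)
    known_at: datetime     # instant when the warehouse received it (hypothetical name)


def snapshot(facts, occurred_by, known_by):
    """Latest value per key, using only facts that occurred by `occurred_by`
    and that the system had already received by `known_by`."""
    latest = {}
    for f in facts:
        if f.occurred_at > occurred_by or f.known_at > known_by:
            continue  # either not yet happened or not yet known
        best = latest.get(f.key)
        if best is None or (f.occurred_at, f.known_at) > (best.occurred_at, best.known_at):
            latest[f.key] = f
    return {key: f.value for key, f in latest.items()}


# Invented sample data: the second fact occurred on 5 January
# but only reached the warehouse on 9 January.
facts = [
    Fact("acct-1", 100.0, datetime(2002, 1, 1), datetime(2002, 1, 1)),
    Fact("acct-1", 80.0, datetime(2002, 1, 5), datetime(2002, 1, 9)),
]

# What the situation on 6 January looked like with the information available on 6 January.
as_seen_then = snapshot(facts, datetime(2002, 1, 6), datetime(2002, 1, 6))
# The same situation, re-evaluated with the information available on 10 January.
as_seen_later = snapshot(facts, datetime(2002, 1, 6), datetime(2002, 1, 10))
```

In this sketch the two calls ask about the same business date but return different values, because the late-arriving fact is invisible to the first query. That difference is what a single timestamp per fact cannot express.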
Near Real-Time Data Integration. We will discuss the changing requirements for near real-time data integration in data warehouses. In that context we will study the convergence of traditional ETL (extract-transform-load) and EAI (enterprise application integration) technology, as well as the ODS (operational data store) concept. Finally, we will describe a detailed architecture for near real-time data integration and evaluate two prototype implementations.
Active Decision-Making. Both for efficiency reasons and for consistency in decision-making, an organization will want to (semi-)automate decisions whenever the human mind does not add significant value. We investigate and evaluate several approaches for automating routine decision tasks (database triggers, event-condition-action rules, notifications, etc.). Furthermore, we will extend one of the prototypes with event-handling capabilities in order to enable active decision-making during near real-time data integration.
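As a rough illustration of the event-condition-action idea mentioned above, the following Python sketch evaluates rules against incoming events and runs the action of every rule whose condition holds. It does not reproduce the prototype described in the thesis: `Rule`, `RuleEngine`, the `stock_update` event type, and the threshold of 10 are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Rule:
    event_type: str                             # event: which kind of event triggers the rule
    condition: Callable[[Dict[str, Any]], bool] # condition: predicate over the event payload
    action: Callable[[Dict[str, Any]], Any]     # action: what to do when the condition holds


class RuleEngine:
    """Hypothetical minimal ECA engine: rules are checked in registration order."""

    def __init__(self) -> None:
        self.rules: List[Rule] = []

    def register(self, rule: Rule) -> None:
        self.rules.append(rule)

    def handle(self, event: Dict[str, Any]) -> List[Any]:
        """Run the action of each matching rule and collect the results."""
        results = []
        for rule in self.rules:
            if rule.event_type == event["type"] and rule.condition(event):
                results.append(rule.action(event))
        return results


engine = RuleEngine()
# Assumed routine decision: request a reorder when stock drops below 10 units.
engine.register(
    Rule(
        event_type="stock_update",
        condition=lambda e: e["quantity"] < 10,
        action=lambda e: f"reorder {e['item']}",
    )
)

low_stock = engine.handle({"type": "stock_update", "item": "widget", "quantity": 3})
enough_stock = engine.handle({"type": "stock_update", "item": "widget", "quantity": 50})
```

Calling `handle` as each record arrives, rather than after a batch load, is one way such rules could act during near real-time data integration. Here the action only returns a message, whereas a real system would send a notification or update an operational system.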
Finally, we will discuss the strengths and limitations of zero-latency data warehouses, as well as some application scenarios in which the proposed approach strongly improves decision-making.