Ralph Kimball’s definition from his first edition of The Data Warehouse Toolkit is:
A data warehouse is a copy of transaction data specifically structured for query and analysis.
The beauty of this definition is its simplicity. There is probably no single definition of a typical data warehouse.
The environment, platform, form and function of an Enterprise Data Warehouse (EDW) result from the requirements of the organisation and the decisions taken to best support them. EDW implementations vary enormously, so the definition shouldn’t be restricted to a specific case. In general terms, an EDW is an alternate data store that compiles, consolidates and aggregates various data sources within an organisation and presents a single point of truth.
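The idea of compiling, consolidating and aggregating several sources into one store can be sketched in a few lines. This is only a minimal illustration, assuming two hypothetical source extracts (`crm_orders` and `web_orders`) with different field names and units, and an in-memory SQLite table standing in for the warehouse; none of these names come from a real system.

```python
import sqlite3

# Hypothetical source extracts; names, fields and units are illustrative only.
crm_orders = [
    {"customer": "Acme", "amount": 120.0},
    {"customer": "Globex", "amount": 80.0},
]
web_orders = [
    {"cust_name": "Acme", "total_cents": 3000},
    {"cust_name": "Initech", "total_cents": 4550},
]


def build_warehouse(crm, web):
    """Consolidate both sources into one table with a common shape."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (source TEXT, customer TEXT, amount REAL)")
    # Map each source's own field names and units onto the shared schema.
    rows = [("crm", r["customer"], r["amount"]) for r in crm]
    rows += [("web", r["cust_name"], r["total_cents"] / 100) for r in web]
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def sales_by_customer(conn):
    """Aggregate across all sources: one answer per customer."""
    cur = conn.execute(
        "SELECT customer, SUM(amount) FROM sales "
        "GROUP BY customer ORDER BY customer"
    )
    return cur.fetchall()


warehouse = build_warehouse(crm_orders, web_orders)
totals = sales_by_customer(warehouse)
```

Here `totals` holds one combined figure per customer, regardless of which source system recorded the order. A real EDW would add much more (scheduling, history, data quality checks), but the consolidation step has the same shape.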
So, in essence:
- The form of the stored data has nothing to do with whether something is a data warehouse.
- A data warehouse can be normalized or denormalized.
- It can be a relational database, multidimensional database, flat file, hierarchical database, object database, etc.
- Data warehouse data often gets changed.
- Data warehouses often focus on a specific activity or entity.
- The EDW is not necessarily for the needs of “decision makers” or used only in the process of decision making. Because the EDW is considered the one version of truth across all organisational verticals, departments and businesses, it also provides a comprehensive platform for routine reporting. Conventional reporting is based on OLTP data, so the same reports can be produced from the EDW, which shifts the reporting load away from the OLTP systems.
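The last point, moving routine reports off the OLTP system, can be illustrated with a small sketch. It assumes a hypothetical `orders` table and uses two in-memory SQLite databases to stand in for the OLTP system and the warehouse copy; the table, columns and report query are invented for illustration.

```python
import sqlite3

# Hypothetical routine report, written once and run against either store.
REPORT_SQL = (
    "SELECT region, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY region ORDER BY region"
)


def make_oltp():
    """Stand-in for the transactional system that records orders."""
    oltp = sqlite3.connect(":memory:")
    oltp.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    oltp.executemany(
        "INSERT INTO orders (region, amount) VALUES (?, ?)",
        [("north", 10.0), ("south", 25.0), ("north", 5.0)],
    )
    oltp.commit()
    return oltp


def load_warehouse(oltp):
    """Copy the transaction data into a separate store used for queries."""
    dw = sqlite3.connect(":memory:")
    dw.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
    rows = oltp.execute("SELECT id, region, amount FROM orders").fetchall()
    dw.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    dw.commit()
    return dw


oltp = make_oltp()
dw = load_warehouse(oltp)

# The same report gives the same answer from either store, so it can be
# served from the warehouse copy instead of the OLTP system.
oltp_report = oltp.execute(REPORT_SQL).fetchall()
dw_report = dw.execute(REPORT_SQL).fetchall()
```

Both reports are identical as long as the copy is current, which is the trade-off in practice: the warehouse answers are only as fresh as the last load.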
An EDW is also referred to as a Decision Support System (DSS), implying that it supports the decision-making process of a business user. In essence, the EDW only offers creative ways to analyse data against references from across the organisation, and thus provides new avenues for looking at the data. The actual decision is still made by human intelligence.