As part of the ‘basic’ series, we now come to the world of OLAP and the various concepts, technologies and tools in it. In my previous post on OLTP and OLAP, we saw the conceptual difference between an OLTP system and an OLAP system.
Having understood the fundamental concept of OLAP, let's go down the rabbit hole and see what other wonders we find. To jumpstart the process I’ll throw at you the 3 acronyms you are bound to hear whenever OLAP pops up: MOLAP, ROLAP and HOLAP. They are the three stooges of OLAP.
Multidimensional OLAP (MOLAP) came into being when products like Essbase and Applix hit the market. These engines had a fundamentally different mechanism for data storage – THE CUBE! The cube is a logical concept of data storage and is very different from the two-dimensional tables of an RDBMS (and I assume here that you’re familiar with what an RDBMS is). A cube stores data in a multidimensional model, similar to what the star or snowflake schema models in an RDBMS. However, the internal storage system is different: it is optimized for multidimensional data and hence provides an ideal mechanism for OLAP.
ROLAP is short for Relational OLAP. As the reader will recall, OLAP queries are few in number but work on massive amounts of data. A fully normalized ER implementation forces such queries through a large number of joins, which hurts the overall system response time. The ER model is therefore sometimes denormalized, trading extra storage and redundancy for faster queries. The result is the star or snowflake schema, and with it any conventional RDBMS (which is usually optimized for OLTP) can be used as an OLAP engine.
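As a rough illustration of the ROLAP approach, the sketch below builds a tiny star schema in an in-memory SQLite database using Python's standard `sqlite3` module and runs a typical aggregate query against it. The table names (`fact_sales`, `dim_product`, `dim_date`), the columns and the data are all invented for this example; a real warehouse would have far more dimensions and rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One central fact table surrounded by denormalized dimension tables.
conn.executescript("""
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT,
    category   TEXT          -- kept inline instead of in its own table
);
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY,
    year    INTEGER,
    quarter TEXT
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    amount     INTEGER
);
INSERT INTO dim_product VALUES
    (1, 'widget', 'hardware'), (2, 'gadget', 'hardware'), (3, 'manual', 'books');
INSERT INTO dim_date VALUES (1, 2023, 'Q1'), (2, 2023, 'Q2');
INSERT INTO fact_sales VALUES (1, 1, 100), (2, 1, 200), (3, 1, 30), (1, 2, 50);
""")

# A typical OLAP-style question: sales by category and quarter.
rows = conn.execute("""
SELECT p.category, d.quarter, SUM(f.amount)
FROM fact_sales AS f
JOIN dim_product AS p ON p.product_id = f.product_id
JOIN dim_date    AS d ON d.date_id    = f.date_id
GROUP BY p.category, d.quarter
ORDER BY p.category, d.quarter
""").fetchall()

for category, quarter, amount in rows:
    print(category, quarter, amount)
```

Note that every dimension is exactly one join away from the fact table. That is the point of the star layout: the number of joins stays small and predictable no matter how the analyst slices the data.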
Hybrid OLAP (HOLAP) is a combination of MOLAP and ROLAP. In essence, the ROLAP engine holds the detailed data, while the aggregations are built in a MOLAP engine to optimize retrieval for high-level reporting. In this setup, the analyst queries the MOLAP engine to analyse the business. If the analyst needs to investigate the detailed data behind a certain value, the ROLAP engine can be queried. A HOLAP product would obviously make this switch invisible to the user, providing a seamless experience. How the detailed and aggregate data is partitioned between the MOLAP and ROLAP parts depends on each particular product.
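The split described above can be sketched in a few lines of Python. This is only a toy under my own assumptions: the `HolapStore` class, its method names and the data are hypothetical, a plain list stands in for the relational detail store, and a dictionary stands in for the MOLAP aggregates. Real products decide for themselves what lives where.

```python
class HolapStore:
    """Toy hybrid store: aggregates precomputed, detail kept as rows."""

    def __init__(self, detail_rows):
        # "ROLAP" side: the detailed rows as (region, product, amount).
        self.detail = list(detail_rows)
        # "MOLAP" side: totals per region, built once up front.
        self.aggregates = {}
        for region, _product, amount in self.detail:
            self.aggregates[region] = self.aggregates.get(region, 0) + amount

    def total(self, region):
        """High-level question: answered from the aggregates alone."""
        return self.aggregates.get(region, 0)

    def drill_down(self, region):
        """Detail question: falls through to the detailed rows."""
        return [(p, a) for r, p, a in self.detail if r == region]


store = HolapStore([
    ("north", "widget", 100),
    ("north", "gadget", 200),
    ("south", "widget", 150),
])

north_total = store.total("north")        # served by the aggregate side
north_rows = store.drill_down("north")    # served by the detail side
```

The caller only ever talks to `store`, which mirrors the seamless experience a HOLAP product is supposed to give: which engine answered the question is an internal matter.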
This concludes a brief description of each concept within the OLAP domain. Coming up is a post that will highlight the various products (commercial and open-source) that are available for each.