Datacubes — Do we mean ‘cubes’?

Nayanika Mondal
3 min read · Mar 18, 2019

Contrary to what the name suggests, a data cube (here we mean an OLAP cube) is not a cube in the strict mathematical sense, because its sides are not always equal.

An irregular trapezoidal cube

Then why is the term ‘cube’ widely used?

As we know, OLAP (Online Analytical Processing) cubes are used to represent a multidimensional dataset. Arranging the various dimensions of the data in a cube-like form helps analysts visualize the data and operate on it easily. When the dataset has more than 3 dimensions, the datacubes are also referred to as hypercubes.

There are many ways in which a cube can be studied for analysis and learning. The various OLAP operations, typically roll-up, drill-down, slice, dice and pivot, are shown in the image below:
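To make a few of these operations concrete, here is a minimal sketch in plain Python. The cube contents, the dimension names (`year`, `region`, `product`) and the helper functions are all hypothetical and made up for illustration; a real OLAP engine works very differently under the hood. Note that the three dimensions have 2, 3 and 2 members, so the "sides" of this cube are unequal.

```python
from collections import defaultdict

# Hypothetical cube: (year, region, product) -> units sold.
CUBE = {
    (2018, "East", "Pen"): 10,
    (2018, "East", "Ink"): 4,
    (2018, "West", "Pen"): 7,
    (2018, "North", "Ink"): 3,
    (2019, "East", "Pen"): 12,
    (2019, "West", "Ink"): 5,
    (2019, "North", "Pen"): 6,
}
DIMS = ("year", "region", "product")  # assumed dimension order of the keys


def slice_cube(cube, dim, value):
    """Slice: fix one dimension to a single value and drop it from the key."""
    i = DIMS.index(dim)
    return {k[:i] + k[i + 1:]: v for k, v in cube.items() if k[i] == value}


def dice_cube(cube, **allowed):
    """Dice: keep cells whose members are in the allowed sets (all dimensions stay)."""
    return {
        k: v
        for k, v in cube.items()
        if all(k[DIMS.index(d)] in members for d, members in allowed.items())
    }


def roll_up(cube, keep):
    """Roll-up: sum away every dimension that is not listed in keep."""
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for k, v in cube.items():
        totals[tuple(k[i] for i in idx)] += v
    return dict(totals)


sales_2018 = slice_cube(CUBE, "year", 2018)
east_west_pens = dice_cube(CUBE, region={"East", "West"}, product={"Pen"})
by_year = roll_up(CUBE, keep=("year",))
print(by_year)  # {(2018,): 24, (2019,): 23}
```

Drill-down is simply the reverse of roll-up (going back to the finer-grained cells), and pivot only changes which dimensions are shown as rows and columns, so they are left out of the sketch.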

The OLAP data is stored in a data warehouse that follows a specific design, such as the star schema or the snowflake schema. Both designs handle large, multidimensional data efficiently, which is why they are used in relational data warehouses. The terms ‘star’ and ‘snowflake’ refer to the shapes of their table diagrams.

The star schema is simpler and has a high level of data redundancy, but its cube processing is faster. A snowflake schema, on the other hand, is an extension of the star schema in which the dimension tables are further normalized. It is more complex, which is why its processing is slower, but it has a low level of data redundancy.
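The trade-off can be sketched with Python's built-in `sqlite3` module. All table and column names below (`star_dim_product`, `snow_dim_category`, `fact_sales` and so on) are hypothetical, and one fact table is shared between the two designs only to keep the example short. In the star version the category name is repeated on every product row; in the snowflake version it lives in its own table, so the same question needs one extra join.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Star: the category name is repeated on every product row (redundant).
CREATE TABLE star_dim_product (
    product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);

-- Snowflake: the category is normalized into its own table.
CREATE TABLE snow_dim_category (
    category_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE snow_dim_product (
    product_id INTEGER PRIMARY KEY, name TEXT,
    category_id INTEGER REFERENCES snow_dim_category(category_id));

-- Fact table holding the measures (hypothetical units sold).
CREATE TABLE fact_sales (product_id INTEGER, units INTEGER);

INSERT INTO star_dim_product VALUES
    (1, 'Pen', 'Stationery'), (2, 'Ink', 'Stationery'), (3, 'Mug', 'Kitchen');
INSERT INTO snow_dim_category VALUES (10, 'Stationery'), (20, 'Kitchen');
INSERT INTO snow_dim_product VALUES (1, 'Pen', 10), (2, 'Ink', 10), (3, 'Mug', 20);
INSERT INTO fact_sales VALUES (1, 5), (2, 3), (3, 2), (1, 4);
""")

# Star schema: one join from the fact table to the dimension.
STAR_QUERY = """
SELECT p.category, SUM(f.units)
FROM fact_sales f
JOIN star_dim_product p USING (product_id)
GROUP BY p.category ORDER BY p.category
"""

# Snowflake schema: a second join is needed to reach the category name.
SNOWFLAKE_QUERY = """
SELECT c.category, SUM(f.units)
FROM fact_sales f
JOIN snow_dim_product p USING (product_id)
JOIN snow_dim_category c USING (category_id)
GROUP BY c.category ORDER BY c.category
"""

star_totals = con.execute(STAR_QUERY).fetchall()
snowflake_totals = con.execute(SNOWFLAKE_QUERY).fetchall()
print(star_totals)  # [('Kitchen', 2), ('Stationery', 12)]
```

Both queries return the same totals. The difference is only in how much text is stored twice and how many joins the query has to perform.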

There is also a third type of schema, known as the Galaxy schema or Fact Constellation schema, in which several fact tables share the same dimension tables.

The main plus points of a datacube are “multidimensionality” and “facilitation of easy analysis”. Compared with a normal OLTP database, it also offers “speed” and “clean, well-structured views”.

The drawbacks of OLAP cubes include low computational capability, lack of user-friendliness, slow reaction to analytical business demands, the abstraction of the cubes, and the huge amount of pre-modeling that the analytical data requires.

Representation of a Data Lake (Image Source: Credera)

Moreover, with the advent of IoT, a lot of real-time data now goes into Data Lakes instead of Data Warehouses. This is why OLAP cubes are getting replaced these days, as more and more people find it easier to access data from the Data Lakes.

In spite of these drawbacks, hypercubes are here to stay, because OLAP technologies are not being eradicated but only morphed into other forms such as Cloud Data Warehouses (Snowflake, Google BigQuery, Amazon Redshift), Data Virtualization (Cisco Data Virtualisation, Informatica Data Virtualization, IBM Big SQL) and Serviced Cloud and Analytics (Looker and Sisense).

Given the current and future requirements of the IT industry, OLAP datacubes are being reinvented as a technology that can meet the demands of the various business houses that deal with and analyze an enormous amount of multidimensional data.
