Are MVs useful?

Nagavaibhav Elluru
NCR Edinburgh
Published in
5 min readMar 26, 2019

We have a data warehouse with generic reporting tables on the data mart. For new metrics identified for reporting, it is required that we use the available reporting tables for calculating the metrics. To calculate these new metrics, we would need to use windowing functions, analytical functions & query set operations.

I took up the task of exploring the possibility of using Materialized Views, for the sake of simplicity of implementation, versus writing another layer of ETL and tables.

Views vs Materialized Views

I had two options, either a View or a Materialized View. These calculated values were to be queried by a service and displayed over a portal, which could be accessed at any time of the day. Using Views would be dead slow as the calculations included aggregations. Calculated data was to be made available directly to be selected for reporting on a portal. Minimal overhead on the query was allowed.

Picture 1 : Data Mart → Materialized Views → Online Portal

So I chose to find the best ways of using Materialized View for this purpose.

I assumed initially that Materialized Views (MViews) are just another type of Views, which had the capability of storing the results. Little did I know I would be writing this story.

Understanding MViews

Materialized Views are Views with storage of the result set on executing the underlying view’s query. They have several mechanisms to refresh the data stored to reflect the changes from base tables.

A Materialized View Log is created on each base table which is involved in the query. On creating this log, a table gets created with predefined name such as “MLOG$_<table_name>”. It contains an identifier such as a primary key or a ROWID from base table, along with it a few auditing information which mentions the change of record(update, insert, delete), validity (old or new) etc.

Materialized views can be created and updated in the following ways:

1. Pre-built table

2. Complete refresh

3. Fast refresh

4. Query Rewrite

To try these out, I designed the following query.

Pre-Built table

This feature helps us to build a materialized view using one of the tables that already has the data. This helps to reuse the storage segment from the table. Hence the MView gets created quickly.

Complete Refresh

To reflect changes in the base tables in the MView, we can do a complete refresh which re-creates the entire MView. This process is resource intensive and the MView is not usable during this refresh. This causes a downtime when refreshing the MView.

Fast Refresh

To use this refresh method, we must create a Materialized View Log on each of the base tables. This View log stores record identifying information into the View Log table.

Picture 2 : Fast Refresh (https://oracle-base.com/articles/misc/materialized-views)

Fast refresh requires that a property be specified when creating the MView. It should be set as “INCLUDING NEW VALUES”, which indicates the Log table to capture old and new values.

If the MView query contains any operators not supported by fast refresh, then Oracle will choose to do a complete refresh.

Restrictions on using Fast refresh for MViews

Some of the restrictions that were found during the prototyping of the calculation of metrics were:

1. Cannot use DISTINCT with aggregate functions

2. Must not use UNION operations

3. Using SYSDATE, ROWNUM, CURRENT_TIMESTAMP does not work

4. Cannot use sub-query expressions

Most of these errors are restrictions to using Fast Refresh. Complete refresh does not impose so many restrictions. When I realized there are a ton of restrictions, I looked into the Oracle documentation:

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

General Restrictions on Fast Refresh from Oracle Docs:

(https://docs.oracle.com/database/121/DWHSG/basicmv.htm#DWHSG8201)

The defining query of the materialized view is restricted as follows:

· The materialized view must not contain references to non-repeating expressions like SYSDATE and ROWNUM.

· The materialized view must not contain references to RAW or LONG RAW data types.

· It cannot contain a SELECT list subquery.

· It cannot contain analytic functions (for example, RANK) in the SELECT clause.

· It cannot reference a table on which an XMLIndex index is defined.

· It cannot contain a MODEL clause.

· It cannot contain a HAVING clause with a subquery.

· It cannot contain nested queries that have ANY, ALL, or NOT EXISTS.

· It cannot contain a [START WITH …] CONNECT BY clause.

· It cannot contain multiple detail tables at different sites.

· ON COMMIT materialized views cannot have remote detail tables.

· Nested materialized views must have a join or aggregate.

Materialized join views and materialized aggregate views with a GROUP BY clause cannot select from an index-organized table.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

Conclusions

Fast refresh Materialized Views cannot be used with the same flexibility as that of an Oracle View to house business logic, due to the list of limitations it comes with.

MViews are not just simple Views with storage. They are much more complicated.

I would use MView in cases when only simple selection & projection is involved in the queries. This would be at the point of consumption of data rather than complex manipulation of data.

What did I do to solve my requirements?

The solution I had to go with was to include an ETL layer which takes care of the data manipulations. This ETL Layer enables reporting in the form of a Star schema. This demands additional coding time which is worth it due to the advantages of building a cube:

  • Flexible to addition of dimensions
  • Plug in any BI tool, and Voila you have all the analytics
  • Manageable volumes
  • Easy to understand business areas so, a self service querying is enabled
  • Access management can be setup on the reporting schema
Picture 3 : Data Mart → ETL → Cube→ Online Portal

--

--