Views on ‘Materialized views’

Swayam Raina
4 min readSep 28, 2018

--

When I was working for my previous employer (Belzabar Software Design), I worked heavily with Oracle Database. All the SQL database designing concepts studied during graduation made sense. I mostly had to work on writing PL/SQL stored procedures, functions, views, optimising queries, apart from writing JAVA code.
Though being a fresher, I was not into the underlying implementations, but in writing the business logic. Luckily, The organisation had a very flat structure and I usually took part in Tech discussions (or “Water-cooler discussions” they termed it) near my seating space. Thanks to all of my mentors there, I got into the why part of decisions.

This story is about one such discussion where I came to know about Views in Oracle.

Inspiration

Let’s say You are a developer at Amazon marketplace and you are given a task to design a system wherein the Data Analyst team can check the number of orders customer placed every day.
This seems a very easy problem at the first place. One can easily create a dashboard and can write a simple SQL query to get the required data using the below query,

SELECT trunc (order_date) date, count(*) orders
FROM orders
WHERE order_date > ? and order_date < ?
GROUP BY trunc (order_date);

This query would work fine for a small business but considering the scale of Amazon, this might not be an acceptable solution. The “orders” table can easily have 30+ Million records in a month and looping over such a size every time is a crazy idea.
If one looks closely, what does this query returns is simply a few rows of almost static data. The first thing that comes to the mind is to store this data in a separate table and then query this small table.

Well, this is what a Materialised View is. A separate table which is refreshed automatically or on user request.

Working

Creation

Before we look at the query for a materialised view, I feel we should clear out the difference between a View and a Materialised View. In simple terms, View is a virtual table, that means any query to a view results in the actual query used while defining the view whereas a Materialized View is persistent data, that means the query used to define a materialised view creates a special table in the disk memory.

Continuing… We can use the following query to create a materialised view,

CREATE MATERIALISED VIEW orders_mv
REFRESH FAST ON COMMIT as
SELECT trunc (order_date) date, count(*) orders
FROM orders
WHERE order_date > ? and order_date < ?
GROUP BY trunc (order_date);

Refresh

Running this query creates an actual view table in the memory. One point to be noted is that now we have created this almost static table and now we have to handle the updates to the underlying “orders” table.

We have two refresh options,
- on demand: on user action
- on commit: on table update.

And two refresh techniques,
- fast refresh: Refreshes the MV by managing updates by looking at commits from last refresh time.
- complete refresh: Refreshes the MV by running the main query.

A fast refresh uses something called as a materialised view log. This log is maintained as a DB table which we need to create for using fast refresh.

CREATE MATERIALISED VIEW LOG ON orders;

This creates a table named, mlog$_orders. This table is responsible for storing the delta which can be applied directly to the MV.

In case of an on-demand refresh, we can have scheduled jobs to refresh the data in the MV. But there will always be a period where the MV will be out of synch with the actual data. This might be or might not be acceptable again depending on the use case.
There is one more option (ENABLE ON QUERY COMPUTATION) wherein a result aggregator is used which combines the stale data of MV (using in-demand refresh) and the log table to give consistent data. Though this requires a bit of tweaking since as the difference between the two tables increases, the aggregator has to do more work to give the output result set.

For more on aggregation, you can read here.

Trade-off

One can clearly see by using Materialised View, we increase our disk memory consumption. Though these days disk memory is not a topic of concern, there is one more trade-off with fast refresh.
Fast refresh updates the view whenever there is a commit to the underlying table used by the view. This means the Fast refresh mechanism needs to update the View table as soon as the commit is received by reading the log table. This might create a CPU crunch at the application level by making the writes a tiny bit slower.

So the actual trade-off is between fast and consistent reads for slower commits. Thus the use of MV depends on the application use case.

Before using Materialised Views you should take into considerations about the trade-offs and whether or not this fits for your use case. You should also check if the use case can be achieved by some other way so as the trade-off it presents is better than MVs.

Coming back to the water-cooler talks, We spent almost a couple of hours discussing differences between Views and MVs, and when is it better to use MV

--

--