How to increase query execution speed to 5X using materialized view

Anish patel
4 min readApr 17, 2019

As a software developer, you must be interacting with databases on a regular basis. While it is quite possible that you might be using some kind of ORM to interact with the database and generally avoid raw SQL, there will be a time when your project will have some crazy requirement which will require you to join multiple tables with custom logic. Especially if you are working on data analysis or BI-related project like me. I know you can write a raw query with ORM, you or your users won’t be happy with its performance since each time user hits the API your code will execute this complex query with multiple joins. One solution to this problem is Materialized view. In this article, you will learn about what is materialized view, the difference between view and materialized view, how to create a materialized view in PostgreSQL, when to NOT use it and its disadvantages.

What is a materialized view?

A materialized view is nothing but a physical table which is a result of some complex query. A query is executed once and the data is stored in the form of the actual table in the memory. A materialized view is a database object similar to some simple table, but it is handled differently than table in some scenarios. Such as when you try to delete/drop the materialized view which is being used somewhere PostgreSQL won’t allow you to do so. So by default, it keeps some kind of reference count for a materialized view. Other than that It can be operated as a simple table. For example, you can add indexes, primary key to it. We can think of a materialized view as a cache for expensive queries but in a macro form. While normal micro level caching is handled by a database engine itself, a materialized view gives power in the hand of software developer/DBA. In Oracle, it is being used for a completely different purpose. Which is to keep snapshots of different database objects of different master sites to many non-master sites.

Difference between view and materialized view

  • A view is virtual and does not really exist in the disk while a materialized view is actually stored in the disk.
  • A view is generally used to implement an access control mechanism which limits what each type of user can “view”. A materialized view is used to increase the performance of complex queries.
  • A view gets updated automatically each time it gets executed. A materialized view needs to be updated manually.
  • A view is generated each time it is being queried. A materialized view is precomputed and stored in the disk.

Materialized view in PostgreSQL

Syntax:

CREATE MATERIALIZED VIEW view_name
AS
query
WITH [NO] DATA;

view_name : It is the name of a view.

query: It is the SELECT query which will supply data for the materialized view.

WITH DATA/ WITH NO DATA: This option specifies whether to load data into materialized view the creation time or not. WITH DATA loads data at the creation time and WITH NO DATA does not.

Example:

CREATE MATERIALIZED VIEW my_view
AS
SELECT *
FROM public.user as u
INNER JOIN public.blog as b
ON u.userid = b.userid
INNER JOIN public.comment as c
ON b.blogid = c.blogid
WITH DATA;

Refresh materialized view

You also need to refresh the materialized view if tables which are being used in the query are updated daily.

REFRESH MATERIALIZED VIEW my_view;

While refreshing view like this, Postgres locks the entire table of the materialized view so you can not query it. Since query which generated the materialized view is fairly complex that’s why we used materialized view in the first place, it is safe to assume that it will take a long time to refresh the view. A way around this problem is CONCURRENTLY.

REFRESH MATERIALIZED CONCURRENTLY VIEW my_view;

With CONCURRENTLY option, Postgres creates a temporary updated version of the view and then writes only the changes to the original version. This way the original table does not get locked and stay available for querying.

Note: Prerequisite for using CONCURRENTLY is that view needs to have one UNIQUE index.

Remove materialized view

DROP MATERIALIZED VIEW my_view

Note: If the view is being actively used somewhere Postgres won’t allow you to remove the materialized view.

When to NOT use materialized view

  • When your query is fairly simple but it is taking a long time to execute just because the table has a large number of data, in such scenario one should not use this feature.
  • When tables that are being used are updated frequently and your requirement needs you to provide real-time updated results, at such time it does not make much sense to refresh the materialized view frequently then use that same view.
  • Any requirement in which a frequency of refreshing view is either similar or more than actually using that view, you should not use materialized view.

Disadvantages of a materialized view

  • You need to refresh them manually or set cron job in order to refresh them. Which allows you to get the correct result.
  • It occupies actual space in the memory so one should be very specific and assess the whole situation before deciding to create the materialized view.

Conclusion

In summary, a materialized view is a very powerful feature if used correctly. I have used on many occasions which offered me crazy speed improvements, in some cases query execution time reduced from 40 seconds to 3–4 seconds. Which I think is really astonishing.

--

--

Anish patel

I am current graduate student of CS at UT Dallas. I love coding, sketching and sometimes writing. I write about software engineering and human mind. Stay tuned.