Speeding Up PostgreSQL in 3 Easy Steps

Here at SATS, our Database Administrators have seen it all. One of the common issues with our clients who use Postgres is that they would like to allow frequent, quick access to big and active tables. However, all those big joins and active connections end up not performing as expected. Our DBAs explain how to speed up your Postgres using Matviews.

For the live sample, let’s take two large tables : orders and customers.

Table "public.customers"
Column | Type | Modifiers
----------------------+-----------------------+---------------------------------------------------------------
customerid | integer | not null default nextval('customers_customerid_seq'::regclass)
firstname | character varying(50) | not null
lastname | character varying(50) | not null
address1 | character varying(50) | not null
address2 | character varying(50) |
city | character varying(50) | not null
state | character varying(50) |
zip | integer |
country | character varying(50) | not null
region | smallint | not null
email | character varying(50) |
phone | character varying(50) |
creditcardtype | integer | not null
creditcard | character varying(50) | not null
creditcardexpiration | character varying(50) | not null
username | character varying(50) | not null
password | character varying(50) | not null
age | smallint |
income | integer |
gender | character varying(1) |
Indexes: "customers_pkey" PRIMARY KEY, btree (customerid)
Referenced by: TABLE "cust_hist" CONSTRAINT "fk_cust_hist_customerid" FOREIGN KEY (customerid) REFERENCES customers(customerid) ON DELETE CASCADE
TABLE "orders" CONSTRAINT "fk_customerid" FOREIGN KEY (customerid) REFERENCES customers(customerid) ON DELETE SET NULL
Table "public.orders"
Column | Type | Modifiers
-------------+---------------+----------------------------------------------------------
orderid | integer | not null default nextval('orders_orderid_seq'::regclass)
orderdate | date | not null
customerid | integer |
netamount | numeric(12,2) | not null
tax | numeric(12,2) | not null
totalamount | numeric(12,2) | not null
Indexes:
"orders_pkey" PRIMARY KEY, btree (orderid)
"ix_order" btree (orderdate)
"ix_order_custid" btree (customerid)
Foreign-key constraints:
"fk_customerid" FOREIGN KEY (customerid) REFERENCES customers(customerid) ON DELETE SET NULL
Referenced by: TABLE "orderlines" CONSTRAINT "fk_orderid" FOREIGN KEY (orderid) REFERENCES orders(orderid) ON DELETE CASCADE

Say there are lots of rows in both tables: select count(*) from customers; -> 5120000 and select count(*) from orders; 12288000 rows.

Management needs to see a list of “heavy shoppers” for the last 7 days displayed often. But selecting the following to provided the needed data runs for too long.

SELECT o.customerid, c.firstname, c.lastname, sum(o.totalamount) SUM
FROM orders o, customers c
WHERE c.customerid=o.customerid and o.orderdate > now() - interval '7 day'
GROUP BY o.customerid, c.customerid ORDER BY SUM DESC LIMIT 100;

The problem is, of course that it runs almost 6 seconds (total runtime at 5602.532 ms) for 18 rows.

What can we do? Indexing and partitioning will only get you so far. A real solution is to “cache” a view to fit the select statement (frequently used and natively available in ORACLE) called Materialized Views (or Matviews).

While no native matview support exists in PostgreSQL as of version 9.4, it is relatively easy to create a pglsql logic to control the situation. The logic below uses matviews and refreshes them into tables as it goes.

Step 1
Creation of matviews functions and data dictionary

Create the mat view table (data dictionary). This will include the original view for each matview, the last refresh date and duration.

CREATE TABLE matviews (    
mv NAME NOT NULL PRIMARY KEY
, view NAME NOT NULL
, last_refresh TIMESTAMP WITH TIME ZONE
, refresh_time INTEGER
);

Create the create_matview, drop_matview and refresh_matview as per below:

CREATE OR REPLACE FUNCTION create_matview(NAME, NAME)
RETURNS VOID
SECURITY DEFINER
LANGUAGE plpgsql AS '
DECLARE
matview ALIAS FOR $1;
view_name ALIAS FOR $2;
entry matviews%ROWTYPE;
BEGIN
SELECT * INTO entry FROM matviews WHERE mv = matview;
IF FOUND THEN
RAISE EXCEPTION ''Materialized view ''''%'''' already exists.'',
matview;
END IF;
EXECUTE ''REVOKE ALL ON '' || view_name || '' FROM PUBLIC'';
EXECUTE ''GRANT SELECT ON '' || view_name || '' TO PUBLIC'';
EXECUTE ''CREATE TABLE '' || matview || '' AS SELECT * FROM '' ||view_name;
EXECUTE ''REVOKE ALL ON '' || matview || '' FROM PUBLIC'';
EXECUTE ''GRANT SELECT ON '' || matview || '' TO PUBLIC'';
INSERT INTO matviews (mv, view, last_refresh,refresh_time) VALUES (matview, view_name, CURRENT_TIMESTAMP,0);
RETURN;
END ';
CREATE OR REPLACE FUNCTION drop_matview(NAME) RETURNS VOID
SECURITY DEFINER
LANGUAGE plpgsql AS '
DECLARE
matview ALIAS FOR $1;
entry matviews%ROWTYPE;
BEGIN
SELECT * INTO entry FROM matviews WHERE mv = matview;
IF NOT FOUND THEN
RAISE EXCEPTION ''Materialized view % does not exist.'', matview;
END IF;
EXECUTE ''DROP TABLE '' || matview;
DELETE FROM matviews WHERE mv=matview;
RETURN;
END ';
CREATE OR REPLACE FUNCTION refresh_matview(name) RETURNS VOID
SECURITY DEFINER
LANGUAGE plpgsql AS '
DECLARE
matview ALIAS FOR $1;
entry matviews%ROWTYPE;
start_time TIMESTAMP WITH TIME ZONE;
BEGIN
start_time:= timeofday()::timestamp;
RAISE NOTICE ''Matview Refresh Start Time is %s'',start_time;
SELECT * INTO entry FROM matviews WHERE mv = matview;
IF NOT FOUND THEN
RAISE EXCEPTION ''Materialized view % does not exist.'', matview;
END IF;
EXECUTE ''DELETE FROM '' || matview;
EXECUTE ''INSERT INTO '' || matview
|| '' SELECT * FROM '' || entry.view;
RAISE NOTICE ''Matview Refresh End Time is %s (total %seconds)'',
timeofday()::timestamp,round(extract(epoch from(timeofday()::timestamp-start_time)));
UPDATE matviews
SET last_refresh= timeofday()::timestamp,
refresh_time= round(extract(epoch from ( timeofday()::timestamp-start_time)))
WHERE mv=matview;
RETURN;
END ';

Step 2
Creation of the mission-specific view

To create the matview we will use the create_matview function.

bar=# SELECT create_matview('mv_top_customer', 'v_top_customer');  
create_matview
----------------
(1 row)
Time: 5559.486 ms

The mv_top_customer matview is created and can now be used. You can even create indexes on the matview table as needed (though it is suggested to drop them before refresh in some cases).

From the matview, select and it will run in a fraction of the time, as opposed to the original select from source tables)

bar=# explain analyze select * from mv_top_customer;   
Seq Scan on mv_top_customer (cost=0.00..12.70 rows=270 width=272) (actual time=0.008..0.020 rows=100 loops=1)
Total runtime: 0.044 ms
(2 rows)
Time: 0.397 ms

Step 3
Administration of the matview

To refresh , run select refresh_matview(‘mv_top_customer’);

bar=# select refresh_matview('mv_top_customer');   
NOTICE: Matview Refresh Start Time is 2013-10-21 23:05:03.297062+03s
NOTICE: Matview Refresh End Time is 2013-10-21 23:05:08.675053s (total 5 seconds)
refresh_matview
-----------------
(1 row)
Time: 5398.158 ms

Note that refresh still runs the original 6 seconds but we don’t care as it is “behind the scenes”

Last step is to create a simple crontab script to refresh the matview as needed, something along the lines of following will work:

postgres@test-01:~$ crontab -l   
0 * * * * echo "select refresh_matview('mv_top_customer')" | psql -d bar > /tmp/refresh.log 2>&1

The matviews table will let you know once how long the last refresh took and when the rest refresh occurred.

bar=# select * from matviews;   
mv | view | last_refresh | refresh_time -----------------+----------------+-------------------------------+--------------
mv_top_customer | v_top_customer | 2013-10-21 23:05:08.675362+03 | 5

If needed, matview could be dropped using drop_matview(‘mv_top_customer’).

You can change the origin view definition (v_top_customer in this case) and this will be updated in the next matview. Just note that every change in the view columns or table definitions into the view will break the matview. You will have to dropp the matview and recreate it.

Having trouble? Feel free to contact our DBAs for assistance here.

)
Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade