CodeX
Published in

CodeX

Monitor PostgreSQL Performance Using a Database Health Dashboard

Why Monitor PostgreSQL Database Health?

Just like monitoring our applications, it is very much necessary to monitor our database system. We want to monitor things substantially more granular than system-level processes — things such as:

  • How many queries actually make use of the index?
  • How effective is the database cache?
  • The number of open connections.

And so on.

Hardware metrics are essential, but can only get us so far, and will not help us make concrete decisions — like which queries to optimize and how we are using our database in general. Since PostgreSQL is an open-source database, it offers a lot of insights for free without having to rely on costly tools. Most of the metrics we are going to track are available by default since they are required for the query planner, but there are other critical areas for which we have to enable certain extensions/contrib modules that ship along with PostgreSQL.

Introduction to the Arctype Tool

Arctype is a very neat database tool that can be used not just as an SQL editor but also as a platform upon which complex dashboards can be both custom-built and shared around with developers/users. To understand its true power, we are going to consider a practical use case — monitoring a PostgreSQL database health.

Getting Started with Arctype Dashboards

To get ourselves familiar with the dashboard capabilities we are going to build a very simple dashboard with just three components. The Arctype dashboard is built right into the Arctype tool itself. To access the functionality, just click on Dashboards button as shown below.

Click on the plus icon and create a dashboard named — Postgres Health Monitoring.

Now that the dashboard is created, let’s create some components.

Creating a List of PostgreSQL Tables from Information_Schema

Adding a table component/chart component is covered in one of our previous articles. Drag and drop the Table into the empty dashboard pane. Next, we will use a simple query that talks to the underlying database (PostgreSQL) and gets all of the tables with their schemas present excluding the information_schema and the system tables.

SELECT
table_schema, table_name
FROM
information_schema.tables
WHERE
table_schema <> 'information_schema'
AND table_name NOT LIKE 'pg_%'
ORDER BY
table_schema,
table_name;

Visualizing Row Counts of Tables in a Database Schema

Let’s add a Chart component (similar to table) next to our table component which uses the below query:

SELECT
relname as table_name,
reltuples as rows
FROM
pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace)
WHERE
nspname NOT IN ('pg_catalog', 'information_schema')
AND relkind = 'r'
ORDER BY
reltuples DESC;

This query simply counts the records for all the tables in our databases. In the right pane, select table_name for X-Axis, and rows for Y-Axis.

Adding Images to an Arctype Dashboard

After adding the table and chart, let’s add an image to our dashboard from the PostgreSQL website along with a text field for explanation. After adding all of the above steps, our dashboard will look like this:

At this point, you should have a decent feel for dashboards, so now, let’s build a real health dashboard and see how it can help us visualize our database health.

Identifying Critical Areas in the PostgreSQL Database

PostgreSQL has a lot of areas upon which we can focus in order to understand the database’s health. In the below sections, we will see a handful of them and demonstrate how you can use Arctype to visualize them.

Querying to Check PostgreSQL Open Connections

It is always a good idea to monitor how many open connections are currently present in our database cluster. We can get this from the pg_stat_activity table:

SELECT
COUNT(*) as connections,
backend_type
FROM
pg_stat_activity
GROUP BY
backend_type
ORDER BY
connections DESC

Determining the Size of a PostgreSQL Database

As our database expands over time, it is important to keep the database size in check. We can get a list of databases and their respective sizes using the below query:

SELECT
datname as database_name,
pg_database_size(datname)/1024/1024 as size
FROM
pg_database
WHERE
datistemplate = false;

We could use the pg_size_pretty instead of calculating the MB directly, but we want to plot this as a bar chart to see and compare the sizes. Keeping tabs on individual database sizes helps us to get an overall picture.

Note: These sizes can vary from disk size since PostgreSQL does a lot of compression behind the scenes.

Calculating Database Cache-Hit Ratios in PostgreSQL

Any relational database/operational database serves most of its traffic from its cache. If that’s not happening, we definitely need to look into it. The PostgreSQL cache area is known as shared_buffers, which acts as the database layer cache on top of the cache provided by the operating system. We can use the below query to understand the cache hit ratio:

SELECT
sum(heap_blks_read) as reads,
sum(heap_blks_hit) as hits,
ROUND(
sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)),
4
) as hit_ratio
FROM
pg_statio_user_tables;

From the query plan blog, there are a couple of tuples that are coming from heap read and heap hits. The pg_statio_user_tables has the information regarding this. In the above query, we are simply getting the values of hits and reads (coming from the disk) and calculating the ratio rounded off to four digits. It is important to note that in a healthy production database, this ratio should come to around 97+ or close to it—97% of the hits are coming from the cache.

Scanning for Unused Indexes

The statistics collector in PostgreSQL records which indexes were accessed how frequently. We can construct a query from one of the statistics tables called pg_stat_all_indexes.

SELECT
COUNT(*) as count,
relname as table_name
FROM
pg_stat_all_indexes
WHERE
idx_scan = 0
and schemaname = 'public'
GROUP BY
table_name

This kind of gives us an indication of tables that have the greatest number of unused indexes. The idx_scan=0 gives us the indexes that were never used even once. We can customize this value depending on our database traffic pattern.

Inspecting PostgreSQL Database Caches

In order to get more insight out of our cache, we need to enable another contrib module called pg_buffercache. After enabling it and running a few queries we can now inspect the cache:

SELECT
c .relname AS entity_name,
count(*) AS buffers
FROM
pg_buffercache b
INNER JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c .oid)
AND b.reldatabase IN (
0,
(
SELECT
oid
FROM
pg_database
WHERE
datname = current_database()
)
)
WHERE
c .relname NOT LIKE 'pg_%'
GROUP BY
c .relname
ORDER BY
2 DESC;

This query was taken from the official documentation and then modified a little bit. It shows how many pages in shared_buffers are occupied by different tables and indexes. You can assume pages as a metric directly proportional to the number of rows—the larger table being queried, the more pages get moved to the cache. This can also mean that some tables can be moved to an application cache or a fast key-value store, such as Redis, for faster querying—and to free up the database cache for indexes.

Monitoring Average Postgres Query Performance

For this part, we need to enable the pg_stat_statements extension. It is a built-in extension/contrib module, and in some cloud providers, it is enabled by default. If not, it can be easily enabled using the shared_preload_libraries. This extension helps us capture a ton of information regarding query performance. Using this extension, we are going to build a query that outputs the query performance based on the number of rows the query has scanned:

SELECT
mean_time as "timing in ms",
rows
FROM
pg_stat_statements
ORDER BY
mean_time DESC
LIMIT
100;

Limiting the rows and doing the order by gives us the top 100 most consumed queries and the rows column indicates the total number of rows retrieved or affected by the statement. We can also get the top 10 worst queries by running time using the above table. This can be used as an alternative for slow log analysis, as this is more real-time.

Building a Dashboard with PostgreSQL Metrics

If we combine all of the queries in the mentioned chart types, it should look like a proper analytics dashboard that can be easily visualized/consumed:

The mapping for each of the components is shown in the table below. Arctype is not limited to the below charts and has an incredibly feature-rich platform upon which we can represent our visualization.

Each widget here has its own description and the graphs have details on what is present in X-Axis and Y-Axis. It is important that the dashboard is self-explanatory, after all, a picture is worth a thousand words. These charts might look different depending upon the underlying data. They can be expanded and reduced by simply dragging their borders while in edit mode, which can be very useful if the dataset is larger — charts can overflow.

The above dashboard was created in an example database that I had used before. Let’s create the dashboard over a more realistic database such as the Covid DB one that comes inbuilt with Arctype (available during the Signup flow). Since the dataset is huge, I have created the dashboard in two parts:

As we have more data points, the dashboard looks a lot richer and more meaningful. The goal is not to be a “silver bullet” solution, but rather to provide some perspective as to potential areas of focus for an operational database. There are a lot of tools out there that offer different functionalities, but at the end of the day, I believe database developers should know their databases in more depth. What’s better than a tool that easily lets us translate queries into really nice charts and graphs with almost no learning overhead? Answer: probably not much.

Insights and Intelligence from the Health Dashboard

This dashboard allows us to monitor the database in real-time and helps us to more fully understand some features which are unique to PostgreSQL. In doing database optimizations, it is always a rule that “you cannot improve what you cannot measure,” and this dashboard solves the first (yet often overlooked) step, which is actually measuring things. The Covid DB demo database has a 99% cache hit rate, which means that almost all of the traffic is served from the cache (shared_buffers) and will respond very quickly. This is one good sign of a well-optimized database.

Monitoring our database during peak traffic times will give us a lot of insight into which areas can be improved and how to measure this improvement. Some of the ways in which we can improve database performance based upon these metrics are:

  • Using a client-side/server-side connection pool.
  • Code optimizations to release connections back to the pool/close the connection once the work is done.
  • Doing proper database modeling and splitting the traffic between different clusters.
  • Increasing the size of the shared buffers to accommodate more data/indexes in memory.
  • Optimizing poorly written queries by either adding an index or re-writing them entirely.
  • Removing unused indexes to speed up writes.

Whatever we have seen so far is only the tip of the iceberg. Database maintenance and improvement is an art of its own, and one which truly takes several years of hands-on experience. But is always important to use a toolchain that is friendly, cost-effective, and can help us achieve our goal easily.

Keeping the Dashboard Up-to-Date

After building it, it is important to keep the dashboard up-to-date. This can be done using various ways depending on our needs.

Manually Refreshing Arctype Dashboards

At the top right beside the Edit Mode button, there is a reload button, which we can click to reload the entire dashboard.

Automatically Refreshing the Entire Dashboard

We can also configure the whole dashboard to be refreshed periodically.

Refreshing Individual Components Automatically

By clicking on an individual component and clicking on Advanced, we can configure the component in question to be refreshed at specified and automated intervals.

We can choose what kind of refresh type we want, but keep in mind that the queries to monitor the database itself may be resource-intensive and can therefore degrade performance.

Sharing the Dashboard (and Closing Thoughts)

Another very important feature that is missing amongst many database tools, but one that is present in Arctype, is the ability to share dashboards.

By clicking the Share button on the right top screen in the dashboard viewer, we get various options to share.

Sharing an Arctype Workspace via Email

This would invite the user to be part of the workspace. For more advanced controls, check out the teams functionality.

Linking to Your Workspace

This is the same as above, except that anyone with the link can join the workspace.

Creating a Public Read-Only Link

This will send a read-only link that can be viewed in the browser as well. This can be very handy for sharing dashboards with different (perhaps non-technical) stakeholders on a team.

Note: You cannot share a locally-hosted database (or any affiliated dashboards)

Conclusion

These dashboards are incredibly useful when:

  • Running a performance test.
  • Monitoring a live database for a specified period.
  • General database monitoring.

Fine-grained control provided by Arctype makes this dashboard easily shareable across colleagues, teams, and even to the general internet. Go ahead and download Arctype if you haven’t already, and start creating these dashboards — knowing database health and monitoring them should no longer be limited to certain tools or people!

--

--

--

Everything connected with Tech & Code. Follow to join our 900K+ monthly readers

Recommended from Medium

Rails on Docker: How to Run Your Rails App on Docker Swarm

LRU Cache implementation

The 5 Misconceptions of Sponsoring a Virtual Conference

How to bulk change owner of Workflows and Power Automate flows using advanced find in Dynamics 365

Reflections on Quadratic Voting as a Hackathon Judging Mechanism

A better alternative for Docker Desktop?

Load testing with k6 and New Relic

Intro to Python for beginners | a women++ webinar

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Arctype

Arctype

The modern SQL editor built for collaboration www.arctype.com

More from Medium

How to Get Employees from the Workday API with Python

Text that reads “Python Integration Tutorial / How to Get Employees from the Workday API”

Analyzing iMessage with SQL

Analyzing iMessage with SQL

Getting Started With PostgreSQL and Sequelize in 4 Minutes

Specification first driven API development: consumer protection use-case