Verify Columnar Engine Usage with AlloyDB Operational Views

Anish Nair
Google Cloud - Community
6 min readJul 17, 2023

Disclaimer: The opinions expressed in this article are my own, and do not represent views of my current or any previous employers.

Background

AlloyDB is a fully-managed, PostgreSQL-compatible database designed for demanding transactional and analytical workloads. Now that being said, there are various attributes which makes it possible for AlloyDB to handle HTAP(Hybrid Transactional/Analytical Processing) workloads in a phenomenal way and one of this is the Columnar Engine.

Though we have the columnar engine to enable us to perform analytics-heavy operations on AlloyDb, its very important that we understand various parameters associated with the columnar engine, be it the columns part of the columnar store, the memory allocation , the invalidations etc. AlloyDb facilitates these by providing multiple operational views which gathers Columnar engine stats in real-time. And in this blog, we will take a look at few of the prominent Views associated with Columnar Engine.

Prerequisites

To understand more on these views, its obvious that these views should have some data. to enable that, we assume that following prerequisites are already adhered to.

  • Create an AlloyDB cluster with columnar instance.
  • Have some analytical and transactional workloads run on the AlloyDB cluster

For ease of understanding lets consider we have 2 Instances in the AlloyDB cluster. The Transactions would run on the Primary instance(Columnar Engine not enabled) and analytical workloads to run on Read-Pool Instance(Columnar Enabled). Also we will make sure that the the Read-Pool instance goes not get restarted, else all the columnar stats would be flushed off.

Now with the context set, lets check few important views and understand on leveraging the information they provide.

The Views

Though we have close to 28+ views available which gives different stats on AlloyDB on multiple verticals like the columnar engine, the index advisory, the schedules etc. we will focus on few prominent views which could be crucial while we work on implementing/debugging columnar engine.

g_columnar_columns

This view gives complete detail of all the columns in the columnar store.

Critical Information Contained:

  • This view could be the first-stop to check all the columns present in the columnar engine at any given point-in time.
  • The Size taken by each column, in the columnar engine is captured in this view.
  • We can also see how many times any specific column has been accessed since the column made it to the Columnar Store.
  • We can also know when was the last time(date time) when the column was accessed.

How to Best Use this Information:

  • In scenarios where we are learning the workload pattern to tune and make sure to leverage AlloyDB columnar Engine to the fullest, we can set the auto-columnarization ON initially, and check the columns which makes it to the Columnar store.
  • We can prioritise the columns which is being looked-up or accessed the most and make sure these columns are always present in the columnar store.
  • With the size information we have, we can effectively plan the memory sizing of the Columnar Engine. By default Columnar Engine acquires 30% of the instance memory. Now with the Size information available, we can plan to cherry-pick required column only which should be part of the Columnar Engine and in certain scenarios, plan to increase/decrease the Columnar Memory allocation.

g_columnar_memory_usage

This view provides information on the columnar memory segregation across multiple pools.

Critical Information Contained:

  • We can check the memory allocation done for columnar engine under multiple pools, along with the currently available memory out of the allocated memory.
  • It also shows memory availability percentage across various pools.

How to Best Use this Information:

  • By default, columnar engine is allocated with 30% of the instance memory. This should reflect in the “main pool”. The “main pool” will ideally be used to populate attributes in columnar format, which further will be utilised while AlloyDB performs columnar scan(conditions applied).
  • Work Pool is more like a shared space which would be used for any general shared memory operations, apart from storage.
  • With information at this level, we can get an idea if the columnar engine memory is being under-utilised or over-utilised. Accordingly we can take necessary steps like controlling columns to be part of the columnar engine, or in required cases, increase/decrease the columnar memory allocated.

g_columnar_units

This view provides an in-depth detail on the columnar units available within the columnar engine

Critical Information Contained:

  • This view can be used to get columnar unit level information of each table(relation) within columnar engine.
  • It given information of start and end of blocks of each columnar unit along with the size consumed by each relation on columnar engine.
  • It gives the swap status of each of the units.
  • Another important information stored in this view is the invalidation block count for each relation

How to Best Use this Information:

  • A columnar unit is a set of data blocks within a columnar table.
  • With the block start and end details and information like invalidation block counts, we can derive the percentage of blocks that has been invalidated(due to DML operations on a table/relation). This could help us take operational decisions to refresh columnar store to negate the invalidations.
  • The Swap status can also give us important information on the amount of information stored in memory for each relation. This information again could be if a relation is completely available in memory, or its moving/moved to the local storage. so if there are important relations which are crucial to be on columnar memory but might have moved to the local storage, we can take appropriate actions to amend this. Few enums that could be used to understand the swap status more is listed below for reference:
Columnar Unit Swap-Status

g_columnar_relations

This view provides a complete list of tables in the columnar engine.

Critical Information Contained:

  • We get to know all the tables(relations) currently present in the Columnar Engine.
  • We can check how much columnar memory every tables holds in the columnar engine.
  • We get the details on how many Columnar Units every table holds.
  • It contains details of number of blocks invalidated.
  • It contains information if a refresh has happened for any table or not and its corresponding status.

How to Best Use this Information:

  • With this information, we can check how much of columnar memory is being used by each table(relation) within columnar store. Also information on number of blocks associated with each table as well as how many of those blocks are present in the Columnar Store, are also provided. This could help us to analyse and anticipate the performance associated with each table, coz with more blocks present within the columnar store, better will be the performance while accessing such tables.
  • Here we can find information like if auto-refresh has happened for a particular table given that the table has invalid blocks available. So this plays an important role while we have auto-refresh is scheduled(by default is every 1 hour)

These were few important views which could be used to understand and hence enhance our experience with columnar store. There are some amazing blogs on Exploring AlloyDB leveraging these views written by Deepak Mahto which could be found here and here.

--

--