Migrating from Oracle to Postgres? Be aware of FILLFACTOR and Indexing implications
We know how useful Indexes are to speed up SQLs in relational databases. Though indexes add some overhead on DMLs processing because with any DML on a table, all the corresponding Indexes need to get updated.
Postgres has become quite popular open source database and many organizations are moving their workload from Oracle to Postgres. One of the major difference between these two RDBMS is their different architecture to handle multi version concurrency control (MVCC). Let’s understand that.
How UPDATE query works in Oracle
In Oracle, when an user runs an UPDATE query to update a column, the old image of that row stores in global UNDO segment which is a dedicated undo space, managed separately. In this scenario because Oracle changes the row in place and hence row location (known as rowid in Oracle) does not get change and there is no need to change corresponding index entry (unless UPADTE query modifying the data of an Index column itself).
How UPDATE query works in PostgreSQL
In PostgreSQL, when an user runs an UPDATE query to update a column, it marks the old row as dead tuple (aka row) and creates a new version of the row with UPDATED column data(out of place change). Here because row’s physical location (known as ctid in Postgres) gets change and it seems with every UPDATED row, corresponding index entries will also need to change. That looks like a huge disadvantage in Postgres if all the Index entries need to change with every UPDATED row.
Wait! Postgres developer community came up with nice optimization in this process, called HOT (Heap Only Tuples) where even with ctid changes for a row, it does not need to change the Index entry and it puts a locator inside table block which points to the new row location within the same block. HOT also does a block level mini VACUUM at some point (when space pressure comes on the block) to clean up the dead tuples.
With HOT feature in Postgres, it is comparable with Oracle and any UPDATED row will change the Index entry only when UPDATE query is changing a column data which is part of any Index.
Next Question: What if there is no free space in the current block to accommodate the new UPDATED row?
In this scenario Oracle and Postgres behave in different manner.
In Oracle, it updates the row in place so a block can be full only if new UPDATED row size is bigger than old row size. In this case that row will be migrated to a new block and a pointer to the new row id will be there in the current block.
- Pros — No need to change the Index entry as Index entry will still points to the old block.
- Cons — Index scan will become slower because Index entry still points the row location to the old block and then old block will point the row location to the new block and fetching a row is then two-step process and hence slower.
In Postgres, when new UPDATED row does not find space in the current block, it will get store in a new block with free space. Here instead of putting a locator in the old block, Postgres chooses to change all the corresponding Index entries.
- Pros — There is no impact on Index scan as Index entry for the row contains actual ctid of the row.
- Cons — Because all the Index entries need to change, that cause following problems.
- Higher latency of UPDATE statement because every row UPDATE need to change table entry as well as all the Index entries.
- Long running VACUUM as vacuum need to clean index dead tuples along with table.
- Write amplification and as a result more data to be sent to all the Standby/Read replicas.
Summary: By default Postgres tables gets created with FILLFACTOR 100 and due to its explained MVCC architecture, with UPDATE statement the new row has to go to new block resulting all the INDEX updates as well. To prevent these INDEX updates, we need to reduce the FILLFACTOR value for table which gets high UPDATEs and also get rid of any unwanted INDEX or INDEX column (in case any column in these INDEXes is getting UPDATED).
Let’s do more deep dive on Postgres FILLFACTOR in next post.