Improving massive Oracle data manipulation using collections

Agustin Botana
Strands Tech Corner
3 min read · Apr 30, 2019

Collections are the PL/SQL way to implement arrays, lists, queues, and similar structures, offering a full set of operations to navigate and edit the stored data more easily.

You can find some tips about collections and further documentation on Steven Feuerstein’s well-known blog “Obsessed with Oracle PL/SQL”.

Here, we will focus on a practical case involving Oracle collections to solve a real challenge for one of our clients, exploiting the ability to manipulate a collection in two ways: as an Oracle table and as an integer-indexed array.
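As a minimal sketch of these two access styles (all type and variable names here are illustrative, not the client’s schema), a collection can be walked by integer index or queried like a table through the TABLE() operator:

```sql
-- Illustrative only: type and variable names are assumptions.
DECLARE
  -- A nested table type: can be queried like an Oracle table via TABLE()
  -- (pre-12c this requires a schema-level type visible to SQL)
  TYPE t_amount_list IS TABLE OF NUMBER;
  v_amounts t_amount_list := t_amount_list(100, 250, 75);

  -- An associative array: navigated by integer index
  TYPE t_name_map IS TABLE OF VARCHAR2(50) INDEX BY PLS_INTEGER;
  v_names t_name_map;
BEGIN
  -- Integer-index navigation
  v_names(1) := 'ALERT_A';
  v_names(2) := 'ALERT_B';
  FOR i IN 1 .. v_amounts.COUNT LOOP
    DBMS_OUTPUT.PUT_LINE('Amount ' || i || ': ' || v_amounts(i));
  END LOOP;

  -- Table-style access via the TABLE() operator
  FOR r IN (SELECT COLUMN_VALUE AS amount FROM TABLE(v_amounts)) LOOP
    DBMS_OUTPUT.PUT_LINE('Row: ' || r.amount);
  END LOOP;
END;
/
```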

The Challenge

The client asked us to implement an Oracle procedure, called from an endpoint, that checks a variable number of conditions against a huge volume of transactions. An alert message should be triggered every time any of those conditions is met.

For each user transaction, the procedure has to evaluate every one of the conditions to be validated. As you can see, this is a Cartesian product of millions of transactions with dozens of alert conditions.

To illustrate this scenario, we can have an average of 40 alerts, 5M users, and 10K transactions for each user. On average, we create 2 alerts for each user.

A more traditional approach would be:

  1. A cursor on the USERS table
  2. A cursor to go through all of the user’s transactions one by one
  3. For each transaction, go through each alert definition
  4. Check the alert condition against the transaction just read
  5. If the condition is met, insert a row into the Messages table
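The steps above can be sketched roughly as follows (table and column names are illustrative, and the alert condition is reduced to a simple threshold check for the example):

```sql
-- Sketch of the traditional row-by-row approach; names are assumptions.
BEGIN
  FOR u IN (SELECT user_id FROM users) LOOP
    FOR t IN (SELECT txn_id, amount
                FROM transactions
               WHERE user_id = u.user_id) LOOP
      FOR a IN (SELECT alert_id, threshold
                  FROM alert_definitions) LOOP
        -- One check per transaction per alert definition:
        -- this is where the trillions of sequential reads come from
        IF t.amount > a.threshold THEN
          INSERT INTO messages (user_id, txn_id, alert_id)
          VALUES (u.user_id, t.txn_id, a.alert_id);
        END IF;
      END LOOP;
    END LOOP;
  END LOOP;
  COMMIT;
END;
/
```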

In this approach, we would hit the database with (5MM x 10k x 40) = 2 trillion sequential reads and (5MM users x 2 alerts per user) = 10MM sequential writes.

Our Solution

Using the capabilities of PL tables (collections) in Oracle, the approach changes to:

  1. A cursor loading a PL table, bulk-collecting all the alert definitions once
  2. A cursor on the USERS table
  3. For each user, a cursor loading a PL table in bulk collect mode with all of that user’s transactions
  4. For each in-memory transaction, go through the alert definition array
  5. Check the alert condition for the transaction
  6. If it is met, create a message array element
  7. Once execution is complete, store the content of the alert-message PL table in the database, using the APPEND hint to speed up the insert
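A sketch of this collection-based approach might look like the following (again, the schema and the threshold condition are illustrative; note that for a FORALL ... INSERT ... VALUES statement the direct-path hint is spelled APPEND_VALUES, available since 11gR2):

```sql
-- Sketch of the bulk-collect approach; schema names are assumptions.
DECLARE
  TYPE t_alerts IS TABLE OF alert_definitions%ROWTYPE INDEX BY PLS_INTEGER;
  TYPE t_txns   IS TABLE OF transactions%ROWTYPE      INDEX BY PLS_INTEGER;
  TYPE t_msgs   IS TABLE OF messages%ROWTYPE          INDEX BY PLS_INTEGER;

  v_alerts t_alerts;
  v_txns   t_txns;
  v_msgs   t_msgs;
  v_n      PLS_INTEGER := 0;
BEGIN
  -- 1. Load all alert definitions once, in bulk
  SELECT * BULK COLLECT INTO v_alerts FROM alert_definitions;

  -- 2. Cursor on the users table
  FOR u IN (SELECT user_id FROM users) LOOP
    -- 3. One bulk read for all of this user's transactions
    SELECT * BULK COLLECT INTO v_txns
      FROM transactions
     WHERE user_id = u.user_id;

    -- 4-6. All checks happen in memory, at RAM speed
    FOR i IN 1 .. v_txns.COUNT LOOP
      FOR j IN 1 .. v_alerts.COUNT LOOP
        IF v_txns(i).amount > v_alerts(j).threshold THEN
          v_n := v_n + 1;
          v_msgs(v_n).user_id  := u.user_id;
          v_msgs(v_n).txn_id   := v_txns(i).txn_id;
          v_msgs(v_n).alert_id := v_alerts(j).alert_id;
        END IF;
      END LOOP;
    END LOOP;
  END LOOP;

  -- 7. One bulk, direct-path write of the whole message array
  FORALL k IN 1 .. v_n
    INSERT /*+ APPEND_VALUES */ INTO messages VALUES v_msgs(k);
  COMMIT;
END;
/
```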

With this new approach, we read the database (5MM x 10k) + 40 = roughly 50 billion times, with 10MM writes. This comes with the added benefit that all the calculations and the initial inserts are made at RAM speed instead of disk speed. In this way we avoid I/O waits, and most of the reads and writes are made in bulk mode instead of one record at a time, as is the case with a standard cursor.

Conclusion

With Oracle's PL table functionality, we can write cleaner and much more efficient code for interacting with large databases in scenarios where a huge amount of data must be traversed over and over again.
