Bulk data processing within PL/SQL

Published in

DigiGeek

3 min readNov 27, 2022

If you are working on much more data within any language, you need to process the data as bulk. In this article, we will analyze how bulk processing increases in PL/SQL language. For this method, we will use two techniques that BULK COLLECT and FORALL.

Let’s create a salary table for this, and fill in the table.

CREATE TABLE SALARY (
  EMPLOYEE_NUMBER NUMBER(18),
  SALARY NUMBER(18),
  DEPARTMENT VARCHAR2(1),
CONSTRAINT PK_SALARY PRIMARY KEY (EMPLOYEE_NUMBER)
);

CREATE INDEX IX_SALARY_EMPLOYEE_NUMBER ON SALARY(DEPARTMENT);

BEGIN

  FOR ix IN 1..1000000 LOOP
       INSERT INTO SALARY VALUES (ix, CEIL(dbms_random.value(5,15)) * 1000, 'A');
  END LOOP;
  COMMIT;
  FOR ix IN 1000001..1200000 LOOP
       INSERT INTO SALARY VALUES (ix, CEIL(dbms_random.value(20,60)) * 1000, 'B');
  END LOOP;
  COMMIT;
  FOR ix IN 1200001..2000000 LOOP
       INSERT INTO SALARY VALUES (ix, CEIL(dbms_random.value(10,20)) * 1000, 'C');
  END LOOP;
  COMMIT;
END;

As seen above, we inserted 2 million data to our salary table, which of 1 million is A, 200 thousand is B, and 800 thousand is C.

Let’s say the employees in the A department should receive a 40% raise, the employees in the B department should be given a 30% raise, and the employees in the C department should be given a 35% raise.

For this scenario, let’s basically try to solve the problem with the following method.

BEGIN
  FOR recSalary IN (SELECT * FROM SALARY) LOOP
      IF recSalary.DEPARTMENT = 'A' THEN
         UPDATE SALARY SET SALARY = recSalary.SALARY * 1.4
         WHERE EMPLOYEE_NUMBER = recSalary.EMPLOYEE_NUMBER;
      ELSIF recSalary.DEPARTMENT = 'B' THEN
         UPDATE SALARY SET SALARY = recSalary.SALARY * 1.3
         WHERE EMPLOYEE_NUMBER = recSalary.EMPLOYEE_NUMBER;
      ELSIF recSalary.DEPARTMENT = 'C' THEN
         UPDATE SALARY SET SALARY = recSalary.SALARY * 1.35
         WHERE EMPLOYEE_NUMBER = recSalary.EMPLOYEE_NUMBER;
      END IF;
      COMMIT;
  END LOOP;
END;

As seen above method, we can see that we fetch the all salary with full table access. Therefore we checked these department singly. Finally we updated their salary as their department. This process took 107.52 seconds.

When we start to think how we can improve the performance of this code, we see firstly ‘context switching’ issue. In the oracle database, we can run SQL queries like SELECT, UPDATE, INSERT etc, besides we can run PL/SQL codes like IF, LOOP etc. In this case, we firstly use select query to get salary table rows, we asked department with PL/SQL codes, and we finally updated with raised salary. Likewise we did this for all rows. This state is called context switching. To improve the performance of this code, first of all we need to decrease context switching.

SELECT EMPLOYEE_NUMBER, DECODE(DEPARTMENT, 'A', SALARY * 1.4,
                                           'B', SALARY * 1.3,
                                           'C', SALARY * 1.35) AS RAISED_SALARY
FROM SALARY

The above code fetches us raised salaries in one query. To bulk insert, we must use BULK COLLECT and FORALL commands.

BULK COLLECT: a clause that lets you enter multiple rows into a collection
FORALL: a feature that lets you run the same DML statement multiple times for different values

DECLARE 
  CURSOR cList IS 
  SELECT EMPLOYEE_NUMBER, DECODE(DEPARTMENT, 'A', SALARY * 1.4,
                                             'B', SALARY * 1.3,
                                             'C', SALARY * 1.35) AS RAISED_SALARY
  FROM SALARY;
  
  TYPE TLIST IS TABLE OF cList%ROWTYPE;
  vList TLIST;
BEGIN

   OPEN cList;

   LOOP
      FETCH cList
      BULK COLLECT INTO vList
      LIMIT 1000;
      
      FORALL ix IN 1 .. vList.COUNT
      UPDATE SALARY SET SALARY = vList(ix).RAISED_SALARY
      WHERE EMPLOYEE_NUMBER = vList(ix).EMPLOYEE_NUMBER; 
      
      COMMIT;
      EXIT WHEN cList%NOTFOUND;
   END LOOP;
   CLOSE cList;
END;

In above declare code block, we defined the cList cursor for the select query. We defined a new TYPE as TLIST to keep these cursor rows, and we defined vList from the new type TLIST.

In the begin-end block, we started with opening the cursor cList. And we put the query results into the vList parameter with a limit of 1000. We updated the salaries for each 1000 rows of data. Finally because there is no data left to process, we exit the loop and close the cursor. This process took 31.19 seconds.

Of course, we could update all the data at once in bulk. Let’s analyze this code also.

UPDATE SALARY SET SALARY = DECODE(DEPARTMENT, 'A', SALARY * 1.4,
                                             'B', SALARY * 1.3,
                                             'C', SALARY * 1.35)

This process took 27.77 seconds. Although we didn’t gain much performance, we locked the table for 27 seconds. Therefore, we do not recommend this method if the table is actively used.

Bulk data processing within PL/SQL

Written by Soner Oz