Benchmarking results between BDA and Exadata with different parallel read and parallel insert parameters

Published in

oracle-bda-exadata-benchmark-for-dwh

Nov 30, 2020

These benchmark results were obtained while solving a slow ETL execution time problem for an event table in the DWH, which is sourced from a Hive table on BDA and an Oracle table on Exadata.

Before the explanation, it is important to describe the environments and parameters that have an impact on the benchmark figures. The servers are Oracle BDA X8-2 and Oracle Exadata X3-2 Quarter Rack. Oracle DB 11g, Hive on Cloudera, and Oracle Data Integrator are used. Even though the tests were run under the same conditions in a live environment, it is important to emphasize that I/O, CPU consumption, concurrently executing jobs, and maintenance on both environments all affect the results.

The data size of the Hive table on BDA for a one-day partition is around 8 GB, stored as compressed Avro files. It contains roughly 60 million rows.
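To put these figures in perspective, the short calculation below estimates the compressed size per row and how many rows each reader would handle at a given degree of parallelism. It is only a back-of-the-envelope sketch: it assumes binary gigabytes and a perfectly even split of rows across parallel readers, which a real system will not deliver exactly.

```python
# Rough sizing for one daily partition, using the figures quoted above.
COMPRESSED_BYTES = 8 * 1024**3   # ~8 GB (assumption: binary gigabytes)
ROWS = 60_000_000                # ~60 million rows

# Average compressed footprint of a single row.
bytes_per_row = COMPRESSED_BYTES / ROWS


def rows_per_reader(total_rows: int, degree: int) -> int:
    """Rows each parallel reader would process under an idealized even split."""
    if degree < 1:
        raise ValueError("degree of parallelism must be >= 1")
    return total_rows // degree


print(f"about {bytes_per_row:.0f} compressed bytes per row")
for degree in (1, 4, 8):
    print(f"parallel {degree}: {rows_per_reader(ROWS, degree):,} rows per reader")
```

The per-reader row count shrinks linearly with the degree of parallelism, but the elapsed time usually does not, because the readers compete for the same I/O and CPU.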

After running the ETLs with different degrees of parallelism for the external table creation and for the insert, the results are as shown below.

Even though it seems that this job finishes fastest with the maximum parallelism setting, that may not be feasible, since other jobs run in parallel with this one. Resources are not unlimited for on-premises applications. Therefore, reading the data from Hive with a parallelism of 4 and then inserting with a parallelism of 8 on Exadata may be the better solution from an operational point of view.
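As an illustration of what a "4 read, 8 insert" configuration could look like, the sketch below builds the Oracle statements for a parallel direct-path insert from an external table. It is not the ETL used in the benchmark: the table names `dwh.event_fact` and `stg.event_hive_ext` are hypothetical, and it assumes the degrees of parallelism are set through `PARALLEL` hints rather than through the external table definition or the Oracle Data Integrator knowledge module options.

```python
from typing import List


def build_parallel_insert(target: str, source: str,
                          read_dop: int, insert_dop: int) -> List[str]:
    """Return the SQL statements for a parallel insert-select.

    target and source are hypothetical table names supplied by the caller;
    read_dop applies to the scan of the source, insert_dop to the insert.
    """
    if read_dop < 1 or insert_dop < 1:
        raise ValueError("degree of parallelism must be >= 1")
    return [
        # Parallel DML must be enabled in the session before the insert.
        "ALTER SESSION ENABLE PARALLEL DML",
        # APPEND requests a direct-path insert; PARALLEL sets the degree
        # separately for the target (alias t) and the source (alias s).
        f"INSERT /*+ APPEND PARALLEL(t, {insert_dop}) */ INTO {target} t "
        f"SELECT /*+ PARALLEL(s, {read_dop}) */ * FROM {source} s",
        "COMMIT",
    ]


# Hypothetical table names, with the 4/8 split suggested above.
statements = build_parallel_insert("dwh.event_fact", "stg.event_hive_ext", 4, 8)
for statement in statements:
    print(statement + ";")
```

Keeping the two degrees as separate parameters makes it easy to lower one side when other jobs need the same resources, without touching the other.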

Written by Halil Baysal