How to run the TPC-DS benchmark in Oracle Autonomous Data Warehouse

Ryan Lu
4 min readDec 8, 2022

--

TPC-DS is a decision support benchmark that models several generally applicable aspects of a decision support system, including queries and data maintenance. The benchmark provides a representative evaluation of performance as a general-purpose decision support system.

Today, our primary focus is to run TPC-DS on Oracle Autonomous Data Warehouse.

Step 1:

Go to tpcds-kit GitHub page and follow the setup in README. I used an Oracle Linux instance from Oracle Cloud Infrastructure for data generation for computing power.

Oracle Linux VM configuration

Add configure a boot volume to store the generated data. For this demo, I set the boot volume to 200 GB with 10 VPUs/GB (decent performance for IO-demanding workloads)

Boot Volume configuration

Once both VM and boot volume is provisioned, attach the boot volume to the root partition.

SSH into the VM and git clone the tpcds-kit repository. In the tpcds-kit repository, under /tools look for How_To_Guide-DS-V2.0.0.docx. It is a complete guide on how to generate data and run queries on DBMS. Run Make once everything has been configured.

Step 2:

Once make command has been run. Create a new repository to store generated data. Run the following command to generate 100 GB of data in the data folder.

dsdgen –scale 100 –dir ../data/

Or you can generate data in parallel

./dsdgen -scale 100 -dir ../data/ -parallel 2 -child 1 &
./dsdgen -scale 100 -dir ../data/ -parallel 2 -child 2

This process will take around 30 mins to an hour, depending on what machine you are running on.

Once the terminal prompts done

Install OCI CLI on this VM and configure OCI CLI. Create a bucket in OCI.

Put generated file into this bucket.

Step 3:

Go to Autonomous Database and launch Database actions.

Launch SQL

Create tables with the SQL script provided in the tpcds-kit/tools/

tpcds_source.sql and tpcds-kit/tools/tpcds.sql

Launch Data Load

Select Load Data and Cloud Store

Add Cloud Store Location and create new credential

Auth Token can be generated under user profile

Step 4:

Launch SQL

Run the following script to copy data from the bucket to ADW

set define on
define file_uri_base = 'https://objectstorage.ap-tokyo-1.oraclecloud.com/n/{enter name space name}/b/{enter bucket name}/o'
begin
dbms_cloud.copy_data(
table_name =>'{enter table name}',
file_uri_list =>'&file_uri_base/{enter table name}.dat',
credential_name =>'{enter credential name we created in step 3}',
format => json_object('ignoremissingcolumns' value 'true', 'removequotes' value 'true', 'dateformat' value 'YYYY-MM-DD', 'blankasnull' value 'true')
);
end;

Repeat this process for all .dat files in the bucket.

Go back to the terminal where you generated all the data, and in tpcds-kit/tools, create a build.sh file and paste the following code.

#!/bin/sh
for i in `seq 1 99`
do
./dsqgen -DIRECTORY ../query_templates/ -TEMPLATE "query${i}.tpl" -SC 100G -DIALECT oracle -FILTER Y > ../sql/query${i}.sql
done

This script will loop through all 99 tpl templates and generate corresponding sql scripts and store in tpcds-kit/sql

Go to the tpcds-kit/sql directory. Run the following script in the terminal.

cat *.sql > all_query.sql

This will combine the 99 sql scripts into 1.

Copy all_query.sql and run it in ADW. Once the script has completely run, check the script output for your TPC-DS result!

Congratulation! You have completed running TPC-DS for Oracle Autonomous Data Warehouse.

--

--