GaussDB for MySQL with OBS on Huawei Cloud

Hakan GÜVEZ
Published in Huawei Developers
Oct 31, 2023 · 6 min read

GaussDB — DWS

Introduction

Hello all, I’m going to walk you through a demo of GaussDB(DWS) with OBS.

GaussDB is Huawei’s enterprise-grade distributed relational database. A GaussDB instance can process petabytes of data, support more than 1,000 nodes, and be deployed across AZs with zero data loss.

This medium article helps you create GaussDB(DWS) clusters, import OBS data to a data warehouse, and query data with a few operations.

So, how do we go about it?

Prerequisites: Log in to HUAWEI CLOUD.

1. Creating a VPC

2. Creating a GaussDB(DWS) Cluster

3. Uploading Data to an OBS Bucket

4. Using DAS to Connect to a GaussDB(DWS) Cluster

5. Importing OBS Data and Querying the Data

The architecture of Huawei Cloud

Architecture of DWS for our example

Let’s start 👇

Step 1: VPC Creation

VPC Creation

Basic Information:

Region: AP-Singapore

Name: vpc-hakan

IPv4 CIDR Block: 192.168.0.0/16

Subnet settings:

AZ: AZ1

Name: subnet-hakan

IPv4 CIDR Block: 192.168.1.0/24

IPv6 CIDR Block: This option is not selected by default.
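As a quick sanity check outside the console flow, Python’s ipaddress module can confirm that the subnet above fits inside the VPC’s CIDR block:

```python
# Verify the subnet CIDR from the settings above sits inside the VPC CIDR.
import ipaddress

vpc = ipaddress.ip_network("192.168.0.0/16")     # VPC IPv4 CIDR block
subnet = ipaddress.ip_network("192.168.1.0/24")  # subnet IPv4 CIDR block

print(subnet.subnet_of(vpc))   # True: the subnet is contained in the VPC
print(subnet.num_addresses)    # 256 addresses in the /24 (incl. network/broadcast)
```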

✅VPC is ready

Step 2: GaussDB(DWS) Creation

DWS Screen-1

Billing Mode: Pay-per-use

Region: AP-Singapore

AZ: AZ1 (If resources in this AZ are sold out, you can change to another AZ.)

Resource: Standard

Compute Resource: ECS

Storage Type: Cloud SSD.

CPU Architecture: x86

Node Flavor: dwsx2.xlarge (select the minimum specifications)

Hot storage: 20 GB per node

Nodes: 3

DWS Screen-2

VPC: Select the created VPC vpc-hakan.

Subnet: Select subnet-hakan.

Security Group: Automatic creation

EIP: Select Buy now.

Bandwidth: 1 Mbit/s

ELB: Do not use

DWS Screen-3

Configure the following parameters:

Cluster Name: dwshakan

Cluster Version: Retain the default value.

Administrator Account: dbadmin.

Administrator Password: Set your own password 😊

Database Port: Retain the default value.

Wait 10–15 minutes for the creation of the DWS service.

✅DWS is ready

Step 3: OBS Creation

OBS Screen-1

Click Create Bucket in the upper right corner and configure the following parameters:

Region: AP-Singapore

Bucket Name: obs-sandbox (If the bucket name already exists, add a sequence number to the end of the bucket name, for example, obs-sandbox01.)

Data Redundancy Policy: Select Multi-AZ storage.

Default Storage Class: Standard

Bucket Policy: Private

Direct Reading: Select Disable.

OBS Screen-2

After creating the bucket:

Create a folder named input-data

Create another folder named output-data

Then upload the CSV data into the input-data folder on OBS.

✅OBS is ready

OBS Screen-3
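The console upload can also be scripted. Below is a minimal sketch using the OBS Python SDK (esdk-obs-python); the endpoint, bucket name, and file names are assumptions taken from this walkthrough, so adjust them to your own setup:

```python
# Sketch: upload the CSV files into the input-data folder of the bucket.
# Requires: pip install esdk-obs-python

def object_key(folder: str, filename: str) -> str:
    """Build the OBS object key, e.g. 'input-data/product_info0.csv'."""
    return f"{folder}/{filename}"

def upload_csvs(ak: str, sk: str, bucket: str = "obs-sandbox") -> None:
    # Imported lazily so the sketch loads even without the SDK installed.
    from obs import ObsClient

    # Assumed AP-Singapore endpoint; check your region's OBS endpoint.
    client = ObsClient(access_key_id=ak,
                       secret_access_key=sk,
                       server="https://obs.ap-southeast-3.myhuaweicloud.com")
    try:
        for name in ("product_info0.csv", "product_info1.csv", "product_info2.csv"):
            resp = client.putFile(bucket, object_key("input-data", name), name)
            if resp.status >= 300:
                raise RuntimeError(f"upload of {name} failed: {resp.errorMessage}")
    finally:
        client.close()
```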

Step 4: Data Creation

First, we prepare the data:

Data format example

These are our types of data:

product_price — product_id — product_time — product_level — product_name — product_type1 — product_type2 — product_monthly_sales_cnt — product_comment_time — product_comment_num — product_comment_content

We open Notepad++, paste the data, convert the encoding to UTF-8, and save the files as product_info0.csv, product_info1.csv, and product_info2.csv.
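If you prefer scripting over Notepad++, the files can be written with Python’s csv module. The sample rows below are the same ones used in the INSERT statements later in this post:

```python
# Write a UTF-8 product_info0.csv with the column order listed above
# (no header row, matching HEADER 'false' in the later import).
import csv

rows = [
    (200, "ARCP-E", "2016-05-04", "B", "2016 casual pants men",
     "black", "L", 997, "2016-09-10", 301, "good quality."),
    (500, "JYXL-E", "2017-06-04", "B", "2017 casual pants men",
     "black", "M", 990, "2016-11-10", 302, "good quality."),
]

with open("product_info0.csv", "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(rows)
```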

✅Data creation is ready

Step 5: IAM parameters (AK-SK)

OBS uses an access key ID (AK) and secret access key (SK) to authenticate the identity of a requester. When you use OBS APIs for secondary development and authenticate with the AK and SK, a signature must be calculated based on the algorithm defined by OBS and added to the request.

The authentication can be based on a permanent AK and SK pair or based on a temporary AK/SK pair and security token.
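To give a feel for the mechanism (a simplified illustration, not the exact canonical-string format from the OBS API reference): the SK is never transmitted; it keys an HMAC over a description of the request, and the Base64-encoded digest travels with the request:

```python
# Simplified AK/SK signing sketch: HMAC-SHA1 over a string-to-sign,
# then Base64. The string_to_sign layout here is illustrative only --
# consult the OBS API reference for the real canonical format.
import base64
import hashlib
import hmac

sk = "example-secret-key"   # the SK from the downloaded credentials
string_to_sign = "GET\n\n\nWed, 01 Nov 2023 00:00:00 GMT\n/obs-sandbox/"

digest = hmac.new(sk.encode("utf-8"),
                  string_to_sign.encode("utf-8"),
                  hashlib.sha1).digest()
signature = base64.b64encode(digest).decode("ascii")
print(signature)   # a 28-character Base64 signature
```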

AK-SK parameters

Click Create Access Key and download the Excel file.

In the Excel file, you will see the username, AK, and SK.

✅IAM AK-SK parameters ready

Step 6: Open DAS Service

After the DWS cluster is created:

1- Click the Log In button

2- Enter dbadmin and the password

3- On the development tool side, click Log In again. Enter the parameters, and the DAS service opens.

Logging in to the DWS service
Logging in again (sometimes this step is skipped automatically)

✅ DAS is ready to work
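DAS is not the only way in: GaussDB(DWS) is PostgreSQL-compatible, so a client such as psycopg2 can also connect through the cluster’s EIP. The host and password below are placeholders; port 8000 and database gaussdb are the usual DWS defaults, but verify them against your cluster:

```python
# Build a PostgreSQL-style DSN for the DWS cluster (placeholders included).
def build_dsn(host: str, user: str, password: str,
              dbname: str = "gaussdb", port: int = 8000) -> str:
    return (f"host={host} port={port} dbname={dbname} "
            f"user={user} password={password}")

dsn = build_dsn("<cluster-EIP>", "dbadmin", "<your-password>")
print(dsn)

# Actual connection (requires: pip install psycopg2-binary):
# import psycopg2
# with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
#     cur.execute("SELECT version();")
#     print(cur.fetchone())
```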

Step 7: SQL Commands

1st command (replace the highlighted values, such as the OBS path, AK, and SK, with your own):

Creating the product_info_ext table into the DWS

CREATE FOREIGN TABLE product_info_ext
(
product_price integer not null,
product_id char(30) not null,
product_time date,
product_level char(10),
product_name varchar(200),
product_type1 varchar(20),
product_type2 char(10),
product_monthly_sales_cnt integer,
product_comment_time date,
product_comment_num integer,
product_comment_content varchar(200)
)
SERVER gsmpp_server                        -- built-in server for parallel OBS import
OPTIONS(
LOCATION 'obs://obsdwshakan/input-data/',  -- OBS folder holding the CSV files
FORMAT 'CSV',
DELIMITER ',',
ENCODING 'utf8',
HEADER 'false',
ACCESS_KEY 'AK',                           -- replace with your AK
SECRET_ACCESS_KEY 'SK',                    -- replace with your SK
FILL_MISSING_FIELDS 'true',
IGNORE_EXTRA_DATA 'true'
)
READ ONLY
LOG INTO product_info_err                  -- rejected records are logged here
PER NODE REJECT LIMIT 'unlimited';

2nd command:

Creating the product_info table into the DWS

CREATE TABLE product_info
(
product_price integer not null,
product_id char(30) not null,
product_time date,
product_level char(10),
product_name varchar(200),
product_type1 varchar(20),
product_type2 char(10),
product_monthly_sales_cnt integer,
product_comment_time date,
product_comment_num integer,
product_comment_content varchar(200)
)
WITH (
orientation = column,   -- column store, suited to analytic queries
compression = middle    -- medium compression level
)
DISTRIBUTE BY hash (product_id);  -- spread rows across nodes by product_id

3rd command:

INSERT INTO product_info SELECT * FROM product_info_ext;

4th command:

SELECT * FROM product_info;
Our query works as expected
An example of the DAS interface

✅Results look like the above if run successfully

5th command:

INSERT INTO product_info VALUES (200,'ARCP-E','2016-05-04','B','2016 casual pants men','black','L',997,'2016-09-10',301,'good quality.');

6th command:

INSERT INTO product_info VALUES (500,'JYXL-E','2017-06-04','B','2017 casual pants men','black','M',990,'2016-11-10',302,'good quality.');

7th command:

UPDATE product_info SET product_price = 185 WHERE product_id = 'KDKE-B';

8th command:

DELETE FROM product_info WHERE product_id = 'KDKE-B';

9th command:

SELECT COUNT(*) FROM product_info;

✅Compare the resulting row count with the data created in Step 4

Conclusion

This exercise showed how to create a GaussDB(DWS) cluster, connect to it through DAS, import OBS data, and run simple queries and analysis.

** After completing the demo, delete the following resources:

1- Delete the DWS cluster (together with its EIP; deletion takes about 10 minutes)

2- Delete the VPC (delete the subnet first, then the VPC)

3- Delete the OBS bucket (delete the files first, then the bucket)

4- Delete the IAM AK/SK if it is no longer needed

If you have any thoughts or suggestions, please feel free to comment, or you can reach me at guvezhakan@gmail.com; I will try to get back to you as soon as I can.

You can reach me through LinkedIn too.

Hit the clap button 👏👏👏 or share it ✍ if you like the post.
