GaussDB for MySQL with OBS on Huawei Cloud
Introduction
Hello all, I’m going to make a demo about the GaussDB with OBS
Huawei’s enterprise-grade distributed relational database is called GaussDB. With no data loss, a GaussDB instance can process petabytes of data, support more than 1,000 nodes, and be deployed across AZs.
This medium article helps you create GaussDB(DWS) clusters, import OBS data to a data warehouse, and query data with a few operations.
So, how did we go about it!
Prerequisites: Log in to HUAWEI CLOUD.
1. Creating a VPC
2. Creating a GaussDB( DWS) Cluster
3. Uploading Data to an OBS Bucket
4. Using DAS to Connect to a GaussDB( DWS) Cluster
5. Importing OBS Data and Querying the Data
The architecture of Huawei Cloud
Let’s start 👇
Step 1: VPC Creation
Basic Information:
Region: AP-Singapore
Name: vpc-hakan
IPv4 CIDR Block: 192.168.0.0/16
Subnet settings:
AZ: AZ1
Name: subnet-hakan
IPv4 CIDR Block: 192.168.1.0/24
IPv6 CIDR Block: This option is not selected by default.
✅VPC is ready
Step 2: Gauss DB (DWS) Creation
Billing Mode: Pay-per-use
Region: AP-Singapore
AZ: AZ1 (If resources in this AZ are sold out, you can change to another AZ.)
Resource: Standard
Compute Resource: ECS
Storage Type: Cloud SSD.
CPU Architecture: x86
Node Flavor: dwsx2.xlarge (select the minimum specifications)
Hot storage: 20 GB per node
Nodes: 3
VPC: Select the created VPC vpc-hakan.
Subnet: Select subnet-hakan.
Security Group: Automatic creation
EIP: Select Buy now.
Broadband: 1 Mbit/s
ELB: Do not use
Configure the following parameters:
Cluster Name: dwshakan
Cluster Version: Retain the default value.
Administrator Account: dbadmin.
Administrator Password: Set the password to own self 😊
Database Port: Retain the default value.
Wait 10–15 minutes for the creation of the DWS service.
✅DWS is ready
Step 3: OBS Creation
Click Create Bucket in the upper right corner and configure the following parameters:
Region: AP-Singapore
Bucket Name: obs-sandbox (If the bucket name already exists, add a sequence number to the end of the bucket name, for example, obs-sandbox01.)
Data Redundancy Policy: Select Multi-AZ storage.
Default Storage Class: Standard
Bucket Policy: Private
Direct Reading: Select Disable.
After the create OBS
Create folder name: input-data
Create another folder: output-data
, upload the CSV data into input-data on the OBS
✅OBS is ready
Step 4: Data Creation
First: We get the data:
These are our types of data:
product_price — product_id — product_time — product_level — product_name — product_type1 — product_type2 — product_monthly_sales_cnt — product_comment_time — product_comment_num — product_comment_content
We open these in Notepad++ and paste them. paste. We convert from encoding to utf-8 and when saving, we save it as product_info0.csv product_info1.csv, and product_info2.csv.
✅Data creation is ready
Step 5: IAM parameters (AK-SK)
Access Key and Secret Key, OBS uses an access key ID (AK) and secret access key (SK) to authenticate the identity of a requester. When you use OBS APIs for secondary development and use the AK and SK for authentication, the signature must be calculated based on the algorithm defined by OBS and added to the request.
The authentication can be based on a permanent AK and SK pair or based on a temporary AK/SK pair and security token.
Click Create Access Key and download the Excel file.
On the Excel file, we will see the username AK and SK parameters
✅IAM AK-SK parameters ready
Step 6: Open DAS Service
After the Create DWS we will make:
1- Click Log In button
2- Enter dbadmin and password on the screen
3- On the development tool side click log in again. Enter the parameters and the DAS service opens successfully.
✅ DAS is ready to work
Step 7: SQL Commands
1st command: (Change the bold sections for your own example):
Creating the product_info_ext table into the DWS
CREATE FOREIGN TABLE product_info_ext
(
product_price integer not null,
product_id char(30) not null,
product_time date,
product_level char(10),
product_name varchar(200),
product_type1 varchar(20),
product_type2 char(10),
product_monthly_sales_cnt integer,
product_comment_time date,
product_comment_num integer,
product_comment_content varchar(200)
)
SERVER gsmpp_server
OPTIONS(
LOCATION'obs://obsdwshakan/input-data/',
FORMAT 'CSV' ,
DELIMITER ',',
ENCODING 'utf8',
HEADER 'false',
ACCESS_KEY 'AK',
SECRET_ACCESS_KEY 'SK',
FILL_MISSING_FIELDS 'true',
IGNORE_EXTRA_DATA 'true'
)
READ ONLY
LOG INTO product_info_err
PER NODE REJECT LIMIT 'unlimited';
2nd command:
Creating the product_info table into the DWS
CREATE TABLE product_info
(
product_price integer not null,
product_id char(30) not null,
product_time date ,
product_level char(10) ,
product_name varchar(200) ,
product_type1 varchar(20) ,
product_type2 char(10) ,
product_monthly_sales_cnt integer ,
product_comment_time date ,
product_comment_num integer ,
product_comment_content varchar(200)
)
WITH (
orientation = column,
compression=middle
)
DISTRIBUTE BY hash (product_id);
3rd command:
INSERT INTO product_info SELECT * FROM product_info_ext;
4th command:
SELECT * FROM product_info;
✅Results look like the above if run successfully
5th command:
INSERT INTO product_info VALUES (200,'ARCP-E','2016–05–04','B','2016 casual pants men','black','L',997,'2016–09–10',301,'good quality.');
6th command:
INSERT INTO product_info VALUES (500,'JYXL-E','2017–06–04','B','2017 casual pants men','black','M',990,'2016–11–10',302,'good quality.');
7th command:
UPDATE product_info SET product_price = 185 WHERE product_id = 'KDKE-B';
8th command:
DELETE product_info WHERE product_id ='KDKE-B';
9th command:
SELECT COUNT(*) FROM product_info;
✅Compare the number of changes with Step-4
Conclusion
The exercise shows how to connect to a GaussDB(DWS) cluster through DAS for data import and simple query and analysis.
** After the complete the demo, you will be deleted:
1- DWS deletion (delete with EIP — delete time takes 10 minutes)
2- VPC deletion (first subnet deletion, second vpc deletion)
3- OBS deletion (first file deletion, second obs deletion)
4- IAM user AK SK is not necessary
If you have any thoughts, or suggestions please feel free to comment or if you want, you can reach me at guvezhakan@gmail.com, I will try to get back to you as soon as I can.
You can reach me through LinkedIn too.
Hit the clap button 👏👏👏 or share it ✍ if you like the post.