How to Migrate Oracle DB to AWS RDS Using Oracle dump and DMS
This blog explains how to migrate an Oracle DB to AWS RDS in two different ways, Oracle dump and DMS, as a follow-up to the MySQL DB migration guide.
1. DB Migration with Oracle dump
📕Scenario
Let's assume that an EC2 instance in the IDC VPC acts as the on-premise server and the Oracle DB engine is installed on it. A site-to-site VPN is configured to enable communication between the IDC and the AWS VPC. A dump file will be created locally, transferred to AWS S3, and then uploaded and imported from S3 into RDS.
- Source DB: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0–64bit Production in on-premise server
- SID: TEST
- Schema: TESTUSR(around 1GB)
- User: scott
- P/W: tiger
1) Full load
Log in to the server where the source DB is installed and connect to Oracle DB with the sqlplus client tool.
sqlplus / as sysdba
Press Enter when prompted for the password.
Check the current DB name and schemas with the commands below. In Oracle, a user is effectively a schema.
SELECT NAME FROM v$database;
SELECT DISTINCT(OWNER) FROM ALL_ALL_TABLES;
Let’s create a dump file for the TESTUSR schema in the TEST DB and transfer it to RDS.
Unlock the scott user account, which we will use for the dump migration, and set its password with the command below.
ALTER USER &A IDENTIFIED BY &B ACCOUNT UNLOCK;
Use the commands below to grant the permissions needed to create the dump file, create the dump directory, and grant access to it.
GRANT EXP_FULL_DATABASE TO scott;
GRANT IMP_FULL_DATABASE TO scott;
GRANT CREATE ANY DIRECTORY TO scott;
CREATE DIRECTORY DUMP AS '/oracle/dumpfile';
GRANT READ, WRITE ON DIRECTORY DUMP TO scott;
ALTER SESSION SET CURRENT_SCHEMA = TESTUSR;
GRANT SELECT, INSERT, UPDATE, DELETE ON TESTUSR.ITEM TO scott;
GRANT SELECT, INSERT, UPDATE, DELETE ON TESTUSR.CUSTOMER TO scott;
Then verify that the directory was created with the command below.
SELECT directory_name, directory_path FROM dba_directories WHERE directory_name='DUMP';
Exit sqlplus and, on the server where the source DB is installed, create the oracle directory and the dump file directory with the commands below.
Then verify that the directories have been created.
cd /
pwd
sudo mkdir oracle
sudo mkdir oracle/dumpfile
The directory path on the server must be identical to the directory path registered in Oracle earlier (/oracle/dumpfile), so double-check the path!
Use the command below to make ec2-user the owner of the oracle directory and dump file directory you just created.
sudo chown -R ec2-user:ec2-user oracle
Everything is ready for creating dump files! Create a dump file for the source DB with the command below, which exports the objects in the TESTUSR schema (tables, indexes, and so on).
expdp scott/tiger dumpfile=test.dmp directory=dump schemas=TESTUSR job_name=test logfile=test.log
*Tips
1) If you want to export all schemas in the Oracle DB, you can use the command below. It runs as system, the master user, and exports a dump of every schema in the DB.
expdp system/oracle dumpfile=full.dmp directory=dump full=y logfile=full.log job_name=fullexp
2) The FLASHBACK_SCN option lets you pick a particular SCN and generate a dump that is consistent as of that point (you can capture the SCN as shown below).
expdp scott/tiger dumpfile=cdc.dmp directory=dump schemas=TESTUSR job_name=cdc logfile=cdc.log FLASHBACK_SCN=${SCN number}
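You can capture the SCN to pass to FLASHBACK_SCN just before the export with a query like the one below (the same query is used later in the CDC section):
SELECT CURRENT_SCN FROM V$DATABASE;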
If it shows a message that the dump file export is complete, check the dump file directory for the file.
If you open the file with the cat command, you can see that it is stored in an unreadable binary format.
We are going to upload the dump file to S3. Before doing so, go to the Endpoints tab on the VPC page and create a VPC endpoint so that the instance located in the private subnet can communicate with S3.
After selecting the S3 Gateway endpoint, select the VPC and route table where the IDC on-premise server is located.
Once the endpoint is created, the route table is automatically modified so that traffic to S3 goes through the endpoint.
Now that the endpoint is connected, private communication from the on-premise server to S3 is available. Use the AWS CLI on the on-premise server to upload the dump files to S3.
Copy the dump file from the dump file directory to the S3 bucket with the command below.
aws s3 cp /oracle/dumpfile/test.dmp s3://${S3 bucket name}
aws s3 cp /oracle/dumpfile/test.log s3://${S3 bucket name}
*If the s3 cp command fails to connect to the endpoint:
- Check whether you have permission for S3.
- Make sure that the region in aws configure is the same as the region where the S3 bucket is located. If the CLI is configured with a different region than the S3 bucket, change the region in ~/.aws/config (see the commands below).
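For example, you can check and, if necessary, change the CLI region with commands along these lines (us-east-1 is just a placeholder region):
aws configure get region
aws configure set region us-east-1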
Go to the S3 page and select the bucket to check that the dump file was copied with the s3 cp command earlier.
Move to the Policies tab on the IAM page. Select Create Policy and choose the following in the Visual editor:
Select S3 as the Service and expand each list to select ListBucket under List, GetObject under Read, and PutObject under Write for Actions. Select Specific and Bucket for Resources. For Bucket, enter the ARN of the S3 bucket where the dump file is located, and check Any for object. Click Create Policy to create the policy.
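The Visual editor selections above correspond roughly to a policy document like the following sketch, where my-dump-bucket is a placeholder for your bucket name:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::my-dump-bucket"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::my-dump-bucket/*"
    }
  ]
}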
Move to the Roles tab on the IAM page and create a role. For Trusted entity, select AWS service and RDS - Add Role to Database.
On the Add permissions page, select the policy you just created. Press Create role to complete the role creation.
Go to the RDS page and select the RDS you created. On the Manage IAM roles tab at the bottom of the Connectivity & security tab, select and add the IAM Role that you just created. Verify that the status changed to Active when the IAM Role is applied.
Go to the RDS page Option groups tab and select the option group applied to the target RDS. Go to the bottom of the page and under Options, select Add option.
For the Option name, select S3_INTEGRATION. Set Apply immediately to Yes so that the option is applied right away.
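If you prefer the CLI, the same change can be sketched as below; oracle-ee-12-1 is a hypothetical option group name, so replace it with the option group applied to your RDS:
aws rds add-option-to-option-group \
  --option-group-name oracle-ee-12-1 \
  --options OptionName=S3_INTEGRATION \
  --apply-immediately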
The dump file uploaded to S3 is ready to be migrated to RDS. Enter the target Oracle DB and upload the dump file in S3 to RDS with the query below.
SELECT rdsadmin.rdsadmin_s3_tasks.download_from_s3(
p_bucket_name => '${bucket name}',
p_directory_name => 'DATA_PUMP_DIR')
AS TASK_ID FROM DUAL;
When the query is executed, a TASK_ID is displayed. Enter the ID in the query below to check the progress of uploading the dump file.
SELECT text FROM table(rdsadmin.rds_file_util.read_text_file('BDUMP','dbtask-${TASK_ID}.log'));
If you get a message that the task is completed, check the dump files uploaded to the RDS using the query below. You can verify that the dump files and log files have been uploaded to the data pump directory.
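One way to list the files in the data pump directory is a query like the following, which uses the rdsadmin.rds_file_util.listdir function:
SELECT filename, type, filesize, mtime
FROM TABLE(rdsadmin.rds_file_util.listdir(p_directory => 'DATA_PUMP_DIR'))
ORDER BY mtime;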
Once the dump file has been uploaded to the RDS, restore the schema and table in the dump file into the DB with the procedure below.
DECLARE
  hdnl NUMBER;
BEGIN
  -- Open a schema-mode import job
  hdnl := DBMS_DATAPUMP.OPEN( operation => 'IMPORT', job_mode => 'SCHEMA', job_name => null);
  -- Dump file that was downloaded from S3 into DATA_PUMP_DIR
  DBMS_DATAPUMP.ADD_FILE( handle => hdnl, filename => 'test.dmp', directory => 'DATA_PUMP_DIR', filetype => dbms_datapump.ku$_file_type_dump_file);
  -- Log file for the import job
  DBMS_DATAPUMP.ADD_FILE( handle => hdnl, filename => 'test.log', directory => 'DATA_PUMP_DIR', filetype => dbms_datapump.ku$_file_type_log_file);
  -- Import only the TESTUSR schema
  DBMS_DATAPUMP.METADATA_FILTER(hdnl,'SCHEMA_EXPR','IN (''TESTUSR'')');
  DBMS_DATAPUMP.START_JOB(hdnl);
END;
/
You can check the DB import log with the command below.
SELECT text FROM table(rdsadmin.rds_file_util.read_text_file('DATA_PUMP_DIR','test.log'));
In Oracle DB, a tablespace is a logical unit of storage; DB data is stored in tablespaces.
Check the default tablespace with the command below.
SELECT PROPERTY_VALUE FROM DATABASE_PROPERTIES WHERE PROPERTY_NAME = 'DEFAULT_PERMANENT_TABLESPACE';
Then check the tablespaces and their free storage space with the queries below.
The SYSTEM tablespace stores important information about the Oracle server, and SYSAUX stores information used for tuning the server's performance. As we have checked that the default tablespace for the DB is USERS, you can also check the usage of the USERS tablespace.
SELECT TABLESPACE_NAME "TABLESPACE", EXTENT_MANAGEMENT, FORCE_LOGGING, BLOCK_SIZE, SEGMENT_SPACE_MANAGEMENT
FROM DBA_TABLESPACES;
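The DBA_TABLESPACES query above shows the tablespace properties; to see how much free space each tablespace actually has, a query against DBA_FREE_SPACE along these lines can be used:
SELECT tablespace_name, ROUND(SUM(bytes) / 1024 / 1024) AS free_mb
FROM dba_free_space
GROUP BY tablespace_name;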
When the load is completed, connect to the DB and count the number of records in the CUSTOMER table using the commands below to check whether all records have been migrated successfully.
ALTER SESSION SET current_schema = TESTUSR;
SELECT COUNT(*) FROM CUSTOMER;
We have confirmed that the number of rows is correct, and completed the DB migration using Oracle dump. 👏
2. DB Migration with DMS
📕Scenario
Let's assume that an EC2 instance in the IDC VPC acts as the on-premise server and the Oracle DB engine is installed on it. A site-to-site VPN is configured to enable communication between the IDC and the AWS VPC. This time, DMS will replicate the data from the on-premise Oracle DB to RDS over the VPN.
- Source DB: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0–64bit Production in on-premise server
- SID: TEST
- Schema: TESTUSR(around 1GB)
- User: scott
- P/W: tiger
💡Tips for DMS
1) Full load
Before the DB migration, a listener configuration on the source DB is required for communication with the DMS replication instance.
A port other than the default 1521 can be used. You can specify additional listener names and listener ports in the listener.ora file to initialize a new listener port.
Use the command below to edit the listener.ora file with the vi editor.
vi oracle/product/12.1.0/network/admin/listener.ora
Add the following settings below the existing default listener (port 1541) configuration.
TEST2 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ${IDC Oracle Server IP})(PORT = 1521))
  )

SID_LIST_TEST2 =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = TEST)
      (ORACLE_HOME = /home/ec2-user/oracle/product/12.1.0)
      (SID_NAME = TEST)
    )
  )
Reload the listener configuration and check the status of the port 1521 listener you added with the commands below.
lsnrctl reload TEST2
lsnrctl status TEST2
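Optionally, you can also verify that connections through the new listener work, for example with an EZConnect connection string (assuming the statically registered service TEST on port 1521):
sqlplus scott/tiger@//${IDC Oracle Server IP}:1521/TEST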
Before migrating to DMS, access the source DB with sysdba and grant user scott the following permissions.
sqlplus / as sysdba
GRANT SELECT on V_$DATABASE to scott;
GRANT SELECT on V_$THREAD to scott;
GRANT SELECT on V_$PARAMETER to scott;
GRANT SELECT on V_$NLS_PARAMETERS to scott;
GRANT SELECT on V_$TIMEZONE_NAMES to scott;
GRANT SELECT on ALL_INDEXES to scott;
GRANT SELECT on ALL_OBJECTS to scott;
GRANT SELECT on ALL_TABLES to scott;
GRANT SELECT on ALL_USERS to scott;
GRANT SELECT on ALL_CATALOG to scott;
GRANT SELECT on ALL_CONSTRAINTS to scott;
GRANT SELECT on ALL_CONS_COLUMNS to scott;
GRANT SELECT on ALL_TAB_COLS to scott;
GRANT SELECT on ALL_IND_COLUMNS to scott;
GRANT SELECT on ALL_LOG_GROUPS to scott;
GRANT SELECT on SYS.DBA_REGISTRY to scott;
GRANT SELECT on SYS.OBJ$ to scott;
GRANT SELECT on DBA_TABLESPACES to scott;
GRANT SELECT on ALL_TAB_PARTITIONS to scott;
GRANT SELECT on ALL_ENCRYPTED_COLUMNS to scott;
GRANT SELECT ANY TRANSACTION to scott;
GRANT SELECT on V_$LOGMNR_LOGS to scott;
GRANT SELECT on V_$LOGMNR_CONTENTS to scott;
GRANT SELECT on V_$LOG to scott;
GRANT SELECT on V_$ARCHIVED_LOG to scott;
GRANT SELECT on V_$LOGFILE to scott;
GRANT SELECT on V_$TRANSACTION to scott;
GRANT SELECT on V_$DATABASE to scott;
All the preparation from the source DB side is completed.
Migration with DMS requires three components: a subnet group where the DMS Replication instance will be located, endpoints connected to the source DB and the target DB, and a Replication instance that performs the replication. Once those three are ready, create a Database migration task to start the migration.
Create a subnet group from the Subnet groups tab on the DMS page. This is the subnet group where the Replication instance will be located. Select the VPC and the private subnets where the target DB is located, so that the Replication instance sits in the same network as the target DB.
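The same subnet group can be sketched with the AWS CLI as below; the identifier and subnet IDs are placeholders, and DMS requires subnets in at least two Availability Zones:
aws dms create-replication-subnet-group \
  --replication-subnet-group-identifier dms-private-subnet-group \
  --replication-subnet-group-description "Private subnets for DMS" \
  --subnet-ids subnet-0123456789abcdef0 subnet-0fedcba9876543210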
The Replication instance reads the data from the source DB over the VPN and writes it to the target DB. Because the instance and the target DB are located within the same AWS network, that part of the transfer goes over the AWS backbone and is fast.
Before you create a DMS Replication instance, create a security group to apply to it. Select the AWS VPC where the target DB is located for the security group; no additional rules are required.
After creating the DMS Replication instance security group, add inbound rules to both the source DB and the target DB security groups that allow traffic from the DMS Replication instance security group.
Return to the Replication instances tab on the DMS page and create an instance to perform the migration. For the Replication instance configuration, select an appropriate Instance class and Multi-AZ configuration. I'll go with dms.t3.small for testing.
The Replication instance storage stores ongoing migration task logs and cached changes. If you want to enable CloudWatch logs for the collection of migration process logs when creating tasks in the future, you will need to select larger storage.
Under Advanced security and network configuration, select the Replication subnet group and VPC security group that you just created.
When the instance is ready, create endpoints that are associated with the source and target DB.
On the Endpoints tab, create a Source endpoint.
Under Endpoint configuration, enter source DB information. For the Server name, enter the private IP of the on-premise DB server, and for the User name and Password, enter the source DB username and password.
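As a reference, the same source endpoint can be sketched with the AWS CLI as below; the endpoint identifier is a placeholder and the other values mirror the console fields above:
aws dms create-endpoint \
  --endpoint-identifier oracle-source-endpoint \
  --endpoint-type source \
  --engine-name oracle \
  --server-name ${IDC Oracle Server IP} \
  --port 1521 \
  --database-name TEST \
  --username scott \
  --password tiger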
When the Source endpoint is ready, check if the endpoint and source DB are connected properly on the Status tab.
After creating the Source endpoint, create the Target endpoint as well. Check Select RDS DB instance to choose the RDS in your account as the target endpoint.
Enter the target RDS information in the Endpoint configuration. Enter the RDS endpoint for Server name and the target DB user name and password for User name and Password.
Check if the target DB and endpoint are well connected on the Connections tab. Connecting the target endpoint is normally easier compared to the source endpoint.
As shown above, you’ve created the Subnet group where the DMS will be located, the Endpoint to be connected to the source DB and the target DB, and the Replication instance to implement the migration. Now create the Database migration task to start the migration.
Create a task on the Database migration tasks tab. Select the instances and endpoints that you’ve created.
- Do nothing keeps the data and metadata of the existing table. Use this option carefully: if the table already contains data, a new migration can duplicate it. Drop tables on target drops the existing table and creates a new one to migrate the data into. Truncate keeps the table metadata and truncates the data in the existing table; use this option when you need to migrate new data without changing the table schema. All of the above options create the table on the target if it does not exist.
- LOB stands for 'Large Object' (for example CLOB and BLOB) and is used to store large data such as images, videos, and long text. Among the options above, Full LOB mode migrates all LOB columns in full, which can make the migration take longer. If you have columns that contain LOBs, it is usually better to select Limited LOB mode and limit the maximum column size. It is important to set the Max LOB size to match the actual data size, because any data that exceeds the Max LOB size is truncated. If a particular table has many LOB columns, you can also create a separate task for that table.
- When the Validation option is enabled, DMS validates that the data was migrated correctly from the source DB to the target. Data validation is highly recommended for SOX* or other security compliance requirements.
- Enabling Amazon CloudWatch logs allows CloudWatch to monitor the progress of DMS tasks, the resources used, and the network activity. Logging may require more Replication instance memory.
*SOX: Laws regulating financial reporting and audit activities of listed companies
In Table mappings, select Add new selection rule and enter the schema and tables of the source DB to migrate. '%' is a wildcard that selects all tables.
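In the JSON editor view, the same selection rule looks roughly like the sketch below (rule-id and rule-name are arbitrary):
{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "include-testusr",
      "object-locator": {
        "schema-name": "TESTUSR",
        "table-name": "%"
      },
      "rule-action": "include"
    }
  ]
}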
For a real-world migration, enable the Premigration assessment to identify issues such as data type conversion. For this test, create the task without checking it.
When the migration starts after the task is created, you can check the progress in Table statistics.
You can view various indicators on the CloudWatch metrics tab.
You can also check Log groups on the CloudWatch page with the name of the Replication instance.
You can view more detailed logs in the Log streams in that log group.
When the load is completed, connect to RDS and count the number of records in the ITEM table to make sure that all the data was migrated successfully.
SELECT COUNT(*) FROM ITEM;
Select 20 records with the command below to check if the data looks good.
SELECT * FROM ITEM WHERE rownum >=1 AND rownum <= 20;
2) CDC
When the full load is completed, the data that has changed afterward also needs to be migrated. Migrating the changed data to the target DB is called CDC (Change Data Capture).
In order to proceed with the CDC, you need to add the following permissions to the user scott in the source DB.
sqlplus / as sysdba
GRANT EXECUTE ON dbms_logmnr TO scott;
GRANT LOGMINING TO scott;
GRANT SELECT ON v_$transportable_platform to scott;
GRANT CREATE ANY DIRECTORY to scott;
Oracle manages system change numbers (SCNs) for changes that occur at specific points in the DB. An SCN is a number assigned when a transaction in the DB is committed. Let's start the CDC migration from the SCN captured before a specific transaction is committed.
Use the commands below to check the current SCN and then delete 20 records from the ITEM table.
ALTER SESSION SET CURRENT_SCHEMA = TESTUSR;
SELECT COUNT(*) FROM ITEM;
SELECT CURRENT_SCN FROM v$database;
DELETE FROM ITEM WHERE ROWNUM >=1 AND ROWNUM <= 20;
COMMIT;
Make a note of the SCN to use when you create the DMS task.
There are two modes for reading the redo log when Oracle is the source of CDC tasks: LogMiner and Binary Reader. Oracle LogMiner is an Oracle API that reads the redo logs, and Binary Reader is a DMS method that reads and parses the redo log files directly. Binary Reader is usually recommended for performance and to reduce the load on the source DB. Detailed differences between the two modes can be found at the link.
To use Binary Reader, you need to modify the extra connection attribute of the source endpoint. Select the source endpoint and click Modify.
On the Endpoint settings tab, check Use endpoint connection attributes and add the setting below.
useLogMinerReader=N;useBfile=Y;
You can check that the attribute is applied from the Endpoint setting tab.
On the DMS page Database migration tasks tab, create a task. Select the Replication instance, Source endpoint, and Target endpoint that you created. When creating a task for CDC, select the Migration type as Replicate data changes only.
Leave the other Task settings at their defaults, except for the CDC start mode: select Specify a log sequence number and enter the SCN that we recorded earlier.
In Table mappings, enter the schema name and select all tables with the '%' wildcard.
Click Create task at the bottom to create a task.
A task is created and replication starts.
Check the table statistics on the DMS task page to see that the 20 deleted records in the ITEM table have been replicated as DELETE operations.
Here we have completed the CDC with DMS.🙌