Seamless Database Transition: Migrating from a Monolithic Oracle DB to Database Per Service during Microservice Transition

Metin Yavuz
n11 Tech
Jul 2, 2024 · 9 min read

In the rapidly evolving landscape of software development, the shift from monolithic architectures to microservices has become a significant trend. This transformation promises improved scalability, flexibility, and resilience. One critical aspect of this transition is the database strategy; the most commonly used pattern is “Database per Service.” This approach ensures that each microservice manages its own data, providing independence and efficiency.

Migrating from a monolithic architecture to a microservice architecture with a database per service is a complex and challenging process. One tool that can greatly facilitate this transition is Debezium. Debezium is an open-source distributed platform for change data capture (CDC). It helps in capturing changes in data and propagating these changes across different services.

In addition to Debezium, the Confluent JdbcSourceConnector can contribute to the microservice transition. It allows data from a relational database to be continuously ingested into Kafka topics, facilitating real-time data streaming and integration. At n11, we use both to facilitate the microservice transition while applying the appropriate architectural styles and patterns. Stay tuned for the JdbcSourceConnector usage, which we will cover in a later post.

In this article, I would like to discuss the importance of CDC tools throughout a microservice transition and give some examples. We’ll begin with what Debezium is and how we can run it. Next, we’ll delve into CDC usage scenarios in order to broaden our architectural perspective. Then, we’ll set up a stack to stream changes from an Oracle database to a PostgreSQL database.

The primary objective of Debezium is to capture changes such as updates, inserts, and deletes in database tables by creating a change data capture (CDC) stream. These data changes are typically transmitted through a message queue system, enabling asynchronous data flow between systems.

Debezium integrates with various popular databases, including PostgreSQL, MySQL, Oracle, and MongoDB, among others. This flexibility allows developers and system administrators to track changes across different databases.

How Debezium Works

Debezium operates through the Change Data Capture (CDC) method. Here are the general working principles of Debezium:

Database Logs (Transaction Logs): Debezium typically uses the database’s own log files to track changes. These files contain information about structural and data changes in the database.

Debezium Connector: Debezium uses connectors specific to each database. These connectors are designed for a particular type of database (e.g., MySQL Connector, PostgreSQL Connector).
The connector monitors the database logs and extracts the necessary information to understand the changes.

Change Data Events: Change events are usually transmitted to other systems through a message queue system (e.g., Apache Kafka).
These events represent near real-time changes in the source database.

Consuming Change Data Events: Change events are delivered to other systems via Kafka Connect sink connectors or Kafka consumers. The receiving system can take appropriate actions based on incoming change events, update data, or perform other operations.

This architecture allows Debezium to track database changes in near real-time and effectively transmit them to other systems. It is particularly useful in scenarios such as microservices architectures, real-time data analysis, and data synchronization requirements.
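
For illustration, a Debezium change event value is an envelope that carries the row state before and after the change, metadata about the source, and an operation code (“c” for insert, “u” for update, “d” for delete, “r” for snapshot read). A simplified, hypothetical update event for a customers table might look roughly like this (field names and source metadata vary by connector and configuration):

{
  "before": { "ID": 1001, "EMAIL": "old@example.com" },
  "after":  { "ID": 1001, "EMAIL": "new@example.com" },
  "source": { "connector": "oracle", "table": "CUSTOMERS" },
  "op": "u",
  "ts_ms": 1719900000000
}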

Using Debezium in Event-Driven Architectures

Real-time data pipelines: You can capture database changes in real time and stream them to downstream applications like analytics engines, search platforms, or notification systems.
This enables immediate updates, improved data consistency, and reactive processing.

All events are collected, processed, and published in real time.

Microservice communication: Debezium can be used to decouple microservices by publishing database changes as events that other services can subscribe to. This promotes loose coupling, independent scaling, and improved agility.

Inventory service listens to Order events

Event sourcing & CQRS: You can implement event sourcing by capturing all database changes as events and storing them in an event store.
CQRS (Command Query Responsibility Segregation) can be used to separate read and write models based on these events. This provides better auditability, traceability, and flexibility for data manipulation.

Product CQRS example

Data synchronization & Migration: You can also synchronize data across different systems by capturing changes in one database and publishing them to another using Debezium and Kafka. This ensures consistent data across multiple systems and avoids manual updates.

It’s time to bring up the CDC stack

Deployment Components

Required components: ZooKeeper, Kafka, and Kafka Connect

Optional components: Kafka UI, Schema Registry (data serialization)

Docker Compose

version: '2'
services:
  zookeeper:
    image: quay.io/debezium/zookeeper:2.5
    ports:
      - 2181:2181
      - 2888:2888
      - 3888:3888
    networks:
      - debezium-article
  kafka:
    image: quay.io/debezium/kafka:2.5
    ports:
      - 9093:9093
    networks:
      - debezium-article
    depends_on:
      - zookeeper
    environment:
      ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: CLIENT:PLAINTEXT,EXTERNAL:PLAINTEXT
      KAFKA_LISTENERS: CLIENT://:9092,EXTERNAL://:9093
      KAFKA_ADVERTISED_LISTENERS: CLIENT://kafka:9092,EXTERNAL://127.0.0.1:9093
      KAFKA_INTER_BROKER_LISTENER_NAME: CLIENT
  connect:
    image: 'confluentinc/cp-kafka-connect:7.5.3'
    networks:
      - debezium-article
    environment:
      CONNECT_BOOTSTRAP_SERVERS: kafka:9092
      CONNECT_REST_PORT: 8083
      CONNECT_GROUP_ID: kafka-connect-group
      CONNECT_CONFIG_STORAGE_TOPIC: kafka_connect-configs
      CONNECT_OFFSET_STORAGE_TOPIC: kafka_connect-offsets
      CONNECT_STATUS_STORAGE_TOPIC: kafka_connect-status
      CONNECT_KEY_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE: 'false'
      CONNECT_KEY_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schemaregistry:8085'
      CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
      CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE: 'false'
      CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: 'http://schemaregistry:8085'
      CONNECT_INTERNAL_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_INTERNAL_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
      CONNECT_REST_ADVERTISED_HOST_NAME: connect
      CONNECT_LOG4J_ROOT_LOGLEVEL: INFO
      CONNECT_LOG4J_LOGGERS: 'org.apache.kafka.connect.runtime.rest=WARN,org.reflections=ERROR'
      CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: '1'
      CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: '1'
      CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: '1'
      CONNECT_PLUGIN_PATH: /usr/share/java,/usr/share/confluent-hub-components,/usr/share/kafka/plugins
    volumes:
      - './debezium-connector-oracle:/usr/share/confluent-hub-components/debezium-connector-oracle'
      - './debezium-connector-jdbc:/usr/share/confluent-hub-components/debezium-connector-jdbc'
      - './ojdbc8.jar:/etc/kafka-connect/jars/ojdbc8.jar'
    ports:
      - '8083:8083'
    depends_on:
      - kafka
      - schemaregistry
    command:
      - bash
      - -c
      - |
        confluent-hub install --no-prompt debezium/debezium-connector-postgresql:2.2.1
        /etc/confluent/docker/run &
        sleep infinity
  kafka-ui:
    image: provectuslabs/kafka-ui:latest
    ports:
      - 8080:8080
    environment:
      DYNAMIC_CONFIG_ENABLED: true
    volumes:
      - ./kui-config.yaml:/etc/kafkaui/dynamic_config.yaml
    networks:
      - debezium-article
    depends_on:
      - kafka
      - connect
      - schemaregistry
  schemaregistry:
    image: confluentinc/cp-schema-registry:7.5.3
    ports:
      - 8085:8085
    depends_on:
      - kafka
    networks:
      - debezium-article
    environment:
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: PLAINTEXT://kafka:9092
      SCHEMA_REGISTRY_KAFKASTORE_SECURITY_PROTOCOL: PLAINTEXT
      SCHEMA_REGISTRY_HOST_NAME: schemaregistry
      SCHEMA_REGISTRY_LISTENERS: http://schemaregistry:8085
      SCHEMA_REGISTRY_SCHEMA_REGISTRY_INTER_INSTANCE_PROTOCOL: "http"
      SCHEMA_REGISTRY_LOG4J_ROOT_LOGLEVEL: INFO
      SCHEMA_REGISTRY_KAFKASTORE_TOPIC: _schemas
networks:
  debezium-article:
    driver: bridge
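
Once the connector plug-ins described in the next section are in place, you can start the stack. A minimal example, assuming the Compose file above is saved as docker-compose.yaml in a directory named debezium-article (the container and network names used later in this article assume that project name):

docker compose up -d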

Download dependencies

  1. Download the Debezium Oracle connector plug-in archive.
  2. Extract the files into your Kafka Connect environment.
  3. Download the JDBC driver for Oracle from Maven Central and extract the downloaded driver file into the directory that contains the Debezium Oracle connector JAR files.
  4. For further information, you can refer to the Debezium Oracle connector documentation page.
  5. Download the Debezium JDBC connector plug-in archive.
  6. Extract the files into your Kafka Connect environment (a combined shell sketch of these steps follows below).
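
A hypothetical shell sketch of the download steps, assuming Debezium 2.5.0.Final and ojdbc8 21.6.0.0 (adjust the versions and URLs to the releases you actually use); the target directories match the volumes mounted in the Compose file above:

# Debezium Oracle connector plug-in (extracted into ./debezium-connector-oracle)
curl -LO https://repo1.maven.org/maven2/io/debezium/debezium-connector-oracle/2.5.0.Final/debezium-connector-oracle-2.5.0.Final-plugin.tar.gz
tar -xzf debezium-connector-oracle-2.5.0.Final-plugin.tar.gz

# Debezium JDBC connector plug-in (extracted into ./debezium-connector-jdbc)
curl -LO https://repo1.maven.org/maven2/io/debezium/debezium-connector-jdbc/2.5.0.Final/debezium-connector-jdbc-2.5.0.Final-plugin.tar.gz
tar -xzf debezium-connector-jdbc-2.5.0.Final-plugin.tar.gz

# Oracle JDBC driver, placed next to the Oracle connector JARs and in the project root
curl -L -o ojdbc8.jar https://repo1.maven.org/maven2/com/oracle/database/jdbc/ojdbc8/21.6.0.0/ojdbc8-21.6.0.0.jar
cp ojdbc8.jar debezium-connector-oracle/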

Source & Target Databases

For Oracle:

docker run -d \
--name dbz_oracle21 \
-p 1521:1521 \
-e ORACLE_SID=ORCLCDB \
-e ORACLE_PDB=ORCLPDB1 \
-e ORACLE_PWD=oraclepw \
container-registry.oracle.com/database/enterprise:latest
#########################
DATABASE IS READY TO USE!
#########################

The log message above indicates that everything is OK. Then you can enable archive logging and minimal supplemental logging.

You need to enable archive logging and supplemental logging using SQL commands. To do this, you’ll need to be connected to the database through SQL*Plus as the SYS user (be careful, this is a powerful account!).

docker exec -it -e ORACLE_SID=ORCLCDB dbz_oracle21  sqlplus sys as sysdba

To enable ARCHIVELOG mode status, execute the following SQL commands:

SQL> Shutdown
SQL> Startup mount
SQL> Alter database archivelog;
SQL> alter database open;

To check the ARCHIVELOG mode status, enter the following command; the output should look similar to this:

SQL> archive log list

Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     5
Next log sequence to archive   7
Current log sequence           7

To enable Minimal Supplemental Logging, run the following SQL command:

ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
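
To verify both settings, you can query the v$database view (a minimal check); LOG_MODE should report ARCHIVELOG and SUPPLEMENTAL_LOG_DATA_MIN should report YES:

SELECT log_mode, supplemental_log_data_min FROM v$database;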

User Setup

To monitor change events, the Debezium connector needs to establish a JDBC connection to the database and run various LogMiner APIs. This requires a user account with specific permissions to use these LogMiner APIs and to collect information from the tables where changes are recorded.

To create the tablespaces and a user with the required privileges, run the following SQL statements.

CONNECT sys/oraclepw@ORCLCDB as sysdba;
CREATE TABLESPACE logminer_tbs DATAFILE '/opt/oracle/oradata/ORCLCDB/logminer_tbs.dbf'
SIZE 25M REUSE AUTOEXTEND ON MAXSIZE UNLIMITED;

CONNECT sys/oraclepw@ORCLPDB1 as sysdba;
CREATE TABLESPACE logminer_tbs DATAFILE '/opt/oracle/oradata/ORCLCDB/ORCLPDB1/logminer_tbs.dbf'
SIZE 25M REUSE AUTOEXTEND ON MAXSIZE UNLIMITED;

CREATE USER c##dbzuser IDENTIFIED BY dbz DEFAULT TABLESPACE LOGMINER_TBS
QUOTA UNLIMITED ON LOGMINER_TBS
CONTAINER=ALL;

GRANT CREATE SESSION TO c##dbzuser CONTAINER=ALL;
GRANT SET CONTAINER TO c##dbzuser CONTAINER=ALL;
GRANT SELECT ON V_$DATABASE TO c##dbzuser CONTAINER=ALL;
GRANT FLASHBACK ANY TABLE TO c##dbzuser CONTAINER=ALL;
GRANT SELECT ANY TABLE TO c##dbzuser CONTAINER=ALL;
GRANT SELECT_CATALOG_ROLE TO c##dbzuser CONTAINER=ALL;
GRANT EXECUTE_CATALOG_ROLE TO c##dbzuser CONTAINER=ALL;
GRANT SELECT ANY TRANSACTION TO c##dbzuser CONTAINER=ALL;
GRANT SELECT ANY DICTIONARY TO c##dbzuser CONTAINER=ALL;
GRANT LOGMINING TO c##dbzuser CONTAINER=ALL;

GRANT CREATE TABLE TO c##dbzuser CONTAINER=ALL;
GRANT LOCK ANY TABLE TO c##dbzuser CONTAINER=ALL;
GRANT CREATE SEQUENCE TO c##dbzuser CONTAINER=ALL;

GRANT EXECUTE ON DBMS_LOGMNR TO c##dbzuser CONTAINER=ALL;
GRANT EXECUTE ON DBMS_LOGMNR_D TO c##dbzuser CONTAINER=ALL;

GRANT SELECT ON V_$LOG TO c##dbzuser CONTAINER=ALL;
GRANT SELECT ON V_$LOG_HISTORY TO c##dbzuser CONTAINER=ALL;
GRANT SELECT ON V_$LOGMNR_LOGS TO c##dbzuser CONTAINER=ALL;
GRANT SELECT ON V_$LOGMNR_CONTENTS TO c##dbzuser CONTAINER=ALL;
GRANT SELECT ON V_$LOGMNR_PARAMETERS TO c##dbzuser CONTAINER=ALL;
GRANT SELECT ON V_$LOGFILE TO c##dbzuser CONTAINER=ALL;
GRANT SELECT ON V_$ARCHIVED_LOG TO c##dbzuser CONTAINER=ALL;
GRANT SELECT ON V_$ARCHIVE_DEST_STATUS TO c##dbzuser CONTAINER=ALL;
GRANT SELECT ON V_$TRANSACTION TO c##dbzuser CONTAINER=ALL;
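
As an optional sanity check, you can verify that the new account is able to log in from inside the container (a minimal example; the connection string assumes the defaults used above):

docker exec -it dbz_oracle21 sqlplus c##dbzuser/dbz@//localhost:1521/ORCLCDB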

Generate Changes On DB

Create sample data on the Oracle DB using the following SQL statements.

CREATE TABLE CUSTOMERS (
    CUSTOMER_ID NUMBER GENERATED BY DEFAULT ON NULL AS IDENTITY PRIMARY KEY,
    FIRST_NAME VARCHAR2(50) NOT NULL,
    LAST_NAME VARCHAR2(50) NOT NULL,
    EMAIL VARCHAR2(100) UNIQUE NOT NULL,
    PHONE_NUMBER VARCHAR2(15),
    REGISTRATION_DATE TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    LAST_UPDATE TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

INSERT INTO CUSTOMERS (FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER) VALUES ('John', 'Doe', 'john.doe@example.com', '123-456-7890');
INSERT INTO CUSTOMERS (FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER) VALUES ('Jane', 'Smith', 'jane.smith@example.com', '098-765-4321');
INSERT INTO CUSTOMERS (FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER) VALUES ('Alice', 'Johnson', 'alice.johnson@example.com', '555-123-4567');
INSERT INTO CUSTOMERS (FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER) VALUES ('Bob', 'Brown', 'bob.brown@example.com', '123-543-4567');
COMMIT;
ALTER TABLE CUSTOMERS ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
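
Later, once the source connector (created below) is running, you can also generate update and delete events on the same table; a couple of illustrative statements:

UPDATE CUSTOMERS SET PHONE_NUMBER = '111-222-3333', LAST_UPDATE = CURRENT_TIMESTAMP WHERE EMAIL = 'john.doe@example.com';
DELETE FROM CUSTOMERS WHERE EMAIL = 'bob.brown@example.com';
COMMIT;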

Let’s capture some changes using Debezium

Before creating the source connector, you need to connect the Oracle container to the debezium-article network in your Docker context. You can do so by running the script below.

docker network connect debezium-article_debezium-article dbz_oracle21

Source Connector

{
  "name": "source-connector",
  "config": {
    "connector.class": "io.debezium.connector.oracle.OracleConnector",
    "database.dbname": "ORCLCDB",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "oracledb.history",
    "database.hostname": "dbz_oracle21",
    "database.password": "dbz",
    "database.pdb.name": "ORCLPDB1",
    "database.port": "1521",
    "database.server.name": "ORCLCDB",
    "database.user": "c##dbzuser",
    "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
    "schema.history.internal.kafka.topic": "schema.history",
    "schema.history.internal.store.only.captured.tables.ddl": "true",
    "schema.history.internal.store.only.captured.databases.ddl": "true",
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "key.converter.schema.registry.url": "http://schemaregistry:8085",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://schemaregistry:8085",
    "table.include.list": "C##DBZUSER.CUSTOMERS",
    "tasks.max": "1",
    "topic.prefix": "cdc"
  }
}

To create the connector above, you can execute the script below or create it on the Kafka UI app.

curl -i -X POST -H "Accept:application/json" \
-H "Content-Type:application/json" \
localhost:8083/connectors \
-d @source-connector.json
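
You can also verify that the connector and its task are healthy through the Kafka Connect REST API; both should report a RUNNING state:

curl -s localhost:8083/connectors/source-connector/status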

If the creation is successful, the Kafka Connect container will start performing a snapshot of the data in the CUSTOMERS table. We can confirm this by using the Kafka Avro console consumer tool and reading the topic’s contents in the local terminal.

docker exec -it debezium-article-connect-1 /bin/kafka-avro-console-consumer \
--bootstrap-server debezium-article-kafka-1:9092 \
--from-beginning \
--topic cdc.C__DBZUSER.CUSTOMERS \
--property schema.registry.url="http://debezium-article-schemaregistry-1:8085"
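
Each record is printed as JSON on a single line; a snapshot record for the first row should look roughly like the following (simplified and reformatted here; nullable fields may appear wrapped in Avro union types, and the source and timestamp metadata are omitted):

{
  "before": null,
  "after": {
    "CUSTOMER_ID": 1,
    "FIRST_NAME": "John",
    "LAST_NAME": "Doe",
    "EMAIL": "john.doe@example.com",
    "PHONE_NUMBER": "123-456-7890"
  },
  "op": "r"
}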

Let’s sink changes to PostgreSQL

You can easily create a Postgres instance in your local environment using the script below.

docker run --name postgres -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d debezium/postgres:16
docker network connect debezium-article_debezium-article postgres
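
To run the SQL that follows, you can open a psql shell inside the container (a minimal example, using the default postgres user and database):

docker exec -it postgres psql -U postgres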

Then, you can create the customers table by executing the script below.

CREATE TABLE customers (
    customer_id SERIAL PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    phone_number VARCHAR(15),
    registration_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    last_update TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

After creating the table, we can start sinking our CDC data to PostgreSQL.

{
  "name": "sink-connector",
  "config": {
    "connector.class": "io.debezium.connector.jdbc.JdbcSinkConnector",
    "tasks.max": "1",
    "connection.url": "jdbc:postgresql://postgres/postgres",
    "connection.username": "postgres",
    "connection.password": "postgres",
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "key.converter.schema.registry.url": "http://schemaregistry:8085",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://schemaregistry:8085",
    "dialect.name": "PostgreSqlDatabaseDialect",
    "insert.mode": "upsert",
    "primary.key.mode": "record_value",
    "primary.key.fields": "CUSTOMER_ID",
    "schema.evolution": "basic",
    "database.time_zone": "UTC",
    "topics": "cdc.C__DBZUSER.CUSTOMERS",
    "transforms": "changeTopicName",
    "transforms.changeTopicName.type": "org.apache.kafka.connect.transforms.RegexRouter",
    "transforms.changeTopicName.regex": "cdc.C__DBZUSER.(.*)",
    "transforms.changeTopicName.replacement": "$1"
  }
}

To create the connector above, you can execute the script below or create it on the Kafka UI app.

curl -i -X POST -H "Accept:application/json" \
-H "Content-Type:application/json" \
localhost:8083/connectors \
-d @sink-connector.json

If the creation is successful, the Kafka Connect container will start sinking CDC data to the CUSTOMERS table in the public schema. We can confirm this by querying the customers table.

SELECT * FROM public.customers;

Conclusion

In summary, we explored the fundamental concepts of change data capture (CDC) and how Debezium facilitates a seamless database transition from a monolithic to a microservices architecture. By leveraging powerful tools like Debezium and Kafka, organizations can achieve real-time data synchronization and event-driven communication between services. With careful planning and the right tools, the transition to a microservice architecture can be a game-changer for your organization.
