Oracle GoldenGate Big Data Adapters: Use Cases, Architecture, and Implementation Demo

Trushant Bagate
6 min read · May 23, 2021

Background

Near-real-time and real-time data replication, continuous flow, and streaming: these are all terms used to describe a business’s need for quick access to data.

Organizations need to quickly access, analyze, and report on data across the enterprise in order to stay agile in a competitive market.

Data is increasingly an asset to companies. It adds value to a business, but it may be stored in any number of current and legacy systems, which makes it difficult to realize its full potential. This is what is known as Big Data: data that contains greater variety, arriving in increasing volumes and with ever-higher velocity.

Challenges

Business requirements are shifting from reactive to proactive and predictive, and businesses are eager to make use of Big Data applications.

The challenge is to bring together and make available a high volume of varied data (in different formats and from different sources) at high velocity, for actionable near-real-time insights.

Real Time Data Streaming into Big Data and Typical Use Cases

· Fraud Detection and Audit trail of Critical Data

· Real Time Reporting Capabilities

· Risk Analysis

· Building a data lake or Big Data reservoir

Solution

Oracle GoldenGate for Big Data offers a high-performance, fault-tolerant, easy-to-use, and flexible real-time data streaming platform for Big Data environments.

It extends a customer’s real-time data integration architecture to Big Data systems without impacting the performance of the source systems, and it enables timely business insight for better decision making.

The diagram below shows most of the possible use cases:

OGG BD Use cases

Oracle GoldenGate standard workflow

The diagram below shows the high-level architecture of standard Oracle GoldenGate replication:

Oracle GoldenGate workflow diagram

For an in-depth understanding of GoldenGate and its capabilities, see this very good blog post from Trushar: https://link.medium.com/muaYPT2VUeb

Introduction to Oracle GoldenGate Big Data Adapters

Oracle GoldenGate for Big Data 19c streams transactional data into Big Data and cloud systems in real time, without impacting the performance of source systems. It streamlines real-time data delivery into the most popular Big Data solutions, including Apache Hadoop, Apache HBase, Apache Hive, Confluent Kafka, NoSQL databases, Elasticsearch, JDBC, Oracle Cloud, Amazon Web Services, Microsoft Azure Cloud, Google Cloud Platform, and data warehouses, to facilitate improved insight and timely action.

OGG BD Key features

· Secure, reliable, and fault-tolerant data delivery

· Simple to install, configure, and maintain

· Captures real-time data from messaging systems such as JMS, and changed data from NoSQL databases such as Cassandra

· Streams real-time changed data into Big Data systems, real-time messaging systems, NoSQL databases, and cloud data warehouses

· Easily extensible and flexible, so changed data can be streamed to other Big Data targets and message queues such as JMS, Apache Kafka, and Amazon Kinesis

· Automatic data transformation from row-oriented format to nested, columnar, or compressed formats such as XML, JSON, Avro, Parquet, and ORC

· Process and analyze streaming data using an interactive user interface on a scalable and highly available clustered Spark environment using Oracle Stream Analytics

OGG BD Key Benefits

· Improves IT productivity in integrating with Big Data systems

· Uses real-time data in Big Data analytics for more timely and reliable insight

· Improves operations and customer experience with enhanced business insight

· Minimizes overhead on source systems to maintain high performance

OGG Big Data Adapters workflow

The diagram below shows the high-level architecture of OGG for Big Data:

Oracle GoldenGate BD Adapters workflow diagram

Demonstration of using OGG BD Adapter:

For this demo, we implemented replication from an Oracle database to the Confluent Kafka platform using Oracle GoldenGate Big Data Adapters.

The Oracle GoldenGate for Big Data Adapters contain built-in support for writing operation data from Oracle GoldenGate trail records into various Big Data targets, such as HDFS, HBase, and Kafka. The functionality is separated into handlers, which integrate with third-party applications, and formatters, which transform the data into various formats.

The steps below outline the standard procedure for configuring real-time replication from a source Oracle database to a target Confluent Kafka platform.
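To make the handler/formatter split concrete, here is a minimal Python sketch of the idea. It is not GoldenGate code: `TrailRecord`, `json_formatter`, and `InMemoryHandler` are hypothetical names invented for illustration, and the record layout is a simplified assumption rather than the actual trail format.

```python
import json
from typing import Callable, Dict, List

# Hypothetical, simplified stand-in for one operation read from a trail file.
TrailRecord = Dict[str, object]


def json_formatter(record: TrailRecord) -> str:
    """Formatter role: turn a row-oriented record into a JSON document."""
    return json.dumps(record, sort_keys=True)


class InMemoryHandler:
    """Handler role: deliver formatted messages to a target.

    A plain list stands in for the real target (a Kafka topic, an HDFS file, ...).
    """

    def __init__(self, formatter: Callable[[TrailRecord], str]) -> None:
        self.formatter = formatter
        self.delivered: List[str] = []

    def handle(self, record: TrailRecord) -> None:
        # The handler does not know how the message is encoded;
        # it only asks the formatter for the payload and delivers it.
        self.delivered.append(self.formatter(record))


# Illustrative insert operation (table and column names are made up).
record: TrailRecord = {
    "table": "DEMO.EMP",
    "op_type": "I",
    "after": {"ID": 1, "NAME": "Alice"},
}

handler = InMemoryHandler(json_formatter)
handler.handle(record)
print(handler.delivered[0])
```

Because the formatter is passed in as a plain function, the same handler could be paired with a different formatter without changing the delivery logic, which is the point of keeping the two concerns separate.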

System details

1. Source side details

2. Target side details

Configuration of Source Environment

1. Source OS details

2. Check the database-level prerequisites on the source database. Make sure all user-, database-, and table-level prerequisites are in place.

3. Table to be replicated using OGG

4. Create required extract and replicat processes at source side

Configuration of Target Environment

1. Target OS details

2. Check the Confluent Platform details

3. Install the required version of the Oracle GoldenGate Big Data Adapters and create the replicat process according to the environment

4. Update the properties files according to the Confluent Platform details

5. Sample handler properties file

6. Sample kafkaconnect properties file
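The original screenshots of the two properties files are not reproduced here. As a rough sketch of what such files typically contain, the Python snippet below renders a handler properties file and a producer properties file as text. The property names are taken from the Kafka Connect Handler documentation listed in the References; every value (broker address, converter choice, mode) is an illustrative assumption and not the configuration used in this demo, and `render_properties` and `missing_keys` are hypothetical helpers.

```python
from typing import Dict, List

# Handler settings for the replicat's properties file.
# Values are assumptions chosen for illustration only.
HANDLER_PROPS: Dict[str, str] = {
    "gg.handlerlist": "kafkaconnect",
    "gg.handler.kafkaconnect.type": "kafkaconnect",
    "gg.handler.kafkaconnect.kafkaProducerConfigFile": "kafkaconnect.properties",
    "gg.handler.kafkaconnect.mode": "op",
    "gg.handler.kafkaconnect.topicMappingTemplate": "${fullyQualifiedTableName}",
    "gg.handler.kafkaconnect.keyMappingTemplate": "${primaryKeys}",
}

# Producer settings for kafkaconnect.properties.
PRODUCER_PROPS: Dict[str, str] = {
    "bootstrap.servers": "localhost:9092",  # assumed broker address
    "acks": "1",
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "key.converter.schemas.enable": "false",
    "value.converter.schemas.enable": "false",
}

# Keys this sketch expects to find in the handler file.
EXPECTED_HANDLER_KEYS: List[str] = [
    "gg.handlerlist",
    "gg.handler.kafkaconnect.type",
    "gg.handler.kafkaconnect.kafkaProducerConfigFile",
]


def render_properties(props: Dict[str, str]) -> str:
    """Render a dict as Java-style key=value lines."""
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


def missing_keys(props: Dict[str, str], expected: List[str]) -> List[str]:
    """Return the expected keys that are absent from props."""
    return [key for key in expected if key not in props]


handler_text = render_properties(HANDLER_PROPS)
producer_text = render_properties(PRODUCER_PROPS)

print(handler_text)
print(producer_text)
print("missing:", missing_keys(HANDLER_PROPS, EXPECTED_HANDLER_KEYS))
```

In a real installation the handler file also needs `gg.classpath` pointing at the Kafka client libraries; that path depends on where the Confluent Platform is installed, so it is left out of this sketch.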

Testing Replication

1. The source table currently has no rows

2. The topic in Confluent Kafka has no data in it

3. Test insert replication

4. Source

5. Target

6. Test update replication

7. Source

8. Target

9. Test delete replication

10. Source

11. Target
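The three checks above boil down to replaying insert, update, and delete operations and confirming that the target reflects each one. The Python sketch below mimics that logic on an in-memory dictionary. The event shape (`op_type`, `key`, `after`) is a simplified, hypothetical layout; the `I`, `U`, and `D` codes match the operation keys described in the handler documentation, but the actual messages on the Kafka topic carry more fields.

```python
from typing import Dict, List

Row = Dict[str, object]


def apply_change(state: Dict[int, Row], event: Dict[str, object]) -> None:
    """Apply one change event to an in-memory copy of the table."""
    op = event["op_type"]
    key = event["key"]
    if op in ("I", "U"):
        # Insert and update both leave the row in its "after" image.
        state[key] = event["after"]
    elif op == "D":
        # Delete removes the row if it exists.
        state.pop(key, None)
    else:
        raise ValueError(f"unsupported op_type: {op!r}")


# Illustrative events in the order the demo tests them (made-up data).
events: List[Dict[str, object]] = [
    {"op_type": "I", "key": 1, "after": {"ID": 1, "NAME": "Alice"}},
    {"op_type": "U", "key": 1, "after": {"ID": 1, "NAME": "Alicia"}},
    {"op_type": "D", "key": 1},
]

state: Dict[int, Row] = {}
snapshots: List[Dict[int, Row]] = []
for event in events:
    apply_change(state, event)
    snapshots.append(dict(state))  # copy so later changes do not alter it

for event, snapshot in zip(events, snapshots):
    print(event["op_type"], snapshot)
```

Comparing each snapshot with the source table after the corresponding statement is the same comparison the source and target checks perform by hand.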

This completes the installation and configuration of replication from the source Oracle database to the target Confluent Kafka environment.

References

· Using the Kafka Connect Handler:

https://docs.oracle.com/en/middleware/goldengate/big-data/19.1/gadbd/using-kafka-connect-handler.html#GUID-81730248-AC12-438E-AF82-48C7002178EC

· Kafka Connect Handler Configuration:

https://docs.oracle.com/en/middleware/goldengate/big-data/19.1/gadbd/using-kafka-connect-handler.html#GUID-23F5CCE3-845C-43F0-A08E-42C2BD1824FB

· Kafka Connect Handler Performance Considerations:

https://docs.oracle.com/en/middleware/goldengate/big-data/19.1/gadbd/using-kafka-connect-handler.html#GUID-48998500-C606-4D29-B1FE-C7FFA243C20C

· OGG Big data certification matrix:

https://www.oracle.com/middleware/technologies/fusion-certification.html
