In this article we will discuss the Debezium connector configuration needed to run change data capture (CDC) against a PostgreSQL database. We will not cover how to create a PostgreSQL database here; if you need help with that, you can follow this link.
So, the first thing to do is add or change the following parameters in the postgresql.conf configuration file. You can find more about that on the Debezium official page:
log_min_error_statement = fatal

# CONNECTION
listen_addresses = '*'          # change it to listen on all interfaces

# MODULES
shared_preload_libraries = 'decoderbufs'

# REPLICATION
wal_level = logical             # minimal, replica, or logical
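With PostgreSQL prepared, the connector itself is registered with Kafka Connect. The snippet below is a minimal sketch of a Debezium PostgreSQL connector configuration; the hostname, credentials, database name, and server name are placeholders, not values from this setup:

```json
{
  "name": "customers-postgres-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "plugin.name": "decoderbufs",
    "database.hostname": "postgres-host",
    "database.port": "5432",
    "database.user": "debezium",
    "database.password": "dbz-password",
    "database.dbname": "inventory",
    "database.server.name": "pgserver1",
    "table.include.list": "public.customers"
  }
}
```

Posting this JSON to the Kafka Connect REST endpoint (POST /connectors) starts the connector, which then streams changes from the tables listed in table.include.list to Kafka topics.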
Imagine the following scenario: we have a CUSTOMERS table in an Oracle database whose changes we need to capture instantly (CDC, change data capture) and monitor, so that the changed data can be analyzed in real time.
In our test environment, we set up a POC to validate this scenario. The purpose of this article is to describe how that POC was built and its outcome.
Apache Ranger is an awesome security tool for auditing and managing data security across the Hadoop ecosystem.
Using it, you can granularly grant or deny access through policies applied to the following services: HDFS, HBase, Hive, YARN, Knox, Solr, Kafka, Storm, and NiFi.
Moreover, Apache Ranger can provide:
- Centralized security management through a user interface or via its REST APIs.
- Fine-grained policies granting or denying specific permissions or operations on Hadoop components.
- A standardized authorization method across all Hadoop components.
- Different authorization methods, for example RBAC (role-based access control), attribute-based access control, and so on.
- Centralized auditing…
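To illustrate the REST API mentioned above, here is a small Python sketch that lists the policies of one Ranger service. The admin host, port, service name, and credentials are assumptions for illustration; Ranger's public v2 API exposes policies under /service/public/v2/api:

```python
# Sketch: listing Apache Ranger policies via the public v2 REST API.
# The host, service name, and credentials below are placeholders.
import base64
import json
import urllib.request

RANGER_URL = "http://ranger-admin:6080"  # hypothetical Ranger admin host


def build_policy_request(service_name: str, user: str, password: str) -> urllib.request.Request:
    """Build an authenticated GET request for all policies of one service."""
    url = f"{RANGER_URL}/service/public/v2/api/service/{service_name}/policy"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={
        "Authorization": f"Basic {token}",  # Ranger admin uses HTTP basic auth
        "Accept": "application/json",
    })


def list_policies(service_name: str, user: str = "admin", password: str = "admin"):
    """Fetch and decode the policy list (requires a reachable Ranger admin)."""
    req = build_policy_request(service_name, user, password)
    with urllib.request.urlopen(req) as resp:  # network call
        return json.load(resp)
```

Each returned policy describes the resources it covers and the allow/deny conditions per user or group, which is what the UI edits under the hood.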
Apache Sqoop is a versatile and very useful tool when it comes to gathering data for your Big Data project.
In my case, we developed several critical processes, for example importing valuable information from the Oracle RDBMS.
We need to extract essential information and generate insights so that company executives can make important decisions.
How can we incorporate this data into our Data Lake or into ETL processes so that they generate reports for decision makers?
With Apache Sqoop this becomes an easy task, and you can find plenty of information on how to use the tool on the web…
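As an illustration, a typical Sqoop import from Oracle into HDFS looks like the sketch below; the JDBC URL, username, table, and target directory are hypothetical and would need to match your environment:

```shell
# Import the CUSTOMERS table from Oracle into HDFS as Parquet.
# Connection string, user, and paths are placeholders.
sqoop import \
  --connect jdbc:oracle:thin:@//oracle-host:1521/ORCLPDB1 \
  --username APP_USER \
  -P \
  --table CUSTOMERS \
  --target-dir /data/raw/customers \
  --num-mappers 4 \
  --as-parquetfile
```

The -P flag prompts for the password instead of exposing it on the command line, and --num-mappers controls how many parallel map tasks split the import.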