See: https://github.com/tspannhw/FLaNK-DataFlows/blob/main/jdbc/README.md
Reading from Apache Iceberg Tables with Cloudera DataFlow
Add a processor to your canvas to read the table, for example ExecuteSQLRecord 1.20.0.2.3.8.1-1, and name it ExecuteSQLRecord Impala. Any processor that uses a JDBC connection will work, such as:
- ExecuteSQL
- ExecuteSQLRecord
- QueryDatabaseTable
- QueryDatabaseTableRecord
Processor Settings
- Normalize Table/Column Names: true
- Use Avro Logical Types: true
- Query:
SELECT * FROM `default`.tim_syslog_critical_archive
Set all your parameters in the processor.
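To illustrate what the normalize setting above is for, here is a minimal Python sketch of the kind of renaming it performs so that column names are valid in an Avro record. The function name normalize_for_avro is hypothetical, and the rule shown (replace anything outside letters, digits, and underscore, then prefix a leading digit with an underscore) is an approximation rather than the processor's actual code.

```python
import re


def normalize_for_avro(name: str) -> str:
    """Approximate an Avro-safe version of a table or column name.

    Hypothetical helper for illustration only; it is not part of NiFi.
    """
    # Replace every character that is not a letter, digit, or underscore.
    normalized = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # Avro names may not start with a digit, so prefix an underscore.
    if normalized and normalized[0].isdigit():
        normalized = "_" + normalized
    return normalized


if __name__ == "__main__":
    for column in ["event.time", "host:port", "1st_seen", "priority"]:
        print(column, "->", normalize_for_avro(column))
```

Names that are already Avro-safe, such as priority, pass through unchanged, so enabling the setting should be harmless for tables with plain column names.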
Services Settings
First, add a connection pool controller service for your processor.
Then set the following parameters on the service.
- Service Name: DBCPConnectionPool Impala Iceberg
- Database Connection URL:
jdbc:impala://oss-kudu-demo-gateway.oss-demo.qsm5-opic.cloudera.site:443/;ssl=1;transportMode=http;httpPath=oss-kudu-demo/cdp-proxy-api/impala;AuthMech=3;
- Database Driver Class Name:
com.cloudera.impala.jdbc.Driver
- Database Driver Location(s):
#{Database Driver Location}
Set this parameter, then upload the driver JAR.
- Database User:
#{CDP Workload Username}
- Password:
#{CDP Workload User Password}
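The connection URL above packs several settings into one string. As a rough aid for checking it before pasting it into the service, the Python sketch below splits the URL into host, port, and the semicolon-separated properties. The function parse_impala_jdbc_url is a hypothetical helper, not part of any driver or of NiFi, and the comments on each property reflect my reading of the Cloudera Impala JDBC driver options, so verify them against the driver guide for your version.

```python
URL = (
    "jdbc:impala://oss-kudu-demo-gateway.oss-demo.qsm5-opic.cloudera.site:443/"
    ";ssl=1;transportMode=http;httpPath=oss-kudu-demo/cdp-proxy-api/impala;AuthMech=3;"
)


def parse_impala_jdbc_url(url: str) -> dict:
    """Split an Impala JDBC URL into host, port, schema, and properties.

    Hypothetical helper for illustration; it does not open a connection.
    """
    prefix = "jdbc:impala://"
    if not url.startswith(prefix):
        raise ValueError("not an Impala JDBC URL")
    rest = url[len(prefix):]
    # Everything before the first "/" is host:port.
    authority, _, tail = rest.partition("/")
    host, _, port = authority.partition(":")
    # An optional schema name precedes the first ";"; properties follow it.
    schema, _, props_part = tail.partition(";")
    properties = {}
    for item in props_part.split(";"):
        if not item:
            continue  # skip the empty piece left by the trailing ";"
        key, _, value = item.partition("=")
        properties[key] = value
    return {
        "host": host,
        "port": int(port) if port else None,
        "schema": schema,
        "properties": properties,
    }


parsed = parse_impala_jdbc_url(URL)

if __name__ == "__main__":
    print(parsed["host"], parsed["port"])
    # ssl=1              : encrypt the connection with TLS
    # transportMode=http : send requests over HTTP through the gateway
    # httpPath=...       : path of the Impala endpoint on the gateway
    # AuthMech=3         : authenticate with user name and password
    for key, value in parsed["properties"].items():
        print(f"{key} = {value}")
```

Because AuthMech=3 means user name and password authentication, the Database User and Password parameters in the service must both be set for this URL to work.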
Detailed Parameters
References
https://docs.cloudera.com/cdw-runtime/1.5.0/iceberg-how-to/topics/iceberg-data-types.html