Data Integration with Ballerina
What is Ballerina?
Ballerina is an open source, concurrent programming language with both textual and graphical representations. It is designed mainly for seamless integration of networked applications. Ballerina is strongly and statically typed, with a powerful type system. [1]
What is Data Integration?
The importance of data integration is apparent to anyone who’s spent time fetching information from multiple systems for a basic report. Data integration involves combining data from several disparate sources, which are stored using various technologies, and providing a unified view of that data. Today, data integration is an increasingly important and integral part of almost any business or process integration scenario. The term covers several distinct sub-areas, such as:
- Data Warehousing: aggregating structured data from one or more sources so that it can be compared and analyzed for greater business intelligence.
- Data Migration/Transformation (ETL): transferring data from one system to another while changing its format, storage, database, or application.
- Enterprise Application/Information Integration: establishing consistency among systems and providing a unified view of data from different sources.
- Master Data Management: consistently managing non-transactional data.
Why Use Ballerina for Data Integration?
The Ballerina language is designed specifically for the integration domain, and it makes data integration faster and easier for the following reasons.
Ballerina Type System
In most traditional programming languages, SQL result sets, JSON data, XML data, and so on are not treated as first-class types. To use or manipulate such data, we have to rely on external libraries or add-ons. Ballerina, by contrast, is designed with a sophisticated type system that has first-class support for different data types and formats, so users can generate, manipulate, and convert data from one type to another easily and with fewer lines of code. The following basic types in Ballerina are capable of handling different kinds of data.
- Value types: int, float, string, boolean, blob
- table: represents tabular data in Ballerina (for example, the data in a result set returned from a SQL query)
- json: built-in type for representing JSON data
- xml: built-in type for representing XML data
- record: lets users define their own types
- array: an array of data
- map: key-value pairs
The table, json, xml, and record types are especially useful in data integration scenarios. In Ballerina, a table can be converted directly into the xml or json type, and a table can be mapped to a record type so that each row of tabular data becomes one record. In addition, json, map, and record are interoperable types, and casting or conversion makes it easy to transform data between them.
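To make the idea of mapping tabular data to records and JSON concrete, here is a minimal, language-neutral sketch written in Python. It is not Ballerina syntax and does not use any Ballerina API; the `Employee` type, the column names, and the sample rows are all hypothetical and exist only for illustration.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical tabular data standing in for a SQL result set.
COLUMNS = ("id", "name", "salary")
ROWS = [(1, "Alice", 5200.0), (2, "Bob", 4800.5)]


@dataclass
class Employee:
    """Hypothetical user-defined record type: one instance per table row."""
    id: int
    name: str
    salary: float


def table_to_records(columns, rows):
    """Map each row of the table to one typed record."""
    return [Employee(**dict(zip(columns, row))) for row in rows]


def table_to_json(columns, rows):
    """Convert the whole table to a JSON array of objects."""
    return json.dumps([dict(zip(columns, row)) for row in rows])


records = table_to_records(COLUMNS, ROWS)
as_json = table_to_json(COLUMNS, ROWS)
record_as_map = asdict(records[0])  # record -> key-value map

print(records[0])
print(as_json)
print(record_as_map)
```

In a language without built-in table, json, and record types, this glue code has to be written or imported for every data source; the point made above is that Ballerina builds such conversions into the type system.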
Connector Support for Various Data Sources
Ballerina client endpoints are used to connect to external entities or APIs. For data integration purposes, Ballerina provides several SQL and NoSQL endpoints for interacting with tabular SQL and NoSQL data sources. Ballerina also provides extension mechanisms for writing custom native or Ballerina client endpoints, which can connect to any custom data source if required. As of version 0.970.0, Ballerina is equipped with the following data endpoints:
- JDBC Endpoint: built-in connector that connects to SQL-based tabular data sources via JDBC drivers. [2]
- MySQL Endpoint: SQL endpoint customized for MySQL. [3]
- H2 Endpoint: SQL endpoint customized for H2. [4]
- MongoDB Endpoint: connects to MongoDB and allows find operations as well as data manipulation operations such as update and delete. [5]
- Cassandra Endpoint: connects Ballerina to a Cassandra data source to update and select data. [6]
- Redis Endpoint: connects Ballerina to Redis data sources. [7]
Built-In Transaction Support
A Ballerina transaction is a series of data manipulation statements that must either fully complete or fully fail, leaving the system in a consistent state. The Ballerina language supports both local and distributed transactions for data and JMS connector actions. It provides syntax support for defining transaction boundaries and for handling transaction failures and retries. [8]
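The all-or-nothing behavior of a local transaction can be sketched as follows. This is an illustrative Python example using the standard `sqlite3` module and an in-memory database, not Ballerina's transaction syntax; the `account` table, the `transfer` function, and the balance rule are assumptions made up for the example, and retries are not shown.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as one all-or-nothing unit."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        # Both updates were rolled back together.
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
conn.commit()

ok = transfer(conn, 1, 2, 30.0)       # succeeds and is committed
failed = transfer(conn, 1, 2, 500.0)  # fails and is rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

The second transfer leaves no partial update behind: the debit and the credit either both take effect or neither does, which is the consistency guarantee described above.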
Data Transformation Capabilities
The transformer syntax in Ballerina is useful for defining custom transformations between different types, such as records and JSON values. Together with the data casting and conversion functionality, it is a key part of data integration scenarios.
Data Streaming Support
In Ballerina, table-to-json and table-to-xml type conversions result in streamed data. With data streaming, when a service client makes a request, the result is streamed to the client rather than being built in full on the server and then returned. This allows virtually unlimited payload sizes, and the client starts receiving the response almost immediately. The result set for a query is converted to XML or JSON row by row, and each row is written to the wire as soon as it is converted.
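The row-by-row conversion described above can be sketched in a few lines. This is a conceptual Python illustration, not Ballerina's implementation: `fake_result_set` is a hypothetical stand-in for a database cursor, and an in-memory `io.StringIO` buffer stands in for the network connection.

```python
import io
import json


def stream_rows_as_json(rows, columns, out):
    """Write rows to `out` as a JSON array, converting one row at a time.

    Only the current row is held in memory; the full result is never built.
    """
    out.write("[")
    for i, row in enumerate(rows):
        if i:
            out.write(",")
        out.write(json.dumps(dict(zip(columns, row))))
    out.write("]")


def fake_result_set(n):
    """Hypothetical stand-in for a database cursor: yields rows lazily."""
    for i in range(n):
        yield (i, f"user{i}")


buffer = io.StringIO()  # stands in for the wire
stream_rows_as_json(fake_result_set(3), ("id", "name"), buffer)
payload = buffer.getvalue()
print(payload)
```

Because the rows come from a generator and each one is written out as soon as it is converted, memory use on the server side stays roughly constant regardless of how many rows the query returns.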
Graphical Data Modeling With Composer
Composer is a tool for editing Ballerina programs both graphically and textually. The visual representation of Ballerina is based on the sequence diagram model, which gives the developer a clear view of the entire data integration flow.
Ability to Expose Data as Services via HTTP Service
The success of a business lies in its ability to integrate data from across the organization and analyze it to make more informed decisions, so convenient access to data is a key requirement in any data integration scenario. APIs make this data exposure possible, and REST is one of the most popular API styles for communicating with web, mobile, and cloud apps. The rich, fast, and easy HTTP REST service development support in Ballerina allows data to be exposed rapidly as services via REST APIs.
References:
[2] Ballerina JDBC Client Example
[3] Ballerina MySQL Client Example
[4] Ballerina H2 Client Example