Data Services with Ballerina

If you want data access in your SOA/MSA, then data services are the way to go. The idea is to create a data abstraction layer (DAL) that your other applications and services can use. That is, a data service gives you a generalized interface to the data you’re exposing, and gives access to it in a standard manner: through a well-understood protocol and a known data format. For example, a popular approach is to use JSON over HTTP/S.

Writing a data service is not just about creating CRUD operations. If that is all you’re doing, it is generally considered an anti-pattern, and you should strive to do something more useful in your data service. That may include data filtering, validation, transformations, transactions and so on. Basically, you need to do something intelligent for the data service to be useful overall. There are plenty of DSL-based data integration tools for these tasks. But sometimes you need more functionality and flexibility than a specific tool can provide. In these situations, you would usually turn to your trusty general-purpose programming language, such as Java, C# or Node.js, to get it done. But then again, most of these have a higher overhead in getting a basic data service up and running, and in maintaining it.

Ballerina is meant to avoid this boilerplate code and provide maximum agility for the developer. Ballerina is focused on integration scenarios, so writing data services comes very naturally to it. To support this, it contains first-class language constructs for services, endpoints, transactions, data security, and more.

Writing a CRUD Service

Okay, so I started by saying that just CRUD is not good. But this will show the basics you need to get a service up and running, and after that we can improve on it by adding some more interesting features. Here, we will be creating a RESTful service that consumes and produces JSON.

Ballerina supports SQL databases through JDBC, so any JDBC driver can be used to connect to an RDBMS. You simply have to download the JDBC driver jar and copy it to $BALLERINA_HOME/bre/lib. For example, when using the Debian/Ubuntu package, this location will be “/usr/lib/ballerina/ballerina-[version]/bre/lib/”. There are also connectors for other databases, such as MongoDB and Cassandra, at https://github.com/wso2-ballerina/.

Let’s first see how we create the HTTP service for our task. In Ballerina, services and endpoints are first class constructs in the language. A service is defined in the following way.

Listing 01: Ballerina HTTP CRUD Service Template

The above is a Ballerina service template for a CRUD data service. In the service, we define individual resources: “employee”, “employeeById”, “employeeInsert”, “employeeUpdate” and “employeeDelete”. The operations these resources represent should be self-explanatory; each is identified by its HTTP verb, its resource path and the payload it carries.

Ballerina services have data binding support in their resources, where path parameters and the payload can be mapped directly to the parameters of the resource. A path parameter example is shown in the “employeeById” resource, where the “id” section in the path is mapped to the “id” integer parameter of the resource. Also, in the “employeeInsert” resource, the body payload is mapped directly to the “Employee” record type. In this case, the incoming JSON is mapped to the fields of the Employee record. If we need a custom mapping, we can instead declare the raw JSON object as the parameter of the resource. (For more information on Ballerina services and their configuration properties, refer to the “HTTP/HTTPS” section of Ballerina by Example.)
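To make the data binding concrete, here is a minimal sketch of two such resources. It assumes the endpoint-based syntax of the 0.98x-era Ballerina releases (details differ between releases), and the “Employee” fields, the “/data” base path, the port and the endpoint name are hypothetical choices for illustration, not the original listing.

```ballerina
import ballerina/http;
import ballerina/log;

// Hypothetical record type; the fields are assumptions for illustration.
type Employee record {
    int id;
    string name;
    int age;
};

// Hypothetical listener endpoint name and port.
endpoint http:Listener dataServiceEP {
    port: 9090
};

@http:ServiceConfig {
    basePath: "/data"
}
service<http:Service> DataService bind dataServiceEP {

    // The "{id}" path segment is bound to the "id" integer parameter.
    @http:ResourceConfig {
        methods: ["GET"],
        path: "/employee/{id}"
    }
    employeeById(endpoint caller, http:Request req, int id) {
        json payload = { "id": id };
        http:Response res = new;
        res.setPayload(untaint payload);
        caller->respond(res) but {
            error e => log:printError("Error sending response", err = e)
        };
    }

    // The JSON request body is bound to the "emp" record parameter.
    @http:ResourceConfig {
        methods: ["POST"],
        path: "/employee",
        body: "emp"
    }
    employeeInsert(endpoint caller, http:Request req, Employee emp) {
        string name = untaint emp.name;
        http:Response res = new;
        res.setPayload("Received employee: " + name);
        caller->respond(res) but {
            error e => log:printError("Error sending response", err = e)
        };
    }
}
```

The resources here only echo what was bound, so the mapping itself is easy to see; a real data service would hand these values to the database layer.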

Accessing Databases

To access a database, we model the remote database as an endpoint. In Ballerina, we interact with any external network entity through an endpoint. The language is modeled this way to match the concept of using a sequence diagram to describe the actors, the objects and the messages passed between them.

Listing 02: Ballerina SQL Database Endpoint Declaration

An endpoint is used in Ballerina by invoking actions on it. There is a special syntax for calling the actions of an endpoint, which uses an arrow (->). This symbolizes the idea of making a network call through the endpoint.

Listing 03: Database Endpoint Usage in a Service Resource

The above code segment shows how the database endpoint can be used to execute an SQL query, with the given record type (“Employee”) and the arguments (“id”) for the query. This returns a union type of “table<Employee>|error”, which is resolved using a match statement. (For more information on using the database connectors, refer to the “Database” section of Ballerina by Example.)
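As a rough illustration of the endpoint declaration and its use inside a resource, here is a minimal sketch. It again assumes the 0.98x-era syntax with the “ballerina/jdbc” module; the JDBC URL, credentials, table columns and endpoint names are hypothetical, and this is not the original listing.

```ballerina
import ballerina/http;
import ballerina/jdbc;
import ballerina/log;

// Hypothetical record type matching the selected columns.
type Employee record {
    int id;
    string name;
    int age;
};

// Hypothetical connection details; adjust to your own database.
endpoint jdbc:Client db {
    url: "jdbc:mysql://localhost:3306/testdb",
    username: "root",
    password: "root",
    poolOptions: { maximumPoolSize: 5 }
};

endpoint http:Listener dataServiceEP {
    port: 9090
};

@http:ServiceConfig {
    basePath: "/data"
}
service<http:Service> DataService bind dataServiceEP {

    @http:ResourceConfig {
        methods: ["GET"],
        path: "/employee/{id}"
    }
    employeeById(endpoint caller, http:Request req, int id) {
        http:Response res = new;
        // "->" invokes the select action on the database endpoint;
        // "id" is passed as a query parameter, not concatenated.
        var ret = db->select("SELECT id, name, age FROM Employee WHERE id = ?",
                             Employee, id);
        match ret {
            table<Employee> tb => {
                // Converting the table to JSON can itself fail.
                var js = <json>tb;
                match js {
                    json j => {
                        res.setPayload(untaint j);
                    }
                    error e => {
                        res.statusCode = 500;
                        res.setPayload("Error converting result");
                    }
                }
            }
            error e => {
                res.statusCode = 500;
                res.setPayload("Error executing query");
            }
        }
        caller->respond(res) but {
            error e => log:printError("Error sending response", err = e)
        };
    }
}
```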

After this, the “caller” endpoint is used to respond to the client with the payload. This is something unique to Ballerina. In other frameworks such as JAX-RS/WS, you mostly just return a value from a function, and if there is a problem in writing the result back to the client, there is no place to handle it properly. Here we get that chance, since we send the response back explicitly in the code.

Data Security

When implementing a data service, we need to make sure we handle the data in a secure way, whether that means masking out confidential information, filtering data, or avoiding SQL injection attacks. A common amateur mistake is to generate SQL queries by concatenating argument values directly into the query string. Let’s rewrite the content of Listing 03 using this approach.

Listing 04: Invalid SQL Query Usage with Arguments

The code in Listing 04 has a clear SQL injection vulnerability. In a typical programming language, it is up to the developer to catch this and fix it properly. In Ballerina, the above code doesn’t even compile! The reason is the built-in taint analysis in the language. The SQL query string parameter is marked as “sensitive”. So unless the variable “id” is explicitly “untainted”, the query string derived from the concatenation is considered “tainted”, which makes it incompatible with the sensitive value expected as the SQL query string of the “select” operation.
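The contrast between the two styles can be sketched as follows. This is a minimal sketch under the same assumptions as before (0.98x-era syntax, hypothetical connection details and names); the rejected call is left as a comment so that the rest compiles.

```ballerina
import ballerina/http;
import ballerina/jdbc;
import ballerina/log;

type Employee record {
    int id;
    string name;
    int age;
};

// Hypothetical connection details.
endpoint jdbc:Client db {
    url: "jdbc:mysql://localhost:3306/testdb",
    username: "root",
    password: "root"
};

endpoint http:Listener dataServiceEP {
    port: 9090
};

service<http:Service> DataService bind dataServiceEP {

    @http:ResourceConfig {
        methods: ["GET"],
        path: "/employee/{id}"
    }
    employeeById(endpoint caller, http:Request req, string id) {
        // Rejected by taint analysis: "id" comes from the request, so the
        // concatenated string is tainted and cannot be used as the query.
        // var bad = db->select("SELECT * FROM Employee WHERE id = " + id,
        //                      Employee);

        // Accepted: the query text is a constant and "id" is a parameter.
        var ret = db->select("SELECT * FROM Employee WHERE id = ?",
                             Employee, id);
        http:Response res = new;
        match ret {
            table<Employee> tb => {
                res.setPayload("Query executed");
            }
            error e => {
                res.statusCode = 500;
                res.setPayload("Error executing query");
            }
        }
        caller->respond(res) but {
            error e => log:printError("Error sending response", err = e)
        };
    }
}
```

If a value really must be built into the query string, it has to be validated first and then marked explicitly with the “untaint” expression, which makes that decision visible in the code.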

What this shows is that Ballerina does its best to stop the developer from making such mistakes in the first place. This is the concept of “secure by default” followed throughout the Ballerina language.

Authentication / Authorization

Ballerina contains an authentication framework into which provider implementations can be plugged. The out-of-the-box providers include a file-based authentication provider, which lets you give text-based username/password/scope information in a configuration file, and the JWT authenticator. The JWT authenticator validates the token’s signature and maps its scopes to those of the Ballerina service in order to authenticate and authorize service calls. An LDAP-based authenticator is in the works now, and with this you will be able to plug in to any LDAP-based user store to do user authentication and authorization. For more information on the Ballerina authentication framework, please check https://ballerina.io/learn/how-to-write-secure-ballerina-code/#authentication-and-authorization.

Listing 05: Service with Authentication Configuration

The above service definition contains an authentication configuration, along with a transport-level secure endpoint for which the keystore and truststore information is given. With this configuration, the service declares that only users with the scope “scope1” are allowed to call it.
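A configuration of this shape might look like the following. This is only a sketch: it assumes the 0.98x-era “http:SecureListener” and “http:AuthProvider” types (the names of the secured listener and of the provider fields changed between releases), a file-based “config” auth store, and the default keystore and truststore shipped with the distribution. The service name, port and base path are hypothetical.

```ballerina
import ballerina/http;
import ballerina/log;

// Assumed: basic authentication backed by the configuration file.
http:AuthProvider basicAuthProvider = {
    scheme: "basic",
    authStoreProvider: "config"
};

// Assumed listener type; keystore/truststore paths point to the
// defaults bundled with the Ballerina distribution.
endpoint http:SecureListener secureEP {
    port: 9090,
    authProviders: [basicAuthProvider],
    secureSocket: {
        keyStore: {
            path: "${ballerina.home}/bre/security/ballerinaKeystore.p12",
            password: "ballerina"
        },
        trustStore: {
            path: "${ballerina.home}/bre/security/ballerinaTruststore.p12",
            password: "ballerina"
        }
    }
};

// Only authenticated users holding "scope1" may call this service.
@http:ServiceConfig {
    basePath: "/data",
    authConfig: {
        authentication: { enabled: true },
        scopes: ["scope1"]
    }
}
service<http:Service> SecureDataService bind secureEP {

    @http:ResourceConfig {
        methods: ["GET"],
        path: "/ping"
    }
    ping(endpoint caller, http:Request req) {
        http:Response res = new;
        res.setPayload("ok");
        caller->respond(res) but {
            error e => log:printError("Error sending response", err = e)
        };
    }
}
```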

Transactions

Ballerina transaction handling is designed to make it very convenient for the developer to define the operations, and simply mark the transaction boundary these operations belong to. Thereafter, Ballerina will simply give an all-or-nothing guarantee on the execution of all the operations in that group.

Listing 06: Ballerina Transaction Block Sample with SQL Connector

Listing 06 shows a transaction block in action. The operations there represent a scenario where the employee data is swapped between two employee ids. For this, we first read in the two employee records, swap the record ids, and then do separate SQL update operations to perform the swap. All of these operations need to happen in a single transaction to make sure we don’t end up in an inconsistent state. In Ballerina, all we need to do is wrap the SQL connector operations in a transaction block, and it automatically takes care of the internal details of committing or rolling back the transaction.
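The shape of such a transaction block can be sketched as below. This is a simplified sketch under the same 0.98x-era assumptions: it swaps only a hypothetical “name” column and takes the already-read values as arguments, leaving out the initial select operations. The function name, table, columns and connection details are illustrative and not taken from the original listing.

```ballerina
import ballerina/jdbc;
import ballerina/log;

// Hypothetical connection details.
endpoint jdbc:Client db {
    url: "jdbc:mysql://localhost:3306/testdb",
    username: "root",
    password: "root"
};

// Writes each employee's name under the other employee's id.
// Both updates are committed together or not at all.
function swapNames(int id1, string name1, int id2, string name2) {
    transaction with retries = 0 {
        var r1 = db->update("UPDATE Employee SET name = ? WHERE id = ?",
                            name2, id1);
        match r1 {
            int rows => {
                log:printInfo("Rows updated for first id: " + rows);
            }
            error e => {
                log:printError("First update failed", err = e);
                // Roll back and leave the transaction block.
                abort;
            }
        }
        var r2 = db->update("UPDATE Employee SET name = ? WHERE id = ?",
                            name1, id2);
        match r2 {
            int rows => {
                log:printInfo("Rows updated for second id: " + rows);
            }
            error e => {
                log:printError("Second update failed", err = e);
                // Also undoes the first update.
                abort;
            }
        }
    } onretry {
        log:printInfo("Retrying transaction");
    }
}
```

Note that an update action reports failure as a returned error value, so the sketch aborts explicitly when it sees one; how failures are signalled inside a transaction block is worth checking against the release you are using.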

Visualizing the Data Service

After our service logic is implemented, it is now time to show a preview of the sequence diagram flow of the program. This is one of the places where the power of Ballerina specific language constructs, such as services and endpoints, is shown.

Image 01: Database Select Operation Flow

Demo Run

The following are some sample curl commands to test the data service:

Summary

This write-up shows how a general data service can be implemented using Ballerina, and how the language gives full power to the developer while making sure the developer does the right operations at the right time.

The full source code for the Ballerina data services and the database scripts for both Oracle and MySQL can be found here.