We have a complex business application with several business rules. The application needs to be available in online mode (e.g. a responsive web application) and also in offline mode (e.g. a mobile app where connectivity cannot be guaranteed). In offline mode, the same business rules must be respected. When the device regains connectivity, it should seamlessly synchronize with the server.
For this problem, we can assume that
- The data model is striped in such a fashion that the appropriate related data can be replicated easily to the mobile device
- The business logic, if applied on the replicated local data, is acceptable per the constraints imposed by the application.
We want to create a solution where we do not duplicate the business logic, i.e. we do not want to implement the business rules twice: once in the toolset of the application server and again in the toolset of the mobile platform.
First, we need to appreciate that it is a complex problem — offline data management like this is difficult. Following are my thoughts on the subject — please provide your feedback.
- Where should the business logic live?
My suggestion is “in neither technology and in both places”. I consider the “business logic” an extension of the “data”. Just as the data will be replicated to the offline platform, the “business logic” will be replicated as well.
Traditionally we create business logic on the application server, where it may, for example, be compiled code of the form:
if ( condition 1 )
    do this;
else if ( condition 2 )
    do my SQL to add data to the SQL Server database.
The suggestion is to turn this code into data. Let’s say we put this code in a repository and call it my_logic_001, written in the interpreted language of our choice and stored as data:
if ( condition 1 )
    do this;
else if ( condition 2 )
    mySQLconn = <get connection: SQLite on mobile, otherwise SQL Server>
    Perform SQL operation against mySQLconn using ANSI SQL that works on both sides.
if ( running_on_offline_platform )
    enqueue this call with parameters   // discussed later
So now at the specific point in our UI (whether it is mobile/offline or web/online) we will simply invoke something that calls this logic by its name, e.g. my_logic_001 and pass the parameters that it needs.
So compiled code will be something like:
call "my_logic_001" (with parameters)
This “call” primitive API will be implemented on all relevant platforms. It may read the logic from a database table (SQL Server or SQLite), then either run the logic directly or perform just-in-time compilation and run the result.
- Our Business Logic Language should be able to call other business logic APIs as well
The “call” primitive API that allows compiled code to invoke business logic by querying a data store will be available in our business language as well. This will allow the code to be reused easily.
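As a rough sketch of the idea, assuming Python stands in for the interpreted business-logic language and SQLite for the logic store (the `logic_repo` table and its columns are illustrative, not part of the original design), the “call” primitive could look like this. Note that the stored logic is itself given access to `call`, which is what makes logic-to-logic reuse possible:

```python
import sqlite3

def make_repo():
    # The repository: business logic stored as data, keyed by name.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE logic_repo (name TEXT PRIMARY KEY, source TEXT)")
    return conn

def call(repo, name, **params):
    """Look up business logic by name and interpret it with the given parameters."""
    row = repo.execute(
        "SELECT source FROM logic_repo WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(f"no business logic named {name!r}")
    # Expose "call" inside the logic's scope so one rule can invoke another.
    scope = {"call": lambda n, **p: call(repo, n, **p), "params": params}
    exec(row[0], scope)          # "just in time" interpretation of the stored code
    return scope.get("result")

repo = make_repo()
repo.execute(
    "INSERT INTO logic_repo VALUES (?, ?)",
    ("my_logic_001", "result = params['x'] * 2"),
)
```

The compiled UI code then reduces to `call(repo, "my_logic_001", x=...)` on every platform; only the implementation of `call` differs per platform.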
- Business Logic SQL Engine
With that core assumption, we will provide some other basic services in the “business logic language” for example:
- An API to execute SQL. The API will detect the context, i.e. whether it is running in offline or online mode. Based on that it will construct an appropriate connection handle (to the local SQLite database or to the server-side SQL Server database). It will then execute the SQL and return the results. Since the data model is identical and the SQL is ANSI, this should not be a problem.
- If the SQL is DML (i.e. insert, update, or delete) and it is running in offline mode, it will also put that exact SQL statement in a queue table. All of these SQLs will then execute, in the order they were added, when there is connectivity (a concept discussed later).
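A minimal sketch of this SQL engine, again assuming Python and SQLite (the `sync_queue` table name and the `SqlEngine` class are illustrative):

```python
import sqlite3

DML_VERBS = ("insert", "update", "delete")

class SqlEngine:
    def __init__(self, conn, offline):
        # In a real setup the connection would be SQLite offline and
        # SQL Server online; here SQLite plays both roles.
        self.conn = conn
        self.offline = offline
        conn.execute(
            "CREATE TABLE IF NOT EXISTS sync_queue ("
            " seq INTEGER PRIMARY KEY AUTOINCREMENT, stmt TEXT)"
        )

    def execute(self, sql):
        cur = self.conn.execute(sql)
        if self.offline and sql.lstrip().lower().startswith(DML_VERBS):
            # Remember the exact statement so it can replay in original order.
            self.conn.execute(
                "INSERT INTO sync_queue (stmt) VALUES (?)", (sql,)
            )
        return cur.fetchall()

eng = SqlEngine(sqlite3.connect(":memory:"), offline=True)
eng.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
eng.execute("INSERT INTO customer VALUES (1, 'A')")
queued = eng.execute("SELECT stmt FROM sync_queue ORDER BY seq")
```

Only the DML lands in the queue; DDL and queries pass straight through.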
- Change Unit of Synchronization from SQL to API
Based on the complexity of the application — replicating each SQL may not be desirable as it may create an inconsistent data structure on the server. So, an alternate approach for more complex applications will be to enqueue the API call itself rather than the SQL.
For example, let’s say we have business logic called “createCustomer”, written in our interpreted language, and that the call is something like “createCustomer” (customer:A, name:B, etc.):
if ( rule 1 )
    do that;
callOurSQL ( "insert into customer ....." );
callOurSQL ( "insert somewhere else some other data ...." );
if ( calledInOfflineMode )
    enqueueTheCalledAPI ( "createCustomer( customer:A, name:B )" );
So rather than enqueuing the two SQLs, we enqueue the API call itself. We can move this logic (i.e. the calledInOfflineMode check) into our primitive API “call” so that it is abstracted away from the business logic.
This implies that when synchronization executes, it will commit after each of these calls, so the resulting server database will always be consistent.
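The shift from SQL-level to API-level queueing might be sketched like this, with the offline check folded into the call primitive as suggested (the `api_queue` table, the registry decorator, and `createCustomer` itself are illustrative names):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE api_queue (seq INTEGER PRIMARY KEY AUTOINCREMENT,"
             " api TEXT, params TEXT)")
conn.execute("CREATE TABLE customer (id TEXT, name TEXT)")

OFFLINE = True
REGISTRY = {}

def business_api(fn):
    # Register a rule so "call" can find it by name.
    REGISTRY[fn.__name__] = fn
    return fn

def call(api, **params):
    REGISTRY[api](**params)
    if OFFLINE:
        # Enqueue the whole call; its inner SQLs are NOT enqueued one by one.
        conn.execute("INSERT INTO api_queue (api, params) VALUES (?, ?)",
                     (api, json.dumps(params)))

@business_api
def createCustomer(customer, name):
    conn.execute("INSERT INTO customer VALUES (?, ?)", (customer, name))

call("createCustomer", customer="A", name="B")
```

During synchronization each queued row is re-invoked on the server and committed individually, which is what keeps the server database consistent after every step.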
- Offline Database Basic Concepts
We will establish a general rule that the data model of the offline database (e.g. SQLite) and the server-side database (e.g. SQL Server) will be 100% identical. While the offline database may have fewer rows, the structure will be identical.
We will have two such identical databases (from the point of view of the structure) on the offline side.
- DBOFFLINE — the database used by the application.
- DBSYNC — As described later, whenever the offline application finds connectivity it will attempt to download the new data. Because this may take some time, or the session may be abandoned due to a loss of connectivity, synchronization will update this database rather than the live one. Once this database has been successfully synched with the server, DBOFFLINE and DBSYNC will be swapped, after any in-flight transactions have been processed. This database may not be needed if we move toward the incremental data download strategy discussed later.
- Data Sync Thread
The offline solution will have a thread that constantly looks for a connection. As soon as it finds one, it will start to execute the SQLs that have been enqueued, in the order they were queued.
When the upload has finished, it will start the download process — this will update the DBSYNC database. Once the process successfully completes — it will
- Apply any enqueued SQL statements (i.e. DMLs that may have been executed during the synchronization process) to the DBSYNC database
- And then swap DBSYNC and DBOFFLINE.
- The new DBSYNC database will be completely wiped out so that it can be used in the next synchronization process.
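One synchronization pass over the two files might be sketched as follows. This is a local simulation only: the file names are illustrative, the upload step is assumed to have already run, and the server download is stubbed by a local `init()`; the swap-and-wipe is modeled as a single file replace:

```python
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
DBOFFLINE = os.path.join(workdir, "dboffline.db")
DBSYNC = os.path.join(workdir, "dbsync.db")

def init(path, rows):
    # Stand-in for "server builds a fresh, consistent database".
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE item (id INTEGER)")
    conn.executemany("INSERT INTO item VALUES (?)", [(r,) for r in rows])
    conn.commit()
    conn.close()

def sync_pass(pending_dml):
    # 1. Upload of the enqueued statements is assumed complete (not shown).
    # 2. Download: the server refreshes DBSYNC (stubbed locally here).
    init(DBSYNC, [1, 2, 3])
    # 3. Apply any DML enqueued while the download was running to DBSYNC.
    conn = sqlite3.connect(DBSYNC)
    for stmt in pending_dml:
        conn.execute(stmt)
    conn.commit()
    conn.close()
    # 4. Swap: DBSYNC becomes the live database; the old DBOFFLINE file is
    #    discarded, leaving a clean slot for the next pass's DBSYNC.
    os.replace(DBSYNC, DBOFFLINE)

init(DBOFFLINE, [1])
sync_pass(["INSERT INTO item VALUES (99)"])
```

A production version would run this on the background connectivity thread and coordinate the swap with in-flight transactions.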
- Replication — Basic Data Model Concepts
We need to keep the offline database manageable; we should not try to replicate the whole server database to the offline endpoint. Several possibilities may be explored:
1. Very small data-set to perform a very small, defined task.
For example, if we are picking an order, we may replicate the data for that specific pick list, along with the supporting data, to the offline data store. In that model we could mark the server-side data as checked out and simply disallow any further operations other than a later check-in. This works well when we can afford to have the data checked out by a single user, with no updates allowed during that time. In exchange, the synchronization process is greatly simplified.
This technique is often employed when integrating large-scale systems in supply chain processes. Orders from an ERP may be downloaded to a WMS (Warehouse Management System); the WMS may create pick lists and download them to a WCS (Warehouse Control System). Each time, the higher-level system typically marks the data to indicate that ownership has changed. It then limits the possible operations on that data until an upload is received from the lower-level system.
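The check-out rule can be expressed with a single conditional update, sketched here against an illustrative `pick_list` table (schema and column names are assumptions, not part of the original design):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pick_list (id INTEGER PRIMARY KEY,"
             " checked_out_by TEXT)")
conn.execute("INSERT INTO pick_list VALUES (7, NULL)")

def check_out(list_id, user):
    # Succeeds only if nobody currently holds the list; the WHERE clause
    # makes the ownership grab atomic.
    cur = conn.execute(
        "UPDATE pick_list SET checked_out_by = ?"
        " WHERE id = ? AND checked_out_by IS NULL",
        (user, list_id))
    return cur.rowcount == 1

assert check_out(7, "alice")      # first user takes ownership
assert not check_out(7, "bob")    # further operations are disallowed
```

Check-in would be the reverse: clear `checked_out_by` once the upload from the offline client has been accepted.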
2. Stripe the data and download a stripe
This approach suggests that we find a suitable unit in our data model that allows us to bring a subset of the data to the offline database. Email clients follow this model: the email server has emails for all users, but a specific email client gets only the emails addressed to you. The data is striped by email address, and each client downloads just its stripe.
This approach requires extensive discussion with the end users to define a suitable entity that can be used to stripe the data. In this model there can be multiple people who download the same stripe (just like your email may be downloaded to multiple email clients).
Another example may be if we are maintaining data for several farmers (i.e. their financial data, their crops, harvests, etc.) — we may consider the farmer as our unit of striping. This will allow multiple people at the farm to download the data and work with it in an offline mode.
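Using the farmer example, extracting one stripe is just a filtered copy; the `harvest` schema below is illustrative, and a real striping would cover every related table in the model:

```python
import sqlite3

server = sqlite3.connect(":memory:")
server.execute("CREATE TABLE harvest (farmer TEXT, crop TEXT, tons REAL)")
server.executemany("INSERT INTO harvest VALUES (?, ?, ?)", [
    ("farmer_a", "wheat", 10.0),
    ("farmer_a", "corn", 4.5),
    ("farmer_b", "rice", 7.0),
])

def download_stripe(farmer):
    # Only the requested stripe is copied to the offline database;
    # everyone else's rows stay on the server.
    return server.execute(
        "SELECT farmer, crop, tons FROM harvest WHERE farmer = ?",
        (farmer,)).fetchall()

stripe = download_stripe("farmer_a")
```

Several people at the same farm can each download this same stripe, just as one mailbox can be synched to multiple email clients.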
- Replication — Upload Concepts
This refers to the stage in replication when offline data is uploaded to the server. Based on which of the two models we chose, i.e. each SQL vs. each API, we will simply execute them (SQLs or APIs) in order. If we go with the API approach, the rules will be re-validated on the server. We must execute these in order, and if something fails we must stop at that point so that problems do not snowball.
If we go with the API approach, synchronization should not fail because of any business rules; the only reason for failure should be a technical glitch. In that case, rolling back the active transaction and trying again later should be sufficient.
A complexity may arise if the same data is uploaded from two different offline clients. This concern should be highlighted up front in the design process so that the end users can provide their preferred solution. The simplest approach is to say that the later transaction wins, where “later” means the time of the upload attempt, not when the offline transaction actually occurred. Another technique that may be incorporated is to not differentiate between “Add” and “Update”: the same basic API should be provided for both, which also simplifies posting such transactions.
In this model the APIs must be created to be extremely forgiving. For example, if we are managing inventory, then to support the offline model we must allow inventory to go negative. If our “shipper” offline user synchronizes before our “receiver” offline user, we should not start failing transactions simply because the inventory went negative.
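Both ideas, one API for Add and Update, and tolerance for out-of-order arrivals, can be sketched with a single upsert (the `inventory` table and `adjust_inventory` are illustrative; SQLite's ON CONFLICT upsert requires version 3.24+):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (sku TEXT PRIMARY KEY, qty INTEGER)")

def adjust_inventory(sku, delta):
    # One forgiving API for both Add and Update: insert the row if it is
    # new, otherwise adjust it. Negative quantities are deliberately allowed.
    conn.execute(
        "INSERT INTO inventory VALUES (?, ?)"
        " ON CONFLICT(sku) DO UPDATE SET qty = qty + excluded.qty",
        (sku, delta))

adjust_inventory("widget", -3)   # shipper syncs first: quantity dips below zero
adjust_inventory("widget", 10)   # receiver syncs later: quantity recovers
```

Because neither call can fail on a business rule, upload order between clients no longer matters for correctness of the final state.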
- Replication — Download Concepts
This refers to the stage in replication when online data is downloaded from the server. While the upload is incremental, making the download incremental in the same fashion is not simple, because there will typically be several offline clients, and keeping track of the state of each one is not straightforward. Nevertheless, it is possible. I will discuss both concepts here.
- For full replication, we will simply receive the data from the server. To streamline the process, it may be best for the server to create a complete SQLite database and send it compressed. This way the heavy lifting is done on the server, and the offline system receives a consistent dataset at the end.
- If full replication is not feasible, we can implement incremental replication as well. In that model, the queue that we implemented in the offline database will be present in the online database too: the server will perpetually maintain a queue of all executed API calls, in the proper order. When an offline client synchs, it keeps track of the last transaction number (its position in the queue) that it pulled, and the next synchronization starts after that transaction. Note that the offline client always performs its upload first, so the upload will itself have enqueued API calls; this is fine, as it is desirable for the offline client to redownload its own transactions.
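The incremental pull reduces to a query against the server-side queue plus a per-client checkpoint, sketched here with an illustrative `api_log` table (names and schema are assumptions):

```python
import sqlite3

server = sqlite3.connect(":memory:")
server.execute("CREATE TABLE api_log (seq INTEGER PRIMARY KEY AUTOINCREMENT,"
               " api TEXT)")
for api in ("createCustomer", "createOrder", "shipOrder"):
    server.execute("INSERT INTO api_log (api) VALUES (?)", (api,))

def pull_since(last_seq):
    # Resume strictly after the last transaction this client saw. The
    # client's own recent uploads reappear here, which is intended.
    rows = server.execute(
        "SELECT seq, api FROM api_log WHERE seq > ? ORDER BY seq",
        (last_seq,)).fetchall()
    new_checkpoint = rows[-1][0] if rows else last_seq
    return rows, new_checkpoint

# This client last pulled seq 1, so only the later calls come down.
changes, checkpoint = pull_since(1)
```

The client persists `checkpoint` locally and passes it back on the next synchronization, so each client advances through the perpetual queue at its own pace.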
Supporting offline data manipulation is a difficult problem, but as mobile platforms become ubiquitous and people want to incorporate them into real-world business use-cases, managing short-term interruptions becomes a basic requirement. In the business world we may be executing complex business processes where many business rules need to be respected. What I have described here is a possible approach to streamlining the problem of managing the business logic. Using an interpreted language for business problems is generally a good idea, as it provides a full framework for expressing the logic, rather than a complex data model contrived to support simple business rules.