An SQL Based rule engine

siddhesh jog
5 min readOct 24, 2019

--

Over the past year, we migrated our monolithic platform into a more component-based one. We were able to modularize the platform to the extent that logical service functions are now performed by individual API modules.

Thus there are individual modules to perform notification, label generation, etc. The old code had huge if-else code blocks for fine-tuning notifications (email/SMS), label generation, lifecycle tracking, etc.

When we migrated to new small individual modules we wanted to ditch these if-else blocks for something that was

  • maintainable: easy to read, easy to modify and incrementally test
  • performant: would execute quickly without straining underlying infrastructure resources.
  • flexible: could be used across multiple modules without significant change.
  • cost-efficient: not be a burden on our wallet.

With cost being a factor we first assessed Drools. While it was straight forward to write rules with Drools, it fell flat on the maintainability (it was/is difficult to test the rule engine execution) and performance (execution of rules often required large amounts of data fetched from the database and was slow) fronts.

Having broken down the platform into individual components, we could embed rules engines in each component and in that regard Drools rule engine, was flexible and of course quite cost-efficient.

We have since replaced it with an in-house SQL based rule engine.

Rules!

It essentially has three parts:

1) A database configuration table rule_config that holds all the rules with the following key fields:

  • rule_module: all related rules belong to the same module: example “SMS Rules”
  • rule_sql: the SQL that represents the rule and always ends with entity_id = :v_entity_id,
  • rule_description: Human-readable description of the rule so that we understand what the rule is for.
  • sort_order: to sort the rules.
  • rule_action_flag: to determine if this is a negation rule. Should the action be taken or not if the rule is a hit. For example, a rule for NOT sending an email.
rule configuration table.

The rule_sql always takes the following form:

SELECT count(e.id) 
FROM entity e
JOIN <some_joins_if_needed>
WHERE <some_conditions>
AND e.entity_id = :v_entity_id

When the rule SQL is executed for a particular entity_id its response is 0, if the rule does not hold good for that particular entity and > 0 if it does.

For example: consider a rule that selects all employees in the Math department

SELECT count(e.id) 
FROM employee e
JOIN employee_department ed ON e.id = ed.employee_id
JOIN department d ON d.id = ed.department_id and d.name = 'Math'
WHERE e.id = :v_entity_id

This rule was return 1 for all employees that belong to the department Math and 0 for those that do not. This rule could thus be used for sending tailored emails to employees of the Math department.

2) A stored procedure execute_rules, that executes the rule SQLs for a particular module for the given entity_id and returns the rule_id of the rule_id for that rule that got the “hit”.

execute_rules(IN p_rule_module, IN p_entity_id, OUT p_rule_id)

The procedure essentially first fetches all rule_config rows for a particular rule_module with ascending sort_order.

It then replaces the, “:v_entity_id” parameter with the specific entity id of the rule and fires the query.

SET v_sql = v_rule_config.rule_sql;SET @SQL = REPLACE(v_sql, ":v_entity_id", p_entity_id);PREPARE v_sql FROM @SQL;EXECUTE v_sql;

If the resultant count is > 0 then the rule has been hit and rule_id is returned from the stored procedure.

The stored procedure is a MUST. The queries themselves are quite inexpensive (if written correctly). Using JPA/Hibernate to execute each query though would lead to way too many I/O operations with the associated serialization-deserialization (This was still faster than Drools). The stored procedure allows for 1 journey to the database.

3) The third bit is a standalone spring java jar module with a RuleEngineService that calls the stored procedure and then fetches the entire RuleConfig object once the procedure returns the rule_id.

public class RuleEngineService {   public RuleConfig fireRules(String ruleModule, String parcelId){
log.info("Firing rules for module {}", ruleModule);
Long ruleId = ruleExecutor.executeRules(ruleModule, parcelId);
return ruleConfigRepository.findById(ruleId).get();
}
}

This jar is then added to individual services to embed the rule engine into the service.

The key though is that the rule engine only makes a decision. It does not execute the action. That is left to the rule invoking module.

Right off the bat, it feels like a step back in time. The stored procedure locks you in with the RDBMS as well as a specific DB engine.

But it has quite a few advantages and delivers on most of the objectives we had to start with:

  • Maintainability: It is quite easy to maintain rules, modify them and test them, with basic JUnit testing in the standalone rule jar for each module. It does need half decent query writing skills and needs technical skills to write new rules though.
  • Performance: There has been a major upgrade from a performance standpoint, more than halving the response time of the APIs where it has been used (as compared to drools). The stored procedure executes all SQLs in one go thus reducing the time and resources used for serialization-deserialization. We did try to hibernate to executes SQLs one by one from the java module but the time gain when using stored procedure was outstanding. Large data sets were fetched into the service for the Drools engine to make a decision. That is no longer required. It has allowed us to trim down the domain model inside the APIs/services; no need to fetch and store fields that are required to make the decision. Only fetch and store data required to provide the service (notification/label configuration etc). The database load has also gone down; with small queries dependant on IOPS rather than throughput.
  • Flexibility: easy to modify/add rules (a bit too easy); as all you need to do is alter the SQL. There is no sprint development time required. This was particularly problematic with Drools as changing rules required sprint development.

On the flip side:

  • Business team members (and even tech ) without SQL knowledge cannot update rules.
  • SQL Rules are susceptible to the introduction of bugs and need a strict regimen and process for modifying/adding rules.
  • You are effectively tied to the database. It would not be hard to change from MySQL to Oracle/Postgres etc. But changing to a NoSQL database is now harder.
  • If there are 100s of rules within a module, the engine may run into performance.

To summarize; The SQL rule engine is a great fit for use cases where rules run its 10s rather than in the 100s or 1000s. It has heavy tech dependency but is quite maintainable; super flexible and performant.

It has met the goals that it was built for and has allowed us to rapidly change/add rules and significantly improve the performance of rule-based services.

Code: https://github.com/tech-parcelpoint/rule-engine

--

--