Deploy AI models for real-time inferencing in your z/OS IMS transactions — IBM Watson Machine Learning for z/OS v2.4 new feature

Shuangyu
12 min read · Dec 19, 2022


My previous post, “Deploy Deep Learning model for real-time fraud detection in 2 hours — IBM Machine Learning for z/OS introduction”, covered the deep learning inferencing capability on IBM System Z and showed how IBM Watson Machine Learning for z/OS (WMLz) helps you quickly integrate deep learning ONNX models into your existing z/OS applications for real-time insights. This article shows how to further optimize end-to-end inferencing performance in IMS online transactions, and how to minimize the application impact, by using the new scoring capability delivered in the WMLz v2.4 update in Q4 2022.

As an enterprise machine learning solution that runs on IBM Z, WMLz provides numerous features that simplify the deployment of machine learning and deep learning models into your transactional applications on z/OS, allowing real-time insights without compromising security, performance or resiliency. WMLz offers several online scoring approaches, including RESTful APIs, Java APIs and the CICS LINK command, so that each type of z/OS application can use the real-time inferencing interface that fits it best. These interfaces work well for many applications, but none of them is ideal for transactional applications processed by IMS Transaction Manager.

Before the WMLz v2.4 update in Q4 2022, the RESTful interface was the only option available to IMS transactions. In the following architecture flow, the blue box on the left is the IMS Message Processing Region, where online transactions are processed, and the green box on the right is the WMLz scoring server, where the AI model is deployed. The IMS COBOL program sends RESTful requests to the WMLz scoring server to embed AI model inferencing in the online transactions.

Calling WMLz for real-time inferencing in IMS transactions using the RESTful interface

This is technically feasible, but it has the following deficiencies:

  1. The IMS application and the WMLz scoring server communicate over TCP/IP, which is not the most efficient path. For security reasons, HTTPS must be used and token validation is required on the WMLz scoring server side, which adds further overhead to the communication.
  2. Calling a RESTful endpoint from an IMS COBOL application is not trivial to program without z/OS Connect.

The optimized scoring interface using WOLA

To address the performance and application development challenges of the RESTful interface, the recent WMLz v2.4 update adds an enhancement (see Prerequisites and Maintenance for IBM Watson Machine Learning for z/OS for details): a new scoring interface that enables deeper integration between IMS and WMLz by using WebSphere Optimized Local Adapter (WOLA).

What is WOLA?

WebSphere Optimized Local Adapter (WOLA) is a function of WAS for z/OS and Liberty for z/OS that allows fast, low-latency memory-to-memory exchanges between Java applications in WAS/Liberty on z/OS and programs running in external address spaces such as CICS, IMS and batch. It is well suited to communication between programs where high throughput is required. WOLA is fast because data is exchanged by copying messages as byte arrays from one memory location to another, which avoids the overhead of going through the network or other exchange protocols. In summary, WOLA is an efficient and secure way to exchange data: it uses less CPU per exchange and does not expose the data to network security concerns.

How does the WOLA scoring interface work?

The following architecture diagram illustrates how the WOLA scoring interface works. Once configured, the IMS COBOL application running in the IMS Message Processing Region can use the WOLA API to invoke the WMLz scoring server for real-time inferencing. The information needed for inferencing, such as the model deployment ID and the record to be scored, is transferred from the IMS dependent region address space to the WMLz scoring server address space through the shared memory created by WOLA. The scoring output travels back the same way. This memory-to-memory communication is much more efficient than the RESTful interface, and it is also more secure because the transferred data is never exposed to the network. Programming is simpler as well: as a z/OS application developer, you invoke the AI model through a native callable service of the kind you already know. With the WOLA scoring interface, the application impact of infusing AI is reduced to a minimum.

Calling WMLz for real-time inferencing in IMS application using the WOLA interface

Using the WOLA APIs for AI model scoring — the first example

Now, let’s use an example to demonstrate how you can easily deploy a fraud detection model trained anywhere to WMLz and use that in your IMS COBOL application to detect fraudulent transactions before they are committed.

Step 1: Configure a scoring service with WOLA enabled

Some setup is needed to enable the WOLA interface of your scoring server so that it can accept inbound requests from IMS online transactions. The key configuration steps are:

  1. Configure and start the Liberty angel process.
  2. Create z/OS SAF profiles required by WOLA and grant permissions.
  3. Configure WOLA as an external subsystem in IMS ESAF.
  4. Add the WOLA Load library to the IMS control region and dependent region startup PROCs.

See Configuring a WML for z/OS scoring service with WOLA enabled for the detailed instructions.

Once the setup is complete, define and start a standard scoring service from the WMLz Administration Dashboard user interface. In this example, a scoring server named ALNWOLA is created and started successfully.

Define and start a scoring service

In the log of the scoring server, check and confirm the existence of message CWWKB0501I which indicates the server is started with WOLA enabled. The CWWKB0501I message shows that the scoring service is registered to the optimized local adapter channel using a three-part name: ALN SCORING <serverName_in_uppercase>. The COBOL application can use this three-part name to register to the scoring server before sending inferencing requests.

CWWKB0501I: The WebSphere Optimized Local Adapter channel registered with the Liberty profile server
using the following name: ALN SCORING ALNWOLA
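
If you want to confirm the registration name programmatically rather than by eye, the check can be sketched as follows. This is a minimal illustration, not part of WMLz: the function name `find_wola_name` is hypothetical, and the regular expression assumes the message looks exactly like the sample above, with the three-part name following the text "using the following name:".

```python
import re

# Matches the registration name in a CWWKB0501I message. The name may sit on
# the same line as the message ID or on the next one, so the pattern spans
# line breaks (re.DOTALL) and tolerates any whitespace before the name.
_WOLA_NAME = re.compile(
    r"CWWKB0501I:.*?using the following name:\s+(\S+)\s+(\S+)\s+(\S+)",
    re.DOTALL,
)


def find_wola_name(log_text):
    """Return (daemon_group, node_name, server_name), or None if absent."""
    match = _WOLA_NAME.search(log_text)
    return match.groups() if match else None


# The sample message from this article, split over two lines as shown above.
sample_log = (
    "CWWKB0501I: The WebSphere Optimized Local Adapter channel registered "
    "with the Liberty profile server\n"
    "using the following name: ALN SCORING ALNWOLA\n"
)
print(find_wola_name(sample_log))
```

The three parts correspond to the daemongroup, node-name and server-name values that the COBOL program in Step 4 passes when it registers.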

Step 2: Import the AI model to WMLz and deploy to the scoring server

The WOLA scoring interface supports SparkML, PMML and deep learning ONNX models. You are free to train the models on any platform and then deploy them to WMLz for inferencing on z/OS. In this example, the model is a PMML model trained using XGBoost. As shown in the screenshots of the WMLz user interface here, this antifraud model is deployed to the scoring server named ALNWOLA, which was created in the previous step.

Anti-fraud PMML model deployed to the scoring server ALNWOLA

On the deployment details page, locate the deployment ID in the “Scoring endpoint” URL, and the copybooks for the model input and output in the “Schema” table. This information is used in the COBOL application to make the scoring requests.

Get the deployment details: deployment ID, input and output copybooks
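
The deployment ID has to be copied out of the scoring endpoint URL and later moved into a fixed-width COBOL field. The sketch below shows one way to do that safely. It assumes the ID is the UUID-shaped token in the URL, as in this article's example; the URL itself is a made-up placeholder (host, port and path are not the real WMLz layout), and both function names are hypothetical.

```python
import re

# A deployment ID in this article looks like a UUID, for example
# 4c2121cb-9ade-4d8f-8441-3f92455b171d.
_UUID = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def deployment_id_from_url(scoring_url):
    """Return the first UUID-shaped token in the URL, or raise ValueError."""
    match = _UUID.search(scoring_url)
    if match is None:
        raise ValueError("no deployment ID found in: " + scoring_url)
    return match.group(0)


def as_cobol_field(deployment_id, width=40):
    """Pad with trailing spaces, as a MOVE to a PIC X(40) field would."""
    if len(deployment_id) > width:
        raise ValueError("deployment ID is longer than the COBOL field")
    return deployment_id.ljust(width)


# Hypothetical URL: host, port and path are placeholders only.
url = "https://wmlz.example.com:1234/scoring/4c2121cb-9ade-4d8f-8441-3f92455b171d"
dep_id = deployment_id_from_url(url)
print(dep_id, len(as_cobol_field(dep_id)))
```

The width of 40 matches the DEPLOYMENT-ID PIC X(40) field used in the COBOL program in Step 4.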

Once the model is deployed, you can run a scoring test from the user interface which goes through the RESTful route. The input to the scoring RESTful interface is a JSON payload which carries the record(s) to be scored. The result of scoring is also returned in JSON format.

Test model scoring using the RESTful interface

Once the scoring test succeeds, you can move on to the WOLA interface in the COBOL program.

Step 3: Generate Java helper classes for model input and output

Each model has its own input and output schemas. The input record for inferencing and the scoring result are transferred between the COBOL program and the scoring server as byte arrays through the WOLA shared memory. The COBOL copybooks for the model input and output, together with the corresponding Java helper classes, let the caller and the receiver interpret those byte arrays in the same way. You can use the provided script to generate the Java helper classes for a given deployment on a specified scoring server.

cd $IML_INSTALL_DIR/bin
./gen_helper_class.sh <serverName> <deploymentID> <classPrefix>

The following example shows that Java helper classes FraudInWrapper and FraudOutWrapper are generated for the antifraud deployment which was created in Step 2.

-bash-4.3$ cd $IML_INSTALL_DIR/bin
-bash-4.3$ ./gen_helper_class.sh ALNWOLA 4c2121cb-9ade-4d8f-8441-3f92455b171d Fraud
Reading configuration file /home/mlzdev/imlhome/spark6/configuration/scoring.cfg.ALNWOLA.
Start checking if scoring service is alive...
Start generating copybooks for deployment...
Start generating adata from copybooks...
Start generating java helper class for model input...
Java class generated successfully.
Start java class compilation...
Java class compiled: /home/mlzdev/imlhome/spark6/usr/servers/ALNWOLA/lib/global/com/ibm/scoring/helper/FraudInWrapper.class.
Start generating java helper class for model output...
Java class generated successfully.
Start java class compilation...
Java class compiled: /home/mlzdev/imlhome/spark6/usr/servers/ALNWOLA/lib/global/com/ibm/scoring/helper/FraudOutWrapper.class.
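
Judging by the sample output above, the script appears to name the classes `<classPrefix>InWrapper` and `<classPrefix>OutWrapper` and to place them under `lib/global/com/ibm/scoring/helper` in the scoring server directory. The sketch below derives the expected names and paths from a prefix, which can help when filling in the INPUT-CLASS and OUTPUT-CLASS fields later. The naming rule is inferred from this one example rather than taken from product documentation, and the function names are hypothetical.

```python
from pathlib import PurePosixPath


def helper_class_names(class_prefix):
    """Class names that the sample output suggests the script generates."""
    return class_prefix + "InWrapper", class_prefix + "OutWrapper"


def helper_class_paths(server_dir, class_prefix):
    """Expected .class locations under the scoring server directory."""
    package_dir = PurePosixPath(
        server_dir, "lib", "global", "com", "ibm", "scoring", "helper"
    )
    return [
        package_dir / (name + ".class")
        for name in helper_class_names(class_prefix)
    ]


# Server directory and prefix taken from the sample run above.
server_dir = "/home/mlzdev/imlhome/spark6/usr/servers/ALNWOLA"
for path in helper_class_paths(server_dir, "Fraud"):
    print(path)
```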

In the next step, you will see how these Java classes are used in the COBOL program to enable the data interpretation between the IMS application and the WMLz scoring server.

Step 4: Update your COBOL program for scoring using the WOLA APIs

WOLA provides a set of native z/OS callable service APIs that your COBOL program can use to interact with the WMLz scoring server. Before diving into the program details, let’s look at the key programming flow. As the following flow chart shows, before making any scoring request you first call the BBOA1REG registration API to register with the scoring server. Upon registration, WOLA creates shared memory for data transfer and communication between the two parties. Registration only needs to be done once, for example when your IMS program is scheduled. Every transaction can then reuse the registration through the WOLA BBOA1INV invoke API to send an inferencing request to the WMLz scoring server and get the response back.

Scoring flow using the WOLA basic APIs
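
The register-once, invoke-many pattern can be illustrated with a small simulation. Nothing here calls WOLA: `FakeWolaConnection` is a hypothetical stand-in that only records which callable service each step would correspond to (BBOA1REG, BBOA1INV, and BBOA1URG for unregistration), and the 40-byte response is a placeholder rather than a real score.

```python
class FakeWolaConnection:
    """Stand-in for the WOLA callable services; it only records the calls."""

    def __init__(self):
        self.calls = []
        self.registered = False

    def register(self, daemon_group, node_name, server_name, register_name):
        # Corresponds to BBOA1REG: done once, before any scoring request.
        self.calls.append("BBOA1REG")
        self.registered = True

    def invoke(self, register_name, request_bytes):
        # Corresponds to BBOA1INV: reuses the existing registration.
        if not self.registered:
            raise RuntimeError("invoke called before register")
        self.calls.append("BBOA1INV")
        return b"\x00" * 40  # placeholder response, not a real score

    def unregister(self, register_name):
        # Corresponds to BBOA1URG: releases the registration.
        self.calls.append("BBOA1URG")
        self.registered = False


def process_transactions(conn, records):
    """Register once, score every record, then unregister."""
    conn.register("ALN", "SCORING", "ALNWOLA", "ALNWOLA")
    try:
        return [conn.invoke("ALNWOLA", record) for record in records]
    finally:
        conn.unregister("ALNWOLA")


conn = FakeWolaConnection()
responses = process_transactions(conn, [b"txn-1", b"txn-2", b"txn-3"])
print(conn.calls)
```

The point of the design is that the cost of registration is paid once, while each transaction pays only for its own invoke call.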

The following is a sample COBOL program which demonstrates the core logic of server registration, scoring invocation and unregistration.

IDENTIFICATION DIVISION.
PROGRAM-ID. CARSWOLA.
ENVIRONMENT DIVISION.
DATA DIVISION.
WORKING-STORAGE SECTION.
* API Parms
01 daemongroup PIC X(8) VALUE LOW-VALUES.
01 node-name PIC X(8).
01 server-name PIC X(8).
01 register-name PIC X(12) VALUE SPACES.
01 minconn PIC 9(8) COMP VALUE 1.
01 maxconn PIC 9(8) COMP VALUE 10.
01 regopts PIC 9(8) COMP VALUE 0.
01 urgopts PIC 9(8) COMP VALUE 0.
* Invoke Parms
01 service-name PIC X(255).
01 service-name-length PIC 9(8) COMP VALUE 0.
01 rqst-area PIC X(900) VALUE SPACES.
01 rqst-area-addr USAGE POINTER.
01 rqst-area-length PIC 9(8) COMP VALUE 100.
01 resp-area PIC X(900) VALUE SPACES.
01 resp-area-addr USAGE POINTER.
01 resp-area-length PIC 9(8) COMP VALUE 100.
01 wait-time PIC 9(8) USAGE BINARY VALUE 0.
01 rqst-type PIC 9(8) COMP VALUE 1.
01 rc PIC 9(8) COMP VALUE 0.
01 rsn PIC 9(8) COMP VALUE 0.
01 rv PIC 9(8) COMP VALUE 0.
* Antifraud model scoring input and output
01 CINPUT. (A)
03 DEPLOYMENT-ID PIC X(40).
03 INPUT-CLASS PIC X(64).
03 OUTPUT-CLASS PIC X(64).
03 FRAUDIN.
06 NF-DT-MONTH COMP-2 SYNC.
06 TNX-AMT-length PIC S9999 COMP-5 SYNC.
06 TNX-AMT PIC X(255).
06 NF-DT-HOUR COMP-2 SYNC.
06 PROCESS-CODE COMP-2 SYNC.
06 COUNTRY-CODE COMP-2 SYNC.
06 CUSTOMER-ID COMP-2 SYNC.
06 CARD-ID COMP-2 SYNC.
06 MCC COMP-2 SYNC.
06 SERVICE-CODE COMP-2 SYNC.
06 NF-DT-DAY COMP-2 SYNC.
06 CARD-MODE COMP-2 SYNC.
06 NF-DT-WEEKDAY COMP-2 SYNC.
06 CARD-TYPE-CODE COMP-2 SYNC.
06 NF-DT-DAYSINMONTH COMP-2 SYNC.
06 XRETURN-CODE COMP-2 SYNC.
06 TNX-CHANNEL COMP-2 SYNC.
06 EXPIRATION COMP-2 SYNC.
01 COUT. (B)
06 LP-1X0 COMP-2 SYNC.
06 LP-FRAUD COMP-2 SYNC.
06 LP-0X0 COMP-2 SYNC.
06 LC-FRAUD COMP-2 SYNC.
06 L-FRAUD COMP-2 SYNC.

* Procedures Section
PROCEDURE DIVISION.
MAINLINE SECTION.

* Specify the scoring server to register to
MOVE 'ALNWOLA' TO register-name. (C)
MOVE 'ALN' TO daemongroup.
MOVE 'SCORING' TO node-name.
MOVE 'ALNWOLA' TO server-name.

* Assign value to deployment ID and helper classes
MOVE '4c2121cb-9ade-4d8f-8441-3f92455b171d' TO DEPLOYMENT-ID.
MOVE 'FraudInWrapper' TO INPUT-CLASS.
MOVE 'FraudOutWrapper' TO OUTPUT-CLASS. (D)

* Assign record for scoring
MOVE 1 TO NF-DT-MONTH.
MOVE 5 TO TNX-AMT-length.
MOVE '11860' TO TNX-AMT.
MOVE 23 TO NF-DT-HOUR.
MOVE 1 TO PROCESS-CODE.
MOVE 10 TO COUNTRY-CODE.
MOVE 1100 TO CUSTOMER-ID.
MOVE 1100 TO CARD-ID.
MOVE 1 TO MCC.
MOVE 2 TO SERVICE-CODE.
MOVE 1 TO NF-DT-DAY.
MOVE 1 TO CARD-MODE.
MOVE 4 TO NF-DT-WEEKDAY.
MOVE 1 TO CARD-TYPE-CODE.
MOVE 10 TO NF-DT-DAYSINMONTH.
MOVE 0 TO XRETURN-CODE.
MOVE 1 TO TNX-CHANNEL.
MOVE 300 TO EXPIRATION.

* Assign scoring service name (E)
MOVE 'java:global/scoring-wola/WOLAHandler!com.ibm.ml.scoring
- '.online.service.WOLAHandler'
TO service-name.

* Register into Daemon Group using BBOA1REG API

INSPECT daemongroup CONVERTING ' ' to LOW-VALUES.

CALL 'BBOA1REG' USING (F)
daemongroup,
node-name,
server-name,
register-name,
minconn,
maxconn,
regopts,
rc,
rsn.

IF rc > 0 THEN
DISPLAY "Failed to register to:" daemongroup node-name
server-name
DISPLAY "OLA - BBOA1REG problem -- rc/rsn : " rc "/" rsn
GO TO Bad-RC
ELSE
DISPLAY "Successfully registered into "
daemongroup node-name server-name
END-IF.

* Invoke using BBOA1INV for scoring

MOVE LENGTH OF CINPUT TO rqst-area-length
MOVE LENGTH OF COUT TO resp-area-length
INSPECT service-name CONVERTING ' ' to LOW-VALUES

SET rqst-area-addr TO ADDRESS OF CINPUT
SET resp-area-addr TO ADDRESS OF COUT

CALL 'BBOA1INV' USING (G)
register-name,
rqst-type,
service-name,
service-name-length,
rqst-area-addr,
rqst-area-length,
resp-area-addr,
resp-area-length,
wait-time,
rc,
rsn,
rv

IF rc > 0 THEN
DISPLAY "OLA - BBOA1INV problem, rc/rsn: " rc "/" rsn
GO TO Bad-RC
ELSE
DISPLAY "Scoring result: "
DISPLAY "L-FRAUD: " L-FRAUD
END-IF.


* Unregister from Daemon Group using BBOA1URG API

CALL 'BBOA1URG' USING (H)
register-name,
urgopts,
rc,
rsn.

IF rc > 0 THEN
DISPLAY "OLA - BBOA1URG problem -- rc/rsn: " rc "/" rsn
GO TO Bad-RC
ELSE
DISPLAY "Successfully unregistered from " daemongroup
node-name server-name
END-IF.

GOBACK.

* Section used to exit batch if any API returned RC>0

Bad-RC.
DISPLAY "OLA - EXITING program due to non-RC=0."
GOBACK.

(A) The CINPUT data structure carries the input data to send to the scoring server through the WOLA interface, which includes: the deployment ID, the name of the input and output Java helper classes, and the record to be scored.

(B) The COUT data structure carries the output of scoring for a given input record.

(C) Specify the scoring server to register to through the WOLA interface. In the example it is the ALN SCORING ALNWOLA server used in the previous steps.

(D) Assign value to the input data area: the deployment ID of the model, the name of the Java helper classes which were created in Step 3, and the record to be scored.

(E) Specify the service to invoke for scoring, which is java:global/scoring-wola/WOLAHandler!com.ibm.ml.scoring.online.service.WOLAHandler

(F) Register to the scoring server using the BBOA1REG registration API.

(G) Call the BBOA1INV API to invoke scoring. The scoring result is returned in the COUT data area pointed to by resp-area-addr. The program then prints the value of the L-FRAUD field, which is the inferencing result indicating whether the transaction is fraudulent.

(H) Unregister from the scoring server using BBOA1URG API.
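
Because the request and response travel as raw bytes, the lengths passed to BBOA1INV follow directly from the layout of CINPUT (A) and COUT (B). The sketch below computes those lengths from the field definitions in the sample program. It assumes the usual Enterprise COBOL rules: COMP-2 occupies 8 bytes and SYNC aligns it on a doubleword, PIC S9999 COMP-5 occupies 2 bytes and SYNC aligns it on a halfword, and PIC X(n) occupies n bytes with no alignment. Treat the result as a cross-check against what LENGTH OF reports in your own compile listing, not as a substitute for it.

```python
def layout(fields):
    """Return ({name: offset}, total_length) for (name, size, align) items."""
    offsets = {}
    position = 0
    for name, size, align in fields:
        position += (-position) % align  # slack bytes inserted by SYNC
        offsets[name] = position
        position += size
    return offsets, position


def comp2(name):
    """COMP-2 SYNC: 8 bytes, aligned on an 8-byte boundary (assumed rule)."""
    return (name, 8, 8)


CINPUT = (
    [("DEPLOYMENT-ID", 40, 1), ("INPUT-CLASS", 64, 1), ("OUTPUT-CLASS", 64, 1),
     comp2("NF-DT-MONTH"), ("TNX-AMT-length", 2, 2), ("TNX-AMT", 255, 1)]
    + [comp2(name) for name in (
        "NF-DT-HOUR", "PROCESS-CODE", "COUNTRY-CODE", "CUSTOMER-ID", "CARD-ID",
        "MCC", "SERVICE-CODE", "NF-DT-DAY", "CARD-MODE", "NF-DT-WEEKDAY",
        "CARD-TYPE-CODE", "NF-DT-DAYSINMONTH", "XRETURN-CODE", "TNX-CHANNEL",
        "EXPIRATION")]
)
COUT = [comp2(name) for name in
        ("LP-1X0", "LP-FRAUD", "LP-0X0", "LC-FRAUD", "L-FRAUD")]

cin_offsets, cin_length = layout(CINPUT)
cout_offsets, cout_length = layout(COUT)
print("CINPUT length:", cin_length)
print("COUT length:", cout_length)
print("NF-DT-HOUR offset:", cin_offsets["NF-DT-HOUR"])
```

Under these assumptions the 255-byte TNX-AMT field ends on an odd boundary, so slack bytes are inserted before NF-DT-HOUR. That is one reason to let the program take the lengths with LENGTH OF, as the sample does, rather than adding up field sizes by hand.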

See Preparing a model for online scoring in COBOL program with WOLA for the detailed programming instructions, hints and tips.

When you run the COBOL program, the following messages are printed:

Successfully registered into ALN SCORING ALNWOLA
Scoring result:
L-FRAUD: 0
Successfully unregistered from ALN SCORING ALNWOLA

As the foundational AI solution on IBM Z, IBM Watson Machine Learning for z/OS continues to evolve to meet the growing need for AI on the platform. In summary, the new WOLA scoring interface lets you infuse real-time AI model inferencing into every IMS transaction with optimized performance and minimal application impact. Last but not least, the WOLA scoring interface is not limited to IMS applications: COBOL applications that run in batch jobs or in CICS Transaction Server can use it as well.
