Mainframe Batch — Automated Testing

Sujay Solomon
13 min read · Feb 19, 2020


Changes are good. Changes are essential. Changes keep the application relevant and compliant, improve usability, and bring more value to the end user.

Changes can also be bad. Changes may introduce program errors, system errors, performance issues, or errors in the output.

Fear not: modern development best practices excel at minimizing the risk of changes and maximizing the ability to deploy changes often — ya know, DevOps and such. It’s just a matter of taking these practices and adapting them to the z/OS batch world.

Learn the basics of z/OS batch applications — read my blog on Mainframe Batch 101 — Concepts & Why it Matters.

As mentioned in my previous blog on Mainframe Batch 101, there are plenty of opportunities to streamline, automate, and reduce risk associated with batch application changes. One of them is to automate testing of the batch application at various levels. Let’s explore.

Survey

I ran a survey on z/OS batch applications on SurveyMonkey, promoted through social media and various mainframe watering holes like the Open Mainframe Project Slack, the CA Brightside Slack, and the mainframe subreddit.

This blog entry and survey are part of a technical session I’m delivering at the SHARE Fort Worth event.

Frequency of Changes

The majority of z/OS batch applications undergo changes in production once a month.

Time Spent Testing

By calculating the mean response, we can deduce that a single change requires around 25 person-hours of testing before the change can make it into production.

Quality

The QA testing stage frequently catches issues that send the change request back to the development stage. It’s also interesting to note that a substantial number of respondents do not have a dedicated QA team. This may be good or bad. If the development team is agile and takes ownership of quality as part of their delivery, then it’s good. On the other hand, if there is no dedicated QA team because quality is not prioritized, that can be bad.

Batch production issues caused by change requests are quite frequent. This suggests that the techniques employed in the QA stage are not sufficient.

Types of Errors

It was expected that application program errors would be the most likely culprits after change request deployments. However, it’s interesting to see that JCL errors are ranked #2 — this is something that could easily be avoided if some sanity testing were done in production as part of deployment.

Why Automate?

One might argue that the testing of batch applications is already automated to some extent. In many cases, test-job-sequences are already defined in a batch scheduler and run automatically. While this may submit the jobs automatically and generate output, it still requires the tester to manually find the output, correlate it with specific test scenarios, and assert that the resulting data matches expected outcomes. It’s these manual tasks that are prime candidates for further automation.

In addition to the time saved and reduction in human error by automating these tests, other known benefits of automated testing include cost savings, earlier detection of bugs, thoroughness, and even information security.

My previous blog on Mainframe Batch 101 illustrates the lifecycle of an application change and various teams involved in the entire process. These stages in the lifecycle can easily be mapped to established levels of testing and can be automated.

Organizing your Automated Tests

My zos-batch-testing repository on GitHub provides a framework for a hierarchical set of automated tests for z/OS batch applications.

  • zosBatchApplication folder — contains all the JCL for each job that is part of the batch application
  • __tests__ folder — contains subfolders for the different levels of tests:
      • batchUnitTests — contains unit tests
      • batchComponentTests — contains component tests
      • batchSystemFunctionalTests — contains system tests focused on functional accuracy
      • batchSystemPerformanceTests — contains system tests focused on performance

I’m going to be using Jest as my testing framework in all my samples in this blog. See Building Mainframe Metal C and Testing with Jest and Zowe CLI by Dan Kelosky for guidance on setting up Jest for your tests.
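
Dan’s post walks through the full setup. As a rough idea, the bare minimum (assuming Node.js is installed) is Jest as a dev dependency and a test script in your package.json; the version below is purely illustrative:

{
  "devDependencies": {
    "jest": "^25.1.0"
  },
  "scripts": {
    "test": "jest"
  }
}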

I’ll be sharing sample tests from these different test levels in the following sections. Keep in mind that these are samples that do trivial assertions — the tests could certainly handle more complex assertions. The intent is to give you an idea of how to organize and write your tests. Feel free to clone the git project and use it as a starting point.

Development Stage — Unit Testing

Unit testing aims to test the smallest testable unit in code. For the purposes of batch application testing, we treat a callable program as the smallest unit.

Unit tests are great for the development stage as developers can run the specific unit tests that are relevant to the changes they’re making. This should keep the length of the test runs short and still provide some feedback to the developer about their planned changes.

Ideally, you should be able to run unit tests on your developer workstation without requiring platform specific infrastructure. Unfortunately, this is not possible with current technology as both unit testing options below require tests to run on z/OS. The only way to run these tests on a workstation would be for a developer to use IBM’s zD&T to emulate z/OS on their workstation and use it as their unit testing target environment. While this may sound attractive, z/OS emulation on zD&T is very resource-intensive and most developer workstations just don’t have enough juice to be able to run them with decent performance. I hope this changes in the future and that developers can have the freedom and independence to unit test entirely on their workstations without requiring z/OS on real Z metal.

Option 1 — language-specific unit testing

Every program has a corresponding unit test

This approach involves creating unit tests for programs that are written in either COBOL or PL/I. Tools such as IBM’s Z Open Unit Test (aka zUnit) allow generation of test cases for a particular COBOL program by extracting test data from a test-run of the application as a model. The user can then modify the expected outcome manually if the extracted test data is not appropriate for their desired unit test logic. Z Open Unit Test currently requires IBM’s Eclipse IDE called IDz to generate the tests. Once generated, these tests can be run from terminals in open source IDEs like Visual Studio Code using Zowe CLI:

zowe zos-uss issue ssh 'zunit -c=\"//\'dsn.pds(test)\'\" -r=/uss/location/test.junit -v -x=/uss/location/zunit/xsl/test.xsl' --cwd /uss/location/workingdirectory

This command can be embedded into task runners such as npm scripts, gradle or gulp for easier IDE access or called remotely from continuous integration (CI) tools like Jenkins.

This method is useful to gain confidence in all callable programs that are part of your batch application. You can even generate code coverage statistics that tell you what percentage of your application code base is being tested by the generated test cases. This can be added to your quality gates as a requirement before code can be promoted to further application lifecycle stages. However, this type of language-specific unit testing does come with some limitations for COBOL and PL/I.

Option 2 — language-agnostic unit testing

Each step-level program has a corresponding unit test

An alternative to language-specific unit testing is to create language-agnostic tests that focus on the inputs and outputs of the entry points of a step-level callable program. This type of automated testing emulates the way developers unit test their code changes manually today. Since we are not testing every callable program individually, but rather focusing only on step-level callable programs, this method is not as granular as option 1, but it may offer a practical and realistic way of building up more unit testing automation.

The advantage of such a method is that it eliminates limitations caused by language-specific unit testing and enables more flexibility in test logic. It also allows the test developer to focus on the externals of the program (inputs/outputs) rather than being bogged down by the internals of program-to-program logic flow.

A disadvantage of this method is that it does not automatically generate code coverage statistics or generate test cases from model test data. It would still be possible to approximate code coverage by taking inventory of the callable programs and tracking which of them you’ve developed an exhaustive set of unit tests for. This method may also take longer to run than a true unit test as described in option 1.

Here’s a proposed unit test scenario:

  1. Create or re-use existing job scripts that invoke each externalized program with a one-to-one relationship between a job script and a step-level program.
  2. Create or re-use existing static test data for each unit test.
  3. Choose a client-side testing framework of your choice like MochaJS or Jest. If you prefer Python, you could choose a framework like Robot.
  4. Implement a test for every step-level program in the batch application. Zowe CLI can be used as the interface between the client-side test frameworks and z/OS.
  5. The tests should submit the job directly to JES and be able to use the modified program code to obtain test results.
  6. Compare the obtained results with the expected results to determine if the individual tests passed or failed.

Sample unit test for a step level program extracted into a single job
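
To make this concrete, here’s a minimal Jest sketch of such a test, assuming Zowe CLI is installed with a z/OSMF profile configured; the file name, data set, member, and expected strings are placeholders you would replace with your own:

// __tests__/batchUnitTests/paycalc.test.js (hypothetical file and job names)
const { execSync } = require("child_process");

describe("PAYCALC step-level program", () => {
  it("produces the expected output for the static test data", () => {
    // Submit the single-step job that drives the program; --view-all-spool-content
    // waits for the job to finish and returns all of its spool output.
    const spool = execSync(
      'zowe zos-jobs submit data-set "MY.TEST.JCL(PAYCALC)" --view-all-spool-content',
      { encoding: "utf8" }
    );

    // Placeholder assertions - adapt these to your program's real output.
    expect(spool).toContain("COND CODE 0000");
    expect(spool).toContain("PAYROLL TOTAL:");
  }, 300000); // allow up to 5 minutes for the job to run
});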

Developers usually make a code change and spend 15 minutes manually testing it. Now imagine doing that 10 times a day for 10 days after every minor code change. That’s around 25 hours in 2 weeks spent manually unit testing. Whether you choose option 1 or option 2, automating such actions frees up the developer to kick off the test and go back to coding. Once the automated test completes, they can use the results as a sanity check to see if their desired change is functional. Sure, there’s the initial overhead of creating these tests, but it’s an accepted trade-off in software development to invest once and reap the rewards forever. Nothing revolutionary here, just adapting modern software development principles to the world of z/OS batch applications.

Sample Unit Test Run using the language-agnostic Option 2

QA Stage — Component Testing

This next level of testing involves testing each batch job that’s part of the batch application. These are the batch jobs that are individually scheduled into the batch scheduler to string together the logical batch application. Each batch job may have multiple job steps that call multiple programs.

Proposed component testing scenario:

  1. Establish a repository for the JCL in the job scripts — preferably in source control management tools like CA Endevor or Git. Ensure that any JCL changes are staged through this agreed-upon repository.
  2. Choose a client-side testing framework of your choice like MochaJS or Jest.
  3. Create or re-use static test data for each component test.
  4. Implement a test for every batch job that is part of the batch application. Zowe CLI can be used as the interface between the client-side test frameworks and z/OS.
  5. The tests should submit the job directly to JES and be able to use the updated program code to obtain the test results.
  6. Compare the obtained results with the expected results to determine if the individual test passed or failed. Use the regression testing documentation as a guide to determine what can be tested in addition to the obvious validity of the job output.

Sample component test for a multi-step job that’s part of a series of jobs in a batch app
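
Here’s a hedged Jest sketch of a component test for one multi-step job, again assuming Zowe CLI with a z/OSMF profile; the job member, output data set, and expected strings are placeholders:

// __tests__/batchComponentTests/dailyset.test.js (hypothetical file and job names)
const fs = require("fs");
const { execSync } = require("child_process");

describe("DAILYSET multi-step job", () => {
  it("runs all steps cleanly and writes the expected output data set", () => {
    // Submit the multi-step job and wait for all of its spool output.
    const spool = execSync(
      'zowe zos-jobs submit data-set "MY.TEST.JCL(DAILYSET)" --view-all-spool-content',
      { encoding: "utf8" }
    );

    // No step should abend or hit a JCL error (placeholder checks).
    expect(spool).not.toContain("ABEND");
    expect(spool).not.toContain("JCL ERROR");

    // Pull back a data set the job wrote and verify its content.
    execSync(
      'zowe zos-files download data-set "MY.TEST.OUTPUT.DAILYSET" --file dailyset.output.txt',
      { encoding: "utf8" }
    );
    const output = fs.readFileSync("dailyset.output.txt", "utf8");
    expect(output).toContain("SETTLEMENT COMPLETE"); // placeholder assertion
  }, 600000); // allow up to 10 minutes
});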

Component-level testing need not be run every time a developer makes a code change, but it is useful to run these tests as part of the hand-off from development to QA. The automated component tests greatly lighten the load on the QA stage and allow more time to be spent building automated tests to increase coverage. In addition to writing automated tests, the QA team can also focus on manually testing corner cases that may be specific to each change request. It takes human creativity to think of these corner cases, and investing time into testing them can prevent batch disasters.

Pre-Prod Stage — System Testing

System testing elevates testing to the holistic application level. Unlike component testing, we want to test the entire application in an environment as close to production as possible. In addition to functional testing, system testing also covers non-functional requirements like performance, memory or disk usage, etc.

Proposed system testing scenario for performance requirements:

  1. Gain access to a static environment with access to the job scheduler and the content management tool that’s used in production.
  2. Choose a performance testing framework like JMeter.
  3. Implement performance tests that track resource usage as the jobs are being executed on the system. Store the results each time as benchmark data so that they can be used in future tests to determine whether thresholds are being exceeded. Zowe CLI can be used as the interface between the performance management frameworks and the performance metric tools like CA SYSVIEW on z/OS.
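
A dedicated performance framework gives you much richer metrics, but as a simplified sketch of the idea, a Jest test can at least track elapsed wall-clock time and compare it against a stored benchmark. This sketch submits the job directly rather than through the scheduler, and the job member and thresholds are placeholders:

// __tests__/batchSystemPerformanceTests/nightly.test.js (hypothetical)
const { execSync } = require("child_process");

// Benchmark captured from earlier runs; in practice you would persist this
// and refresh it as the application evolves.
const BENCHMARK_MS = 15 * 60 * 1000; // 15 minutes (placeholder)
const TOLERANCE = 1.2; // allow 20% over the benchmark

describe("Nightly cycle elapsed time", () => {
  it("completes within the benchmark threshold", () => {
    const start = Date.now();
    execSync(
      'zowe zos-jobs submit data-set "MY.TEST.JCL(NIGHTLY)" --view-all-spool-content',
      { encoding: "utf8" }
    );
    const elapsed = Date.now() - start;

    // Wall-clock time includes queue time; real CPU and I/O metrics would come
    // from tools like CA SYSVIEW or a framework like JMeter as described above.
    expect(elapsed).toBeLessThan(BENCHMARK_MS * TOLERANCE);
  }, 30 * 60 * 1000); // test timeout: 30 minutes
});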

Here’s a webinar recording on using the commercial tool Blazemeter (based on JMeter) to create mainframe performance tests.

Proposed system testing scenario for functional requirements:

  1. Gain access to a static environment with access to the job scheduler and the content management tool that’s used in production.
  2. Choose a client-side testing framework of your choice like MochaJS, Jest, or Robot.
  3. Create a script to ensure that the scheduling tool is using the latest JCL from the established repository.
  4. Create or re-use static test data for the system test.
  5. Implement a test that executes the sequence of jobs through the job scheduler, waits for them to complete, and interfaces with the content management tool to acquire the output. Zowe CLI can be used as the interface between the client-side test frameworks and the services on z/OS.
  6. Compare the obtained results with the expected results to determine if the system test passed or failed. Use the regression testing documentation as a guide to determine what can be tested in addition to the obvious validity of the job output.

Sample functional test that retrieves a report from CA View using the Zowe CLI plugin for CA View

In this example, I’m running the batch application and using the CA View Zowe CLI plugin to download a report from z/OS to a file and then assess the content of the report for accuracy.
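
The download itself uses the CA View plugin (see the plugin’s documentation for the exact command syntax); once the report is sitting in a local file, the assertion half of such a test is plain Jest. A sketch, with the file path and expected strings as placeholders:

// __tests__/batchSystemFunctionalTests/endOfDayReport.test.js (hypothetical)
const fs = require("fs");

describe("End-of-day report from CA View", () => {
  it("contains the expected totals for the static test data", () => {
    // Assumes an earlier step (or an npm pretest script) has already used the
    // CA View Zowe CLI plugin to download the report to this local path.
    const report = fs.readFileSync("output/endOfDayReport.txt", "utf8");

    // Placeholder assertions; replace with the checks that matter for your report.
    expect(report).toContain("END OF DAY PROCESSING COMPLETE");
    expect(report).toMatch(/TOTAL TRANSACTIONS:\s+\d+/);
  });
});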

Sample run of the functional test that triggers a batch scheduler event and asserts output from the output manager CA View

System testing can give an unprecedented level of confidence before deploying the change request to production. Ultimately, achieving automated testing at this level will speed up deployments and allow innovation to prosper. It also greatly reduces the likelihood of performance degradation and batch failures, helping ensure that SLAs are met or exceeded!

Production Stage — Sanity Testing

Before deploying to production, it’s a good idea to ensure that the environment is able to support the changes.

For batch applications, this can be achieved by testing the JCL for each job in the application through JCL validation tools like CA JCLCheck.

Proposed sanity testing scenario for environmental requirements:

  1. Gain access to the production environment with the JCL for the jobs.
  2. Choose a task runner like npm scripts, gradle, or gulp.
  3. Create tasks to validate JCL for each job.
  4. Create a task to consolidate all the job JCL validation into one task.
  5. Run the task against the production system to ensure that the scans all pass without issue.

Here’s an example using CA JCLCheck’s Zowe CLI plugin (beta version) — if you’re using npm scripts as your task runner, insert the following line as part of your “scripts” in the package.json file:

"test:production:sanity": "zowe jclcheck check lf \"zosBatchApplication/badjcl.jcl\""

This will cause the “test:production:sanity” task to appear in the NPM Scripts panel in VS Code, which you can click to trigger the sanity test.
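
If you would rather fold the JCL validation into the same Jest suite as the other test levels (and consolidate the checks for every job, per step 4 above), a hedged sketch might look like this. It reuses the same zowe jclcheck command, assumes the plugin is installed, and assumes each job’s JCL sits in the zosBatchApplication folder; the test folder name is hypothetical:

// __tests__/batchSanityTests/jclcheck.test.js (hypothetical location)
const fs = require("fs");
const path = require("path");
const { execSync } = require("child_process");

const jclDir = "zosBatchApplication";
const jclFiles = fs.readdirSync(jclDir).filter((f) => f.endsWith(".jcl"));

describe("Production sanity - JCL validation", () => {
  jclFiles.forEach((file) => {
    it(`validates ${file} cleanly`, () => {
      // execSync throws if the command exits non-zero, which fails the test.
      // Depending on how the plugin reports JCL errors, you may also want to
      // inspect its output for error messages.
      execSync(`zowe jclcheck check lf "${path.join(jclDir, file)}"`, {
        encoding: "utf8",
      });
    });
  });
});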

Sanity testing your batch applications will allow you to detect environmental issues like lack of access to a new dataset or volume. Imagine running two jobs in production that take 5 hrs to complete overnight and then finding out that the third job doesn’t have access to a dataset. This can cause chaos with production operations the next day.

Test Reports

For each of the above test levels, Jest can produce formatted HTML reports that consolidate the results into a single, easy-to-review file using the reporting package jest-stare. These reports can be stored in a test result repository for audit and attestation purposes.
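
One common way to wire jest-stare in (check its README for the full set of options) is through Jest’s testResultsProcessor setting in package.json:

{
  "jest": {
    "testResultsProcessor": "jest-stare"
  }
}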

Sample jest-stare report

Summary

Batch applications on z/OS are mission critical in many of today’s industries and continue to undergo frequent changes. Any type of outage could have severe impacts on the business operations of a company.

Most respondents said there may be interest in automated testing of batch applications at their companies. However, they’re not sure if it would be prioritized. This may be due to a lack of perceived value or a lack of knowledge on how to automate the testing. I hope this article helps move more z/OS batch users to the green “high priority” group.

The automated testing strategies described in this article can greatly increase developer productivity, reduce risk, and increase deployment frequency. There are varying levels of testing that can be automated, bringing benefits to different parts of the application lifecycle. Automated testing for z/OS batch is a major step in advancing your company’s efforts towards adopting DevOps, and it can be done with modern open source tooling like Mocha, Jest, and Zowe.
