Two Innovations for the Zowe API Mediation Layer from Broadcom

Elliot Jalley
Published in Modern Mainframe
8 min read · May 18, 2021

Zowe API Mediation Layer enables secure access to mainframe services from off-platform for automation, modernization or DevOps use cases, providing features like load balancing, token-based security, single sign-on, support for multi-factor authentication and digital certificate support for Zowe-conformant APIs. These security features have obvious benefits, but how can the API Mediation Layer help with other key non-functional requirements like availability or auditability?

This week, Broadcom released two new features for CA Brightside subscribers to address these requirements:

  • API State Monitor for Zowe API Mediation Layer
  • API Audit Log for Zowe API Mediation Layer

These extensions offer different ways to monitor Zowe-conformant APIs running through the Zowe API Mediation Layer. Both, however, let a system administrator react quickly to service degradation or outright failure.

Let’s take a closer look at each one.

Use the API State Monitor to minimize downtime of Zowe-conformant API services

The API State Monitor for Zowe API Mediation Layer monitors runtime state changes of the API services accessible from the Zowe ecosystem and reports those changes to the System Log (SYSLOG) as Write To Operator (WTO) messages. Specifically, the API State Monitor reports when services register with or de-register from the API Mediation Layer Discovery Service (based on Eureka). It also reports on services that shut down ungracefully or are simply inaccessible. The API State Monitor is concerned with the availability and accessibility of microservices, rather than with whether a particular process, job, or program is running.
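To illustrate the idea of reporting state changes rather than raw state, here is a minimal Python sketch that compares two snapshots of a service registry and lists what changed. It is not the product's implementation: the function name `diff_states`, the snapshot shape (service name mapped to a status string) and the wording of the output lines are all assumptions made for this example, and the real WTO message format is not reproduced here.

```python
from typing import Dict, List


def diff_states(previous: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Describe the differences between two registry snapshots.

    Both arguments map a service name to a status string such as "UP" or
    "DOWN". The snapshot shape and the output wording are hypothetical.
    """
    changes = []
    for service in sorted(set(previous) | set(current)):
        before = previous.get(service)
        after = current.get(service)
        if before == after:
            continue  # nothing to report for this service
        if before is None:
            changes.append(f"{service} REGISTERED state={after}")
        elif after is None:
            changes.append(f"{service} DEREGISTERED last_state={before}")
        else:
            changes.append(f"{service} CHANGED {before}->{after}")
    return changes
```

Only services whose status differs between the two snapshots produce a line, which is what keeps a log of this kind readable for an operator or an automation rule.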

It helps system operators minimize downtime of Zowe-conformant API services running through the API Mediation Layer. WTO messages that the API State Monitor writes to the SYSLOG can be picked up by standard mainframe automation tools, such as CA OPS/MVS® Event Management & Automation. With automation in place, operators can act immediately to bring a failed service back up.

WTO message example

In this way, the API State Monitor supports high availability of Zowe-conformant API services and minimizes workflow disruptions, helping teams stay productive.

Keeping the team productive

Let’s consider a scenario where there is no API State Monitor Service in place.

Imagine a modern application developer named Michelle, using Visual Studio (VS) Code or Eclipse Che to connect to the mainframe and use services that provide her with access to the resources she needs to carry out her work. That might include accessing source code (using CA Endevor REST services), debugging capabilities (using CA InterTest services), access to data sets, jobs and job output (using z/OSMF services) and so on. These services are her lifeline to those mainframe resources.

One typical day, Michelle is working in VS Code. She is waiting on a response that is taking a long time. Then, instead of getting her source code, she gets an error message. She retries multiple times without success and even her colleagues can’t help. She realises there’s a genuine problem and opens a ticket. Everything stops. Maybe she can fill her time with other work, but she can’t execute her work plan. She’s frustrated and unproductive. As part of an agile development team, working together on the same enhancement, she is not the only one in this predicament.

Now imagine that the API State Monitor extension is installed.

Tyler, a system administrator responsible for making services available to Michelle and her team, has the API State Monitor running as part of his API Mediation Layer instance. The API State Monitor performs ongoing health checks on all the services running through the API Mediation Layer Gateway.

After realising that she is not getting a response from her Endevor instance, Michelle checks with her colleagues that they are seeing the same issue. Once she gets their confirmation she reports it to Tyler. He looks through the SYSLOG and immediately recognizes, via a WTO message, that the Endevor REST API has FAILED and is DOWN. He can see from the timestamp in the message when this happened. In fact, given that Endevor is a business-critical API, Tyler had configured a high level of monitoring for it. So, when Endevor failed to respond to a health check, a WTO message was generated within 5 seconds. Tyler restarts the Endevor service and is relieved to see a new WTO message in the SYSLOG indicating the Service is back UP, OK, and being monitored once again by the API State Monitor. He sends out a group notification to say that the service is restored.
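To make the timing in this scenario concrete, here is a minimal Python sketch of interval-based health checking. It assumes each service has its own check interval (shorter for critical services) and that a message is produced only when the observed state changes. `MonitoredService`, `run_check` and the `probe` callback are hypothetical names for this illustration and do not reflect the extension's actual configuration or message format.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MonitoredService:
    name: str
    interval_seconds: float  # assumed knob: shorter for critical services
    state: str = "UP"
    next_check: float = 0.0


def run_check(service: MonitoredService, now: float,
              probe: Callable[[str], bool]) -> Optional[str]:
    """Probe the service if a check is due; return a message on state change.

    `probe` stands in for a real health-check call and returns True when
    the service responds. Returns None when no check is due or when the
    state is unchanged.
    """
    if now < service.next_check:
        return None
    service.next_check = now + service.interval_seconds
    new_state = "UP" if probe(service.name) else "DOWN"
    if new_state == service.state:
        return None
    service.state = new_state
    return f"t={now:.0f}s {service.name} is {new_state}"
```

With a 5-second interval, a failure is noticed at the next scheduled check, so the detection delay in this sketch is bounded by roughly one interval.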

In fact, with automation in place, via a product like CA OPS/MVS® Event Management & Automation, the failed service could be recovered automatically without any action from Tyler or Michelle.

So, the API State Monitor minimizes delays, reduces lost productivity and eases the frustration Michelle feels when the REST APIs she depends on fail. Keeping that lifeline to mainframe resources up and available supports a successful modern development experience.

WTO message structure

Use the API Audit Log to turn service access data into insights

The API Audit Log for the Zowe API Mediation Layer is a lightweight yet powerful monitoring extension that collects data on how services are used via the Zowe API Mediation Layer. The extension gathers data from various sources into a centralized log, which is then shipped to external tools for indexing and processing. This enables the creation of dashboards that visualize details and events on Zowe-conformant API activity. It helps Zowe administrators, service owners and internal auditors, all of whom have an interest in seeing both current and historical data on Zowe-conformant API services.

Specifically, it collects two types of events: Login and Routing. Login events include data like the username used for login or the time of the login. Routing events include data such as the service instance called and the user IP address.
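As a rough picture of what the two event types could look like in a centralized log, the Python sketch below models them as records and serializes each one as a single JSON line, a shape that log shippers and indexers commonly accept. The class names, field names (`username`, `service_instance`, `client_ip`, `timestamp`, `event_type`) and the JSON-lines format are assumptions for illustration; the extension's real schema is not shown in this article.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class LoginEvent:
    username: str   # user name used for the login
    timestamp: str  # time of the login, here as an ISO 8601 string
    event_type: str = "login"


@dataclass
class RoutingEvent:
    service_instance: str  # service instance that was called
    client_ip: str         # IP address of the caller
    timestamp: str         # assumed field, not named in the article
    event_type: str = "routing"


def to_log_line(event) -> str:
    """Serialize an event as one JSON object on a single line."""
    return json.dumps(asdict(event), sort_keys=True)
```

For example, `to_log_line(LoginEvent("APIMTST", "2021-05-18T09:00:00Z"))` yields one JSON object that a downstream tool can index by `event_type`, `username` and `timestamp`.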

The extension was optimized for use with the Elastic (ELK) stack as the means for displaying the data. As such, it includes dashboard templates and documentation for visualizing with a new or existing ELK stack. However, ELK is not a prerequisite and the data provided by the extension can be visualized in other utilities or even in existing custom reports.

Analysing unusual activity

Let’s first consider a scenario where there is no API Audit Log extension installed.

Shelley, an internal auditor, is called to look into suspicious activity involving services running via the Zowe API Mediation Layer. She has to quickly extract data out of Zowe API Mediation Layer logs as part of an ad-hoc incident investigation. Shelley grew up in the distributed world, working on Windows, UNIX and Linux devices and knows very little about the mainframe. In order to get to the bottom of the issue she has to go to the service owners for the raw data. There is a lot of back and forth before she has the data set she needs. Further analysis of the raw data is needed to make sense of what happened. Sometimes, days pass before she has the answers. Luckily, so far, this hasn’t resulted in any major damage to the organization.

Now consider this same scenario once Tyler has the API Audit Log extension running as part of his Zowe API Mediation Layer instance.

Shelley opens up Kibana and right away she has access to active dashboards, pulling together charts, maps and filters, giving her an immediate picture of recent events on the Zowe API Mediation Layer. She drills down and quickly identifies an unusually high number of requests coming from user APIMTST.

Kibana chart example

Without needing to contact service owners or Tyler, she has her answers within minutes by exploring the underlying data with a few clicks. She exports the relevant chart as a JPEG, which she then includes in a report for Tyler, service owners and other stakeholders, providing insight into the root causes of the incident.

A few short hours after being tasked, Shelley has completed her investigation without any involvement from Tyler or development teams and is able to share the outcome in the form of a clear, concise, visual report.
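The drill-down in this scenario boils down to counting requests per user and spotting the outlier. The Python sketch below shows one simple way to do that outside of a dashboard, assuming each routing event is a dictionary with a `user` key (an assumption, since the article only names the service instance and IP address as routing fields). The function name `unusual_users` and the "more than `factor` times the median" rule are illustrative choices, not part of the extension.

```python
from collections import Counter
from statistics import median
from typing import Dict, Iterable, List


def unusual_users(events: Iterable[Dict[str, str]], factor: float = 5.0) -> List[str]:
    """Return users whose request count exceeds `factor` times the median.

    Each event is assumed to be a dict with a "user" key. The median is
    used as the baseline so that one very busy user does not distort it.
    """
    counts = Counter(event["user"] for event in events)
    if not counts:
        return []
    typical = median(counts.values())
    return sorted(user for user, n in counts.items() if n > factor * typical)
```

A dashboard filter does the same job interactively; the value of the centralized log is that either approach works on the same data.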

Maintain service availability

Tyler is responsible for configuring and maintaining the Zowe API Mediation Layer. He needs to make sure services don’t go down and, if they do, quickly troubleshoot the issue and restore failed services immediately.

Consider Tyler’s experience without the API Audit Log installed.

Tyler has multiple APIs running through the Zowe API Mediation Layer. However, only a handful of those are business critical, causing potential financial and reputational damage if down for any significant time. Tyler currently has no way to easily monitor the ongoing health of the APIs. One moment an API is up and the next moment it is down without any forewarning. If the API is business critical, he will quickly hear from a service owner relying on the service that something is wrong and needs to be fixed fast. He can get it back up quickly but by then it’s often too late and the damage is done.

Now consider Tyler’s position with the API Audit Log extension running as part of his Zowe API Mediation Layer instance.

Tyler gets into the office and, as he drinks his morning tea, peruses the dashboards that came pre-configured with the API Audit Log. He doesn’t see any critical issues, which is no surprise given that Kibana would already have sent alerts to his email and Slack channels if anything unexpected had happened. He does notice that one of his high-priority APIs, monitored via its own dashboard, has experienced a slight slowdown in response time.

Kibana chart example

It’s nothing critical yet but the trend indicates some degradation on the part of the API. He fires off a link to the data via Slack to the service owner. The service owner thanks Tyler for bringing this to her attention and asks the team to investigate and fix any potential performance issue. In the meantime, she asks Tyler to message her in case the performance moves beyond a certain threshold so they can raise the priority if necessary.
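The distinction Tyler draws here, between a worsening trend and a breached threshold, can be expressed in a few lines. The Python sketch below compares the average of the most recent response times with the average of the window before it, and separately checks an agreed threshold. The function name `check_latency`, the window size and the returned labels are hypothetical and are not tied to Kibana's alerting features.

```python
from statistics import mean
from typing import Sequence


def check_latency(samples_ms: Sequence[float], threshold_ms: float,
                  window: int = 3) -> str:
    """Classify recent response times as ok, degrading, or alert.

    Compares the mean of the last `window` samples with the mean of the
    `window` samples before them. Labels and window size are assumptions.
    """
    if len(samples_ms) < 2 * window:
        return "insufficient data"
    recent = mean(samples_ms[-window:])
    earlier = mean(samples_ms[-2 * window:-window])
    if recent > threshold_ms:
        return "alert"      # past the threshold agreed with the service owner
    if recent > earlier:
        return "degrading"  # slower than before, but still under the threshold
    return "ok"
```

In the scenario above, the "degrading" case is what Tyler spots on the dashboard, while the "alert" case corresponds to the threshold the service owner asks him to watch.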

Where can I find the extensions?

These extensions are available right now as an SMP/E install to all CA Brightside customers. You’ll find them on the Broadcom support portal alongside the latest full Zowe distribution.

In addition to these two powerful new extensions, CA Brightside also provides enterprise-grade, 24 x 7 support for the Zowe LTS release. With CA Brightside you have access to streamlined, tested software distributions, IP legal assurance, and support for all of the extensions included in the Code4z mainframe developer code pack available at the Visual Studio Code marketplace. Finally, the growing number of Zowe-conformant Broadcom product plug-ins include comprehensive support as part of their “parent” product license, meaning no additional license is required.

For more details, product and contact information, visit the CA Brightside web page.

For more information on Zowe, visit zowe.org

Learn about it! Read a blog, visit medium.com/zowe

Talk about it! Join our Slack Channel, visit OMP Zowe Slack Channel

Elliot Jalley
Modern Mainframe

Product Manager at the Broadcom Mainframe R&D Centre in Prague. Modernizing the way we work with z/OS.