Using Dynatrace to shed light on batch job performance

Darren Neimke
5 min read · Jul 7, 2024


Shedding light on batch jobs

Batch and background processing jobs are the unsung heroes of many businesses, silently handling critical tasks like file transfers, data processing, report generation, and routine system maintenance. Their importance cannot be overstated: they keep the lights on. However, because these jobs often run unseen, issues can lurk in the shadows, surfacing only as problems that demand reactive maintenance work.

Benefits of monitoring batch jobs

Real-Time Performance Monitoring: Get live insights into job execution times, resource utilization, and potential bottlenecks. Identify slow-running jobs before they disrupt critical processes.

Root Cause Analysis: Use data from logs, traces, and metrics to help pinpoint the root cause of job failures. This will save you valuable troubleshooting time and make your operations more efficient.

Proactive Alerts: Set up custom alerts to be notified of job failures or performance deviations early, so you can act before they disrupt downstream processes.

Bringing Batch Jobs into the Light: A Step-by-Step Guide

Now that you understand the benefits of monitoring, let’s explore the steps to ensure your jobs are monitored effectively.

Step 1: Instrumenting code

We need our batch jobs to emit logs so that we can track the following aspects of health and performance:

  • Whether a job started successfully, completed as expected, or encountered errors.
  • The progress of a job, highlighting individual steps and their completion times.
  • Detailed error messages that pinpoint the exact cause of any failures within the job.
  • Custom data points that provide valuable context about specific job runs or the information processed.

Structure your log messages so that each one carries the context you’ll need later. Consider including information such as:

  • Timestamp — When the log message was generated.
  • Batch Run Id — A unique identifier for the current job run.
  • Step Name (Optional) — If your job has distinct steps, include the name of the current step.
  • Log Message — A clear and concise description of the logged event.
  • Data Points — Custom attributes embedded in the log content to enrich the log data.

The following sample log entries show how this information might look:

2022-04-26 10:53:01 INFO Start Batch=3244986; Environment=PROD
2022-04-26 10:53:02 INFO Processing Product=12345678; Batch=3244986; AttributeCount=32;
2022-04-26 10:54:04 INFO Processing Product=87654321; Batch=3244986; AttributeCount=43;
2022-04-26 10:56:01 ERROR Batch=3244986; Product=87654321; Unknown attribute name 'zyxw'
2022-04-26 10:57:01 INFO End Batch=3244986; Environment=PROD; HasErrors=True; ProductCount=2; TotalAttributeCount=75;

These log entries tell us things such as:

  • When batches start and finish. This information lets us count the number of batch runs and store it as a metric.
  • Which environment the batch is running in.
  • The size of the batch, in terms of the number of data items to process and their complexity.
  • Specific details about failures and warnings, which help with troubleshooting and spotting trends.

Here’s a simplified example (using PowerShell’s built-in Write-EventLog cmdlet) of emitting a log message indicating that a batch job has started:

Write-EventLog -LogName "My Batch Job" -Source "Batch Jobs" -EntryType "Information" -EventId 5005 -Message $Message

A couple of things to note about the log entry:

  • A custom log source is a great way to scope log entries for each batch job application. Dynatrace provides filters based on log sources to assist with scoping log queries.
  • Using custom Event IDs to map log entries to different actions or activities also aids with scoping log queries.
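
Putting these pieces together, here’s a minimal sketch of a job that registers its event source once and then writes structured start, progress, and end entries. The log name, source, event IDs, and the Write-BatchLog helper are illustrative placeholders rather than required names:

# One-time setup (requires an elevated session): create the custom event log and source
if (-not [System.Diagnostics.EventLog]::SourceExists("Batch Jobs")) {
    New-EventLog -LogName "My Batch Job" -Source "Batch Jobs"
}

# Illustrative helper that writes entries in the Key=Value format shown earlier
function Write-BatchLog {
    param(
        [string]$Message,
        [System.Diagnostics.EventLogEntryType]$EntryType = "Information",
        [int]$EventId = 5005
    )
    Write-EventLog -LogName "My Batch Job" -Source "Batch Jobs" `
        -EntryType $EntryType -EventId $EventId -Message $Message
}

$batchId = 3244986
Write-BatchLog -Message "Start Batch=$batchId; Environment=PROD"

# ... process each item, logging its details with a separate event ID ...
Write-BatchLog -Message "Processing Product=12345678; Batch=$batchId; AttributeCount=32;" -EventId 5006

Write-BatchLog -Message "End Batch=$batchId; Environment=PROD; HasErrors=False; ProductCount=1; TotalAttributeCount=32;" -EventId 5007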

Step 2: Consume and process logs

Dynatrace can ingest logs from your batch jobs through integrations with popular logging platforms or by utilizing its log collection capabilities.

Once your logs are ingested, Dynatrace can provide valuable insights into job execution status and potential errors. Consuming logs is a 3-step process.

1. Create your log source

First, you must configure a log source so Dynatrace knows where your log files are stored. By default, Dynatrace will search certain known locations, but you will need to configure custom Windows event logs separately.

I like configuring Log Monitoring on host groups to maintain consistency as additional hosts get added to environments.

To define your log source in Dynatrace:

  • Navigate to your host group landing page.
  • Within the Log Monitoring menu, select Custom log sources.

Add a custom log source with the following attributes:

  • Source type: Windows event log
  • Path: Enter the name of your custom event log

2. Define log ingest rules

Once you’ve configured your log source, the next step is to instruct Dynatrace to include logs from it by configuring a log ingest rule.

To define your log ingest rule in Dynatrace:

  • Navigate to your host group landing page
  • Within the Log Monitoring menu, select Log ingest rules.

Add a rule with the following attributes:

  • Rule type: Include in storage

Add a condition with the following attributes:

  • Matcher attribute: Log source
  • Value: Enter the name of your custom event log
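
With the log source and ingest rule in place, it’s worth confirming that entries are actually flowing into Dynatrace. A quick sanity check in a Notebook might look like the following DQL, assuming the log source attribute carries the name of your custom event log (here “My Batch Job”, from the earlier example):

fetch logs
| filter matchesValue(log.source, "My Batch Job")
| sort timestamp desc
| limit 20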

3. Configure processing rules

The final step is to configure processing rules that parse your log content and enrich your logs in Dynatrace with the extracted data.

Consider the format of log entries for the processed product items:

2022-04-26 10:54:04 INFO Processing Product=87654321; Batch=3244986; AttributeCount=43;

We aim to process these entries as they arrive in Dynatrace, transforming the log data so that the attribute values become fields. We can then use those fields to perform queries and extract metrics.

To define your processing rule:

  • Navigate to Settings > Log Monitoring > Processing

Add a rule with the following attributes:

  • Rule name: Something relevant such as “batchjobname-process-product”
  • Matcher: matchesValue(log.source, "Your source name") and matchesPhrase(content, "Processing Product")
  • Processor definition: refer to the rule below

Processing rule definition

PARSE(content, "
LD 'Product=' LD:my.product ';'
SPACE LD 'Batch=' LD:my.batch ';'
SPACE LD 'AttributeCount=' LD:my.attributeCount ';'
")

This processing rule will extract the Product, Batch, and AttributeCount data from the log entry and enrich the entry within Dynatrace with custom attributes. Users can then use those attributes to search and filter logs.

The following DQL query uses the custom my.product attribute to filter logs to products with an ID of 87654321:

fetch logs
| filter matchesValue(log.source, "Your log source") and matchesPhrase(content, "Processing Product")
| filter matchesValue(my.product, "87654321")
| sort timestamp desc
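
Because the parsed values are now ordinary fields, we can also aggregate over them. As a sketch, the following query (assuming the my.* fields defined by the processing rule above, with toLong converting the stored string into a number) counts processed products and sums attribute counts per batch:

fetch logs
| filter matchesValue(log.source, "Your log source") and matchesPhrase(content, "Processing Product")
| summarize products = count(), totalAttributes = sum(toLong(my.attributeCount)), by: {my.batch}

Queries like this are the raw material for the custom metrics, alerts, and dashboards we’ll look at in the next article.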

You can read more about the log processing syntax in the Dynatrace docs: https://docs.dynatrace.com/docs/observe-and-explore/log-monitoring/log-processing/log-processing-examples
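
The same pattern language is also available at query time: if entries were ingested before a processing rule existed, DQL’s parse command can extract values on the fly. Here’s a minimal sketch against the same “Processing Product” entries; the product field name is purely illustrative:

fetch logs
| filter matchesPhrase(content, "Processing Product")
| parse content, "LD 'Product=' LD:product ';'"
| filter product == "87654321"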

Conclusion

Proactive monitoring is a game-changer for organizations that rely on batch and background processing jobs. By shedding light on these often-overlooked processes, a good monitoring solution will help to ensure their smooth operation and maximize efficiency.

Logs are crucial in monitoring because they provide visibility into system activities, error messages, and other important events. A reliable monitoring solution leverages logs to offer valuable insights and ensure the system’s health and performance.

In the next article, we’ll delve further into using Dynatrace for monitoring our jobs: setting up custom metrics and alerts, creating dashboards for better visibility, and building notebooks to assist with troubleshooting issues.
