Processing Cisco ASA Logs with Cloudera Flow Management

David Handermann
Cloudera
Published Feb 8, 2022

Background

Cisco Adaptive Security Appliances provide network filtering and communication security for a variety of use cases. As firewalls, Cisco ASA devices can filter, route, and translate network requests and responses over Internet or private links. Cisco ASA devices also support encrypted virtual private network connections, enabling protected access from remote workers or maintaining secure communication between multiple locations.

For monitoring, archiving, and aggregation, Cisco ASA devices support extensive logging using the syslog protocol. The information contained in log messages supports a number of potential uses, from anomaly detection to communication auditing.

Apache NiFi Record Processing

As an extensible framework for data processing, Apache NiFi is capable of interacting with a wide variety of systems and services. In addition to generalized file processing, NiFi includes a number of components for handling structured data in the form of records.

Record processing in NiFi enables maximum processing performance while supporting formatting, filtering, and routing for common file structures. Whether handling schema-oriented binary formats such as Apache Avro, or processing arbitrary lines of text through custom patterns, NiFi record-oriented components can support complex operational flows.

Advanced Processing in Cloudera Flow Management

Cloudera Flow Management builds on Apache NiFi and consists of open source capabilities as well as selected additional features. NiFi includes standard services for reading syslog messages as records, but the nature of the syslog protocol makes it difficult to perform additional parsing on messages specific to a given product vendor.

To meet emerging needs of enterprise customers, CFM incorporates a new record reader supporting advanced parsing of Cisco ASA syslog messages. The CiscoEmblemSyslogMessageReader integrates with standard NiFi record processing components and allows users to provide additional custom patterns. With an understanding of the syslog protocol and the specifics of the Cisco EMBLEM structure, it is possible to deploy complex flows capable of handling high volumes of information from many devices.

Syslog Protocol Overview

The syslog protocol itself has developed over several iterations, providing a flexible messaging structure for a large variety of applications. The structural flexibility of syslog makes it difficult to format complex messages, resulting in various strategies among multiple vendors.

Syslog Protocol RFC 3164

The syslog protocol existed as a de facto standard for several decades before RFC 3164 codified the initial version. The BSD syslog protocol defined standard communication in three elements:

  • PRIORITY
  • HEADER
  • MESSAGE

The optional PRIORITY element consists of a number that incorporates both logging facility and logging severity information. The HEADER element is required and contains timestamp and hostname information. The MESSAGE element describes the remainder of the string and does not have any formatting requirements.
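As a brief illustration of how one number can carry both values, RFC 3164 defines the PRIORITY as the facility multiplied by 8, plus the severity. The following sketch reverses that calculation. The function name is hypothetical, and Python is used here only for illustration:

```python
from typing import Tuple


def decode_priority(priority: int) -> Tuple[int, int]:
    """Split a syslog PRIORITY number into (facility, severity)."""
    # Facilities range from 0 to 23 and severities from 0 to 7
    if not 0 <= priority <= 191:
        raise ValueError(f"PRIORITY out of range: {priority}")
    # PRIORITY = facility * 8 + severity
    return divmod(priority, 8)


# A PRIORITY of <34> carries facility 4 and severity 2
print(decode_priority(34))  # (4, 2)
```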

A simple syslog message without the PRIORITY field appears as follows:

Jan 15 12:30:45 localhost System Started

As shown in the example, the timestamp format lacks year and timezone information, leading to ambiguity when the receiving system attempts to parse the string.

Syslog Protocol RFC 5424

To address timestamp formatting issues and incorporate more structural guidelines, RFC 5424 obsoleted RFC 3164 in 2009. The updated timestamp format follows the ISO 8601 international standard, eliminating the ambiguity of the legacy syslog protocol.

An example timestamp indicating Coordinated Universal Time using the Z character, together with optional fractional seconds, appears as follows:

2022-01-15T12:30:45.005Z

RFC 5424 incorporated a number of other changes defining structured data elements consisting of extensible pairs of keys and values.

Cisco EMBLEM Overview

Cisco ASA devices support the legacy BSD syslog protocol defined in RFC 3164, with the option to enable timestamp formatting according to the newer RFC 5424 standard. This hybrid approach makes it necessary for parsing components to handle both timestamp formats.
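One way to handle both timestamp formats is to attempt the stricter RFC 5424 form first and fall back to the RFC 3164 form. The following is a minimal sketch in Python, used here only for illustration. The function name, the caller-supplied year, and the choice of UTC for legacy timestamps are all assumptions, and the month abbreviation parsing assumes an English locale:

```python
from datetime import datetime, timezone


def parse_syslog_timestamp(value: str, assumed_year: int) -> datetime:
    """Parse an RFC 5424 timestamp, falling back to the RFC 3164 form."""
    try:
        # RFC 5424 style, for example 2022-01-15T12:30:45.005Z
        return datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        pass
    # RFC 3164 style, for example "Jan 15 12:30:45": no year and no zone,
    # so both must come from somewhere else (assumptions in this sketch)
    parsed = datetime.strptime(f"{assumed_year} {value}", "%Y %b %d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


print(parse_syslog_timestamp("2022-01-15T12:30:45.005Z", 2022))
print(parse_syslog_timestamp("Jan 15 12:30:45", 2022))
```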

As a custom extension of the syslog protocol, Cisco ASA devices support logging using the Cisco EMBLEM format, which includes a standard prefix for message elements. The information contained in the EMBLEM prefix enables consuming systems to support selective parsing strategies.

The EMBLEM prefix on Cisco ASA logs includes a facility, a level, and a message number, the last of which can be used to narrow down the expected message structure.

The following EMBLEM prefix example defines a facility of ASA, a level of 6, and a message number of 199005:

%ASA-6-199005:

An example syslog record incorporating the EMBLEM prefix and using the legacy RFC 3164 timestamp format appears as follows:

Jan 15 12:30:45 APPLIANCE-1 %ASA-6-199005: Startup begin
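To show how the prefix can be separated from the rest of the line, the following sketch extracts the three prefix elements with a regular expression. The pattern and function name are hypothetical and written in Python for illustration; they are not the expression used by any particular reader:

```python
import re
from typing import Dict, Optional

# Hypothetical pattern for illustration, not the expression used by the reader
EMBLEM_PREFIX = re.compile(
    r"%(?P<facility>[A-Z]+)-(?P<level>[0-7])-(?P<messageNumber>\d+): "
)


def parse_emblem_prefix(line: str) -> Optional[Dict[str, object]]:
    """Locate the EMBLEM prefix and split it into its three elements."""
    match = EMBLEM_PREFIX.search(line)
    if match is None:
        return None
    return {
        "facility": match.group("facility"),
        "level": int(match.group("level")),
        "messageNumber": int(match.group("messageNumber")),
        # Everything after the prefix is the vendor-specific message text
        "message": line[match.end():],
    }


print(parse_emblem_prefix("Jan 15 12:30:45 APPLIANCE-1 %ASA-6-199005: Startup begin"))
# {'facility': 'ASA', 'level': 6, 'messageNumber': 199005, 'message': 'Startup begin'}
```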

Starting with the Cisco ASA Series Syslog Messages reference documentation, it is possible to implement tailored parsing that translates message strings to structured records.

Receiving Syslog Messages

Implementing optimized syslog collection using NiFi is beyond the scope of the current discussion, but the ListenTCP processor is one direct approach to receiving syslog messages.

The ListenTCP processor can be configured to accept connections on a specified port number, and it is capable of handling multiple messages in a single NiFi FlowFile using batching. It is also possible to use the ListenTCPRecord processor, but the ListenTCP processor provides an optimized approach to initial input handling, delegating record processing to subsequent components. Because the goal is to delegate parsing to the Cisco EMBLEM reader in a later step, ListenTCP provides the shortest processing path for received messages.

The ListenTCP processor, as well as the other listening processors, supports encrypted communication using a StandardRestrictedSSLContextService. The SSL Context Service supports Transport Layer Security, protecting communication between sending devices and the receiving processor. TLS provides an essential level of security for any deployment outside of testing on an isolated network.

Reading Cisco ASA Logs

The CiscoEmblemSyslogMessageReader can be configured in any processor that supports a configurable Record Reader property. In order to use the EMBLEM Reader, it is important to understand the parsing approach and standard record fields.

The EMBLEM Reader takes a best-effort approach to message parsing, enabling the NiFi flow to handle invalid or unexpected records. This strategy avoids data loss and allows flow designers to organize records using a variety of methods.

Record Schema Fields

Every record processed through the EMBLEM Reader includes the following fields:

  • log
  • format

The log field contains the original message string prior to any syslog parsing. The format field contains one of the following values depending on the level of parsing success:

  • LOG
  • EMBLEM
  • PARSED

The LOG value indicates that syslog parsing failed, and the resulting record contains only the log and format fields.

The EMBLEM value indicates that syslog parsing succeeded, including both standard syslog fields and EMBLEM message prefix fields. The EMBLEM value also indicates that the record did not match one of the known Cisco ASA syslog message numbers, meaning that the reader did not attempt additional field parsing.

The PARSED value indicates that syslog parsing succeeded, and the reader parsed a known Cisco ASA syslog message pattern. Records with a PARSED value contain both standard EMBLEM fields as well as fields specific to the associated message number pattern.

EMBLEM Record Schema Fields

Records identified with the EMBLEM or PARSED value in the format field contain several fields derived from the syslog standard and the EMBLEM message prefix. In addition to the log and format fields, processed records contain the following standard fields:

  • timestamp
  • hostname
  • message
  • facility
  • level
  • messageNumber

The timestamp field contains the date and time read from the syslog message. For messages using the legacy RFC 3164 format, the EMBLEM Reader defaults to the current year and current time zone offset according to the operating system on which NiFi is running. Using the newer RFC 5424 format is preferable to avoid potential unexpected timestamp values when processing syslog messages across year boundaries.
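The year boundary problem can be reproduced in a few lines. The following Python sketch is an illustration only: the function name is hypothetical, and it mimics the idea of borrowing the year from the receiving clock rather than reproducing the reader's implementation:

```python
from datetime import datetime, timezone


def assume_current_year(value: str, now: datetime) -> datetime:
    """Interpret an RFC 3164 timestamp using the year of the receiving clock."""
    parsed = datetime.strptime(f"{now.year} {value}", "%Y %b %d %H:%M:%S")
    return parsed.replace(tzinfo=now.tzinfo)


# A message sent just before midnight on 31 December 2021
sent = "Dec 31 23:59:59"
# The same message read two seconds later, after the receiving clock rolled over
read_at = datetime(2022, 1, 1, 0, 0, 1, tzinfo=timezone.utc)

interpreted = assume_current_year(sent, read_at)
print(interpreted.isoformat())       # 2022-12-31T23:59:59+00:00
print((interpreted - read_at).days)  # 364
```

The interpreted timestamp lands almost a full year in the future, which is the kind of unexpected value that a timestamp carrying its own year avoids.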

The hostname field contains the name of the host read from the syslog message string. The value of this field depends on the sending device.

The message field contains the portion of the syslog message following the EMBLEM prefix. Separating this section of the message allows for further parsing without having to process the entire syslog string.

The facility field contains the first element from the EMBLEM prefix. The value of this field will be ASA when reading logs from Cisco ASA devices.

The level field contains the second element from the EMBLEM prefix, which will be a number between 1 and 7 defining the severity level of the log.

The messageNumber field contains the third element from the EMBLEM prefix, indicating the message number associated with the specific type of Cisco syslog. The message number allows the EMBLEM Reader to perform targeted regular expression parsing. The message number also allows NiFi flow processing to handle selected types of messages.

EMBLEM Record Processing

The EMBLEM Reader is capable of parsing any syslog message matching the standard pattern, such as the following message that includes a timestamp formatted according to RFC 5424:

2022-01-01T12:00:00Z APPLIANCE-1 %ASA-6-199005: Startup begin

The EMBLEM Reader will parse the syslog message into a record, represented as JSON in the following:

{
"format": "EMBLEM",
"log": "2022-01-01T12:00:00Z APPLIANCE-1 %ASA-6-199005: Startup begin",
"timestamp": 1641038400000,
"hostname": "APPLIANCE-1",
"message": "Startup begin",
"facility": "ASA",
"level": 6,
"messageNumber": 199005
}
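The timestamp value in the record is consistent with the number of milliseconds since the Unix epoch in UTC. The following Python sketch, with a hypothetical function name, shows the conversion for a timestamp string that carries its own offset:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(timestamp: str) -> int:
    """Convert an RFC 5424 timestamp string to milliseconds since the epoch."""
    # Replace the trailing Z so older Python versions accept the UTC designator
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    # Integer division of two durations avoids floating point rounding
    return (parsed - EPOCH) // timedelta(milliseconds=1)


print(to_epoch_millis("2022-01-01T12:00:00Z"))  # 1641038400000
```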

PARSED Record Schema Fields

Records with the value of PARSED in the format field contain all of the fields listed for EMBLEM records, as well as fields specific to the type of syslog message associated with the message number.

Cisco ASA syslog messages can have a variety of fields, and the default configuration of the EMBLEM Reader defines the message field names according to named-capturing groups specified in the associated regular expression pattern.

The EMBLEM Reader default configuration includes regular expression patterns for a selected set of Cisco ASA syslog messages.

PARSED Record Processing

For syslog messages matching configured regular expression patterns, the EMBLEM Reader produces records containing both standard schema fields as well as fields specific to the named-capturing groups defined in the pattern.

The following syslog message describes the start of an SSL handshake from client to server:

2022-01-01T12:00:00Z APPLIANCE-1 %ASA-6-725001: Starting SSL handshake with client OUTSIDE:10.0.0.1/1024 to 192.168.1.100/443 for TLS session

Based on the configured pattern, the EMBLEM Reader produces a record containing additional fields, represented in JSON as follows:

{
"format": "PARSED",
"log": "2022-01-01T12:00:00Z APPLIANCE-1 %ASA-6-725001: Starting SSL handshake with client OUTSIDE:10.0.0.1/1024 to 192.168.1.100/443 for TLS session",
"timestamp": 1641038400000,
"hostname": "APPLIANCE-1",
"message": "Starting SSL handshake with client OUTSIDE:10.0.0.1/1024 to 192.168.1.100/443 for TLS session",
"facility": "ASA",
"level": 6,
"messageNumber": 725001,
"peerType": "client",
"sourceInterface": "OUTSIDE",
"sourceAddress": "10.0.0.1",
"sourcePort": "1024",
"destinationAddress": "192.168.1.100",
"destinationPort": "443",
"handshakeProtocol": "TLS"
}

Configuring Regular Expression Field Patterns

In addition to the default set of patterns, the EMBLEM Reader has an optional property to override or extend message parsing capabilities. The Regular Expression Field Patterns property can be configured to read a file of comma-separated values containing one or more pairs of message number and parsing pattern values. Configuring a CSV supports extensible processing without deploying a new version of the EMBLEM Reader library.

The expected CSV format consists of a message number in the first column, followed by a comma, and the associated regular expression pattern that the EMBLEM Reader will use for parsing. The following provides an example CSV row for parsing CPU warm temperature logs:

735015,CPU (?<processorNumber>\d+): Temp: (?<temperature>\d+) (?<temperatureUnits>\w+) (?<temperatureLevel>Warm)

The configured pattern includes four named-capturing groups that the EMBLEM Reader will use as record field names. The pattern is capable of parsing the following syslog:

2022-01-01T12:00:00Z APPLIANCE-1 %ASA-4-735015: CPU 0: Temp: 75 C Warm

The EMBLEM Reader will produce the following record using the custom pattern configuration:

{
"format": "PARSED",
"log": "2022-01-01T12:00:00Z APPLIANCE-1 %ASA-4-735015: CPU 0: Temp: 75 C Warm",
"timestamp": 1641038400000,
"hostname": "APPLIANCE-1",
"message": "CPU 0: Temp: 75 C Warm",
"facility": "ASA",
"level": 4,
"messageNumber": 735015,
"processorNumber": "0",
"temperature": "75",
"temperatureUnits": "C",
"temperatureLevel": "Warm"
}
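The mapping from CSV rows to record fields can be sketched outside NiFi. The following Python example is an approximation under stated assumptions: the function names are hypothetical, the row is split on the first comma so that a pattern may itself contain commas, and the Java group syntax is rewritten because Python spells named groups as (?P<name>...). Whether the reader requires a full match or a partial match is not stated here, so the full match is also an assumption:

```python
import re
from typing import Dict, Optional


def load_field_patterns(csv_text: str) -> Dict[int, re.Pattern]:
    """Map each message number to a compiled pattern, one CSV row per line."""
    patterns = {}
    for row in csv_text.splitlines():
        if not row.strip():
            continue
        # Split on the first comma only, since a pattern may contain commas
        number, java_pattern = row.split(",", 1)
        # Rewrite (?<name> as (?P<name> while leaving lookbehinds untouched
        python_pattern = re.sub(r"\(\?<(?![=!])", "(?P<", java_pattern)
        patterns[int(number)] = re.compile(python_pattern)
    return patterns


def parse_fields(
    patterns: Dict[int, re.Pattern], message_number: int, message: str
) -> Optional[Dict[str, str]]:
    """Return the named groups for a message, or None when nothing matches."""
    pattern = patterns.get(message_number)
    match = pattern.fullmatch(message) if pattern else None
    return match.groupdict() if match else None


ROWS = r"735015,CPU (?<processorNumber>\d+): Temp: (?<temperature>\d+) (?<temperatureUnits>\w+) (?<temperatureLevel>Warm)"
patterns = load_field_patterns(ROWS)
print(parse_fields(patterns, 735015, "CPU 0: Temp: 75 C Warm"))
# {'processorNumber': '0', 'temperature': '75', 'temperatureUnits': 'C', 'temperatureLevel': 'Warm'}
```

All captured values are strings, which matches the quoted values shown for the pattern-specific fields in the record above.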

The EMBLEM Reader Regular Expression Field Patterns property requires that each row has a regular expression that conforms to the syntax defined in the documentation for java.util.regex.Pattern. The groups included in the regular expression must meet the requirements defined in the named-capturing groups documentation.

As described in the documentation, group names must begin with a letter and must consist of uppercase letters, lowercase letters, or numbers. The EMBLEM Reader will validate the configured patterns when the framework enables the reader. Invalid patterns prevent the framework from enabling the EMBLEM Reader.
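Checking group names before deploying a CSV can catch mistakes early. The following Python sketch applies the naming rule to the group names found in a pattern string. It is a simplified pre-check with hypothetical names, not the validation performed by the reader, and it does not account for escaped parentheses inside a pattern:

```python
import re
from typing import List

# Finds (?<name> while skipping lookbehind syntax such as (?<= and (?<!
GROUP_NAME = re.compile(r"\(\?<(?![=!])([^>]*)>")
# A letter followed by any number of letters or digits
VALID_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def invalid_group_names(java_pattern: str) -> List[str]:
    """Return the group names in a pattern that break the naming rule."""
    names = GROUP_NAME.findall(java_pattern)
    return [name for name in names if not VALID_NAME.fullmatch(name)]


print(invalid_group_names(r"CPU (?<processorNumber>\d+)"))   # []
print(invalid_group_names(r"CPU (?<processor_number>\d+)"))  # ['processor_number']
```

Underscores are a common source of invalid names, since they are accepted in many other regular expression dialects.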

Routing and Filtering Cisco ASA Logs

The NiFi QueryRecord processor is one of several components that can be configured with the CiscoEmblemSyslogMessageReader for record evaluation and routing. When configured with the EMBLEM Reader, the QueryRecord processor provides a powerful solution for initial syslog handling.

The QueryRecord processor uses dynamic properties to create custom component relationships, where the property name represents the relationship name, and the property value contains a SQL statement to run against parsed records.

Triaging Cisco ASA Syslog Messages

As one strategy for triaging Cisco ASA syslog messages, the QueryRecord processor can organize records according to the level of parsing success using the format field.

To group and route records that did not match the standard Cisco EMBLEM structure, create a dynamic property named log with the following SQL statement:

SELECT
*
FROM
FlowFile
WHERE
format = 'LOG'

To group and route records that matched the EMBLEM structure, but did not match any configured regular expression patterns, create a dynamic property named emblem with the following SQL statement:

SELECT
*
FROM
FlowFile
WHERE
format = 'EMBLEM'

To group and route records matching a configured regular expression pattern, create a dynamic property named parsed with the following SQL statement:

SELECT
*
FROM
FlowFile
WHERE
format = 'PARSED'
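The effect of the three dynamic properties can be pictured as three independent filters, each feeding a relationship named after its property. The following Python sketch models that behavior with plain dictionaries. The names are hypothetical and the code only approximates the routing concept, not the QueryRecord implementation:

```python
from typing import Callable, Dict, List

Record = Dict[str, object]

# Relationship name mapped to a predicate, mirroring the three dynamic properties
ROUTES: Dict[str, Callable[[Record], bool]] = {
    "log": lambda record: record["format"] == "LOG",
    "emblem": lambda record: record["format"] == "EMBLEM",
    "parsed": lambda record: record["format"] == "PARSED",
}


def route(records: List[Record]) -> Dict[str, List[Record]]:
    """Evaluate every predicate against every record, as independent queries."""
    return {
        name: [record for record in records if predicate(record)]
        for name, predicate in ROUTES.items()
    }


records = [
    {"format": "LOG", "log": "unparseable text"},
    {"format": "EMBLEM", "messageNumber": 199005},
    {"format": "PARSED", "messageNumber": 725001},
]
for name, matched in route(records).items():
    print(name, len(matched))
```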

The QueryRecord processor supports more advanced SQL statements to define record field projections that use names different than the field names defined in the EMBLEM Reader.

Filtering and Field Projection

To select SSL handshake records and rename a subset of parsed fields, create a dynamic property named handshake with the following SQL statement:

SELECT
"timestamp" AS created,
hostname AS sender,
peerType,
sourceInterface,
sourceAddress,
sourcePort,
destinationAddress,
destinationPort,
handshakeProtocol
FROM
FlowFile
WHERE
format = 'PARSED'
AND messageNumber = 725001

The SQL statement will produce records containing the selected fields. Based on a Cisco ASA log describing a handshake, the SQL statement will produce a record represented in JSON as follows:

{
"created": 1641038400000,
"sender": "APPLIANCE-1",
"peerType": "client",
"sourceInterface": "OUTSIDE",
"sourceAddress": "10.0.0.1",
"sourcePort": "1024",
"destinationAddress": "192.168.1.100",
"destinationPort": "443",
"handshakeProtocol": "TLS"
}
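A statement of this shape can be tried outside NiFi before configuring the processor. The following Python sketch runs a reduced version of the query against an in-memory SQLite table named FlowFile, filtering on the handshake message number 725001 shown earlier. SQLite is only a stand-in for the QueryRecord SQL engine, so dialect differences are possible, and the table holds a reduced set of columns:

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.row_factory = sqlite3.Row
# SQLite stands in for the QueryRecord SQL engine; FlowFile is the table name
connection.execute(
    'CREATE TABLE FlowFile (format TEXT, "timestamp" INTEGER, hostname TEXT, '
    "messageNumber INTEGER, peerType TEXT, sourceAddress TEXT, sourcePort TEXT)"
)
connection.executemany(
    "INSERT INTO FlowFile VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("PARSED", 1641038400000, "APPLIANCE-1", 725001, "client", "10.0.0.1", "1024"),
        ("PARSED", 1641038400000, "APPLIANCE-1", 735015, None, None, None),
        ("EMBLEM", 1641038400000, "APPLIANCE-1", 199005, None, None, None),
    ],
)

HANDSHAKE_QUERY = """
SELECT "timestamp" AS created, hostname AS sender, peerType, sourceAddress, sourcePort
FROM FlowFile
WHERE format = 'PARSED' AND messageNumber = 725001
"""

handshakes = [dict(row) for row in connection.execute(HANDSHAKE_QUERY)]
print(handshakes)
```

Only the handshake row is returned, with the timestamp and hostname columns renamed to created and sender.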

A similar filtering and field projection approach can be applied in a variety of ways to meet specific use cases.

Performance Characteristics

Log processing performance in absolute terms depends on a large number of factors from hardware specifications to flow design.

The fundamental performance characteristics of the EMBLEM Reader depend on the speed of parsing strings using regular expressions. The EMBLEM Reader uses a strategy that involves executing one or two regular expressions against each log string.

The first pattern parses the log according to the general EMBLEM syslog structure. The EMBLEM Reader then uses the message number to search for a compiled pattern associated with the message number. When the EMBLEM Reader finds a pattern associated with the message number, it executes the regular expression against the portion of the log following the EMBLEM header. Based on this approach, the EMBLEM Reader executes one or two patterns, depending on whether the message number has an associated pattern configured.
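The two-pass strategy can be sketched as follows. This Python example is an approximation under stated assumptions: both patterns are hypothetical, the first pass handles only RFC 5424 style timestamps, the timestamp is left as a string, and a message that fails its second-pass pattern is assumed to remain an EMBLEM record:

```python
import re
from typing import Dict

# Hypothetical first-pass pattern for the general EMBLEM syslog structure
EMBLEM = re.compile(
    r"(?P<timestamp>\S+) (?P<hostname>\S+) "
    r"%(?P<facility>[A-Z]+)-(?P<level>\d)-(?P<messageNumber>\d+): (?P<message>.*)"
)

# Hypothetical second-pass patterns, compiled once and keyed by message number
FIELD_PATTERNS: Dict[int, re.Pattern] = {
    735015: re.compile(
        r"CPU (?P<processorNumber>\d+): Temp: (?P<temperature>\d+) "
        r"(?P<temperatureUnits>\w+) (?P<temperatureLevel>Warm)"
    ),
}


def read_record(log: str) -> Dict[str, object]:
    """Run one pattern against the whole log, then at most one more."""
    record: Dict[str, object] = {"format": "LOG", "log": log}
    first = EMBLEM.fullmatch(log)
    if first is None:
        return record  # first pass failed: keep only log and format
    record.update(first.groupdict(), format="EMBLEM")
    record["level"] = int(first.group("level"))
    record["messageNumber"] = int(first.group("messageNumber"))
    # A dictionary lookup selects at most one pattern for the second pass
    pattern = FIELD_PATTERNS.get(record["messageNumber"])
    second = pattern.fullmatch(first.group("message")) if pattern else None
    if second is not None:
        record.update(second.groupdict(), format="PARSED")
    return record


print(read_record("2022-01-01T12:00:00Z APPLIANCE-1 %ASA-4-735015: CPU 0: Temp: 75 C Warm")["format"])
# PARSED
```

Because the lookup is keyed by message number, adding more patterns does not add more regular expression passes per log.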

Running a small set of logs through the CiscoEmblemSyslogMessageReader in an isolated program provides one approach to relative performance evaluation. The NiFi GrokReader provides a baseline for comparison, as it can be configured with a single custom regular expression pattern.

Running a Record Reader in isolation limits processing to a single thread, but running within NiFi using a number of concurrent tasks, when aligned with hardware specifications, should provide greater throughput in the absence of other competing factors.

Relative Performance Comparison

Running a performance comparison required creating a basic test harness together with sample log messages. A set of five Cisco ASA log messages, each with a different message number, served as a standard input array. Each message number had an associated regular expression pattern configured in the EMBLEM Reader, with the result that each message required two pattern matching passes.

The standalone program prepared groups of five log messages in memory for the reader to process. Preparing 100,000 groups produced a total of 500,000 log messages to be processed.
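The shape of such a harness can be sketched in a few lines. The following Python example is not the program used for the measurements described here: the sample messages contain placeholder text, the parse function is a trivial stand-in for a record reader, and all names are hypothetical. It shows only the approach of preparing messages in memory and timing the parsing loop alone:

```python
import re
import time
from typing import Tuple

# Hypothetical stand-ins: five placeholder messages with different message numbers
SAMPLE_MESSAGES = [
    f"2022-01-01T12:00:00Z APPLIANCE-1 %ASA-6-{number}: placeholder text"
    for number in (199005, 302013, 302014, 725001, 725002)
]
PREFIX = re.compile(r"%ASA-(\d)-(\d+): ")


def parse(log: str) -> bool:
    """Trivial stand-in for a record reader: report whether a prefix exists."""
    return PREFIX.search(log) is not None


def run(groups: int) -> Tuple[int, float]:
    """Build the messages in memory first, then time only the parsing loop."""
    messages = SAMPLE_MESSAGES * groups
    started = time.perf_counter()
    parsed = sum(1 for log in messages if parse(log))
    elapsed = time.perf_counter() - started
    return parsed, elapsed


parsed, elapsed = run(100_000)
print(f"{parsed} messages in {elapsed:.2f} seconds")
```

Repeating the run several times and averaging the elapsed time reduces the influence of warm-up effects and background activity.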

Running on an Intel Core i7-9750H CPU, and configured with a pattern to parse an RFC 3164 syslog, the GrokReader was able to read all 500,000 log messages as records in around one second.

Running the same program through multiple iterations, the CiscoEmblemSyslogMessageReader was able to read all 500,000 log messages as records in an average of 2.2 seconds.

Running the same comparison with 1,000,000 log messages showed linear performance characteristics, with the GrokReader taking around 2 seconds, and the EMBLEM Reader taking around 4.5 seconds.

Although these relative performance numbers are specific to the hardware and sample logs described, the differences illustrate the fundamental characteristics of parsing messages. Optimizing custom regular expressions is key to overall processing speed. Although the EMBLEM Reader requires two passes for regular expression evaluation, the targeted evaluation based on message number helps to keep both passes streamlined.

Conclusion

The CiscoEmblemSyslogMessageReader enables complex and varied NiFi flow processing using the capabilities already available through record-oriented components. The EMBLEM Reader does not include an exhaustive set of default regular expression patterns, but the ability to configure custom patterns through a CSV enables rapid adaptation to specific environments.

The standard NiFi distribution includes support for reading standard syslog messages, but the EMBLEM Reader provides syslog handling tailored to Cisco ASA devices. Building on powerful record processing capabilities, the EMBLEM Reader supports high-performance parsing, extensible configuration, and advanced workflows.
