Published in CodeX

New features of the hybrid monitoring AIOps system Monq

In a previous article, I wrote about the hybrid monitoring system from Monq. Almost two years have passed since then. During this time, Monq has significantly updated its functionality, a free version has appeared, and the licensing policy has changed. If the monitoring systems in your company are getting out of hand and their number keeps growing, Monq is worth a look as a way to bring monitoring back under control. Read on for the details.

Licensing has become more flexible

I will start the review with good news — the appearance of a free version of the solution. In the new version of the licensing policy, the previously monolithic Monq was divided into three subproducts, each with an individual license:

  • Collector — collection and analysis of logs,
  • AIOps — hybrid monitoring and incident management,
  • TestForge — functional testing of information systems and services.

Monq Collector can be downloaded for free from the vendor’s website. Later in the article, you will learn how you can use it. AIOps and TestForge are provided in the extended version of the product and licensed according to the number of configuration units and to the number of runs of test scripts per day, respectively.

Simplified installation

Since the last article was published in 2019, the Monq deployment procedure has changed a great deal. First, the deployed solution now runs on one server instead of four. Second, the administrator no longer needs to manually prepare the infrastructure, install Kubernetes, run playbooks, and so on. The product is now delivered as a virtual machine image: you import it into your virtualization environment, connect to its console, and assign network settings that provide name resolution and Internet access. All other actions are performed in the graphical interface.

After the IP address is set on the virtual machine, go to the Rundeck web interface.

The installation process is shown in some detail in this video. Three scenarios are available to the administrator:

  • infra install: run this first; the script installs all the necessary infrastructure services.
  • monq install: run this once the infrastructure is prepared; the script installs all the microservices of the system.
  • monq erase: optional; it is needed only in non-standard cases when you have to remove the application software.

Installation ends with the issuance of a login and password for entering the platform.

Operation became easier and new possibilities appeared

Updated interface

After logging into the system, as before, the sections “Workgroups” and “Users” require configuration. You also need to configure the mail plugin: specify the mail server, username, and password. Without this setting, newly created users will not receive their generated passwords.

The workgroup and user interfaces have also changed: they are now built on Angular. A few screens still use the old front end (VueJS) but, according to the vendor, they will soon be a thing of the past as well.

For demonstration purposes, I created a workgroup called “Habrahabr” and a user called “Habr” in the monq space for further work.

In the workgroups there were not only interface changes but also functional ones. Let’s try to figure out what new features the vendor has added.

A section for managing workgroup policies has been added to the administrative panel. In general, these policies are global for a userspace and define workgroup management capabilities for ordinary users. While administrators still have access to any changes, for ordinary users, the ability to create or delete groups is determined by policies.

A platform user can now hold several different roles in the same group at the same time. This allows more flexible configuration of user access rights to the various objects and functions of the system.

Instead of the “Public group” type, the following privacy types for workgroups have been introduced:

  • Open — visible to all users, users can join the group themselves;
  • Closed — visible to all users, users cannot join the group on their own;
  • Private — groups are visible only to members of these groups.
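The three privacy types boil down to two questions: who can see a group, and who can join it unaided. Here is a minimal Python sketch of that logic. The names `Privacy`, `is_visible`, and `can_self_join` are invented for illustration and are not part of Monq; the sketch also assumes that a private group cannot be joined without an invitation, which the list above implies but does not state.

```python
from enum import Enum


class Privacy(Enum):
    """Hypothetical model of the three workgroup privacy types."""
    OPEN = "open"
    CLOSED = "closed"
    PRIVATE = "private"


def is_visible(privacy: Privacy, is_member: bool) -> bool:
    """Open and closed groups are visible to everyone; private ones only to members."""
    return privacy is not Privacy.PRIVATE or is_member


def can_self_join(privacy: Privacy) -> bool:
    """Only open groups let a user join on their own (assumption for private groups)."""
    return privacy is Privacy.OPEN
```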

The user section, consisting of the user profile and the page for managing platform users, has also been moved to Angular. It retains all the previous functionality and adds some improvements. New features now available:

  • mass adding of users;
  • adding users to a workgroup upon creation;
  • notifying the user by email about registration, blocking, unblocking;
  • viewing notifications to which the user is subscribed;
  • management of workgroups and roles from the user page.

The interface is now available in two languages: Russian and English.

Updated Resource-Service Models (RSMs)

The RSM configuration was reworked from the ground up. However, all existing functionality has been preserved.

Below are some screenshots of the new RSM panel interface.

The panel for creating a configuration item (CI) where you can add several items in one go:

Creating influence and dependence links, as well as deleting and viewing basic data on CIs are performed from a pop-up menu:

In order to bind triggers to a particular CI, you need to open the panel with related triggers and select the necessary ones:

The new panel lets you work with a CI parameter called health. This is a composite parameter that serves, first of all, as a visual indicator of the state of the RSM. Based on the value of the negative impact coefficient, the user can clearly see which CI has the most serious effect on the health of a problematic CI. Here is an example of the CI health panel:

Updates for integration settings

In version 6 of Monq, which this overview covers, the module for integration with external systems has been significantly reworked. Since the first article was written, the interface of the integration module, like many other screens, has moved to Angular.

“Integration type” was transformed into “configuration template”, and the integrations themselves were redefined as data streams.

The following system templates for configuring data streams are available out of the box: Zabbix, SCOM, Prometheus, Ntopng, Nagios.

Integration with other monitoring and logging systems can be done through the “AnyStream” configuration template. Starting with version 6.0 you can write your own handlers in which you can parse the RAW stream into JSON and enrich the resulting events with customized labels.

The screenshot below shows a parser of incoming events from Prometheus that converts the date format:
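As a rough illustration of what such a handler does, here is a minimal Python sketch. It is not Monq's handler API, and this article does not say which language Monq handlers are written in. The function name `handle_raw_event`, the `startsAt` input field, and the `timestamp` output field are assumptions made for the example.

```python
import json
from datetime import datetime, timezone


def handle_raw_event(raw: bytes, labels: dict) -> dict:
    """Parse a raw payload into a dict, convert its date, and attach custom labels."""
    event = json.loads(raw.decode("utf-8"))
    # Assumed input: an RFC 3339 timestamp such as "2021-06-01T12:00:00Z"
    # in a field called "startsAt" (hypothetical choice for this sketch).
    started = datetime.fromisoformat(event["startsAt"].replace("Z", "+00:00"))
    # Convert the date to Unix seconds in UTC.
    event["timestamp"] = int(started.astimezone(timezone.utc).timestamp())
    # Enrich the event: custom labels override labels with the same key.
    event["labels"] = {**event.get("labels", {}), **labels}
    return event
```

Calling `handle_raw_event(payload, {"source": "prometheus"})` returns the parsed event with a numeric `timestamp` and the extra label merged in. Timestamps with nanosecond precision would need additional handling on older Python versions.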

A free version for analyzing logs is now available

Logs Screen is the main tool of the free version for analyzing logs. This module appeared about a year ago. The plug-ins subsystem has become available to users in the same release. Plugins allow you to collect data from various sources.

The main goal of the screen of primary events is to provide users with a tool for visualization and centralized search, analysis, and processing of logs stored in the ClickHouse database.

Let’s try to figure out how to use this tool.

At the top of the screen there is a search bar that uses Monq's native MQL syntax, which looks somewhat similar to Lucene.

In addition to search queries, there is an analytical tool that shows the number of unique values for any field and the share of each value among all values of that field. All values in primary events are active elements on which you can immediately configure filtering.
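The calculation behind this kind of field analysis is simple to reproduce. The Python sketch below shows the idea on a list of already parsed events; it does not use MQL or ClickHouse, and the function name `field_distribution` is made up for the example.

```python
from collections import Counter


def field_distribution(events, field):
    """Return {value: (count, percent)} for one field across a list of event dicts."""
    # Events that lack the field are skipped rather than counted as empty.
    counts = Counter(e[field] for e in events if field in e)
    total = sum(counts.values())
    # most_common() orders values from most to least frequent.
    return {v: (n, round(100 * n / total, 1)) for v, n in counts.most_common()}
```

For example, passing four events where three have `level` equal to `info` and one has `error` gives 75% and 25% respectively.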

New options in settings for monitoring and alerting

In recent versions of Monq, it is possible to create your own alert plugins. The plugin subsystem allows space administrators to create their own notification plugins and add them to the Monq system.

Let’s recall that Monq has the ability to create escalation chains with advanced logic.

Availability SLA reports are implemented

Availability reports have replaced SLA reports. This is a completely new tool for calculating the availability of information systems.

The “Availability” section allows you to:

1. Work with availability information of:

  • configuration items and their impact on IS availability,
  • information systems that consist of a selected number of CIs,
  • complex ISs, consisting of many subsystems with the ability to determine the value of the influence for each of them.

2. Configure parameters for generating an availability report and save them as a template. The templates for the new report received a wider range of settings:

  • Using RSM maps as a filter: to generate a report, there is no need to create and maintain a list of CIs; it is enough to select an already saved RSM map.
  • A more functional filter for alerts.
  • The Recovery Time Objective (RTO) indicator: the maximum time during which a CI can have problem status. This parameter lets you exclude unstable CI statuses from the calculation.
  • Service time (working time): not to be confused with the service modes for CIs, it lets you properly account for non-working hours in the calculation.
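To make the roles of RTO and service time concrete, here is a simplified Python sketch. It is not Monq's actual formula. It assumes that problem intervals lasting no longer than the RTO are ignored, that only downtime falling inside service windows counts, and that problem intervals do not overlap one another. All names are invented for the example, and times are plain minutes from the start of the reporting period.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length of the intersection of two intervals, or 0 if they do not intersect."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def availability(problems, service_windows, rto):
    """Percent availability over the service windows.

    problems        -- (start, end) problem intervals, in minutes
    service_windows -- (start, end) working-time intervals, non-overlapping
    rto             -- problem intervals no longer than this are ignored (assumption)
    """
    service_total = sum(end - start for start, end in service_windows)
    downtime = 0
    for p_start, p_end in problems:
        if p_end - p_start <= rto:
            continue  # short-lived status change: not counted
        # Count only the part of the outage that falls within working time.
        downtime += sum(overlap(p_start, p_end, s, e) for s, e in service_windows)
    return 100.0 * (service_total - downtime) / service_total
```

With a single 600-minute service window, an RTO of 5 minutes, and problems at `(100, 103)`, `(200, 260)`, and `(590, 650)`, the first problem is ignored, the second contributes 60 minutes, and the third contributes only the 10 minutes inside the window.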

3. Manage templates and multi-templates (report templates for complex information systems):

  • create personal and group templates,
  • edit template parameters,
  • delete templates.

4. Quickly configure parameters and generate a report using them without saving the template.

More details about the methodology for calculating availability can be found on Habr in this article.

Updated module for functional tests (Monq TestForge)

The title of the article is about AIOps, but I also want to give a brief overview of another important tool for complex business monitoring.

The Monq TestForge product, provided under a separate license, removes the need to rely on external systems for configuring and running tests. It also lets you manage projects from a common interface.

There are two types of functional testing projects: managed and standalone. Standalone projects are managed from the environment in which they are launched, and managed projects are managed directly by TestForge.

The projects screen is divided into several views:

Project management panel. It lists all the projects available in the system, along with their labels, in a very nice interface.

Configuration templates are provided for the convenience of quickly creating and configuring new managed projects. Templates contain a set of project environment variables and job code. You can create a new template based on an existing project.

The management of the schedule for running tests looks very good. The scheduler starts the execution of the task code based on:

  • the general timetable and the results of the previous task execution,
  • user actions: execute now or execute at a specific time.

History of builds. For managed projects it is possible to manually start the build with additional features:

  • setting of startup variables;
  • viewing the test execution log;
  • viewing the broadcast of the test execution in real time.

Agents and coordinators of agents are introduced in Monq

In 2021, agents were added to the system. Agents are installed on systems running Linux or Windows and can:

  • get information about the system and transfer it to the data stream;
  • tail the log file and send raw data to the data stream;
  • tail the log file, parse it and send JSON to the data stream;
  • run TestForge scripts and send the execution result to the TestForge preprocessor;
  • fetch data from PostgreSQL without permanently establishing a connection and send it to the data stream;
  • forward messages from brokers (RabbitMQ, Kafka) to the data stream;
  • connect to Zabbix database and check for changes in triggers.
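The "tail the log file, parse it and send JSON" task from the list above can be sketched in a few lines of Python. This is an illustration of the general technique, not the Monq agent's implementation. The log line format, the regular expression, and the function names are assumptions, and the sending step is left out: the sketch only produces JSON strings that are ready to be sent.

```python
import json
import re

# Hypothetical line format: "2021-06-01 12:00:00 ERROR message text"
LINE_RE = re.compile(r"^(?P<time>\S+ \S+) (?P<level>[A-Z]+) (?P<message>.*)$")


def parse_line(line):
    """Turn one log line into a dict, or None if it does not match the format."""
    match = LINE_RE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


def read_new_lines(path, offset):
    """Read complete lines appended after byte `offset`; return (lines, new_offset)."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Keep only complete lines; a partial last line is left for the next poll.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


def collect(path, offset):
    """Parse newly appended lines and serialize the matches as JSON strings."""
    lines, offset = read_new_lines(path, offset)
    events = [json.dumps(e) for e in map(parse_line, lines) if e is not None]
    return events, offset
```

Calling `collect` repeatedly with the offset returned by the previous call processes only what was appended since then, which is the essence of tailing. Log rotation and truncation are not handled here.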

On the platform side, the work of agents is managed and monitored by agent coordinators. An agent processes a set of tasks coming from a coordinator and produces outputs and an execution log.

There are two types of agents in Monq:

  • static — agents monitored by the system,
  • dynamic — agents whose state is not monitored by the system.

Agent coordinators are responsible for identifying agents in the system and configuring access to information received from the agents.

By managing the coordinators you can:

  • create or remove a coordinator,
  • stop or start the coordinator,
  • configure access rights to the coordinator,
  • re-issue a token for connecting agents,
  • add or remove labels for agents,
  • configure coordinator parameters.

The list of connected agents is presented with the following information:

  • agent type (static or dynamic),
  • agent name,
  • agent state,
  • labels,
  • the version of the installed agent,
  • the agent’s task execution log,
  • date and time of the last processed job,
  • technical data of the agent.

Conclusions

In this article, we talked about the new and updated features of the Monq hybrid monitoring AIOps system. The product is in continuous development, and more improvements are expected in the near future. You can follow the updates on the vendor’s website, as well as subscribe to a dedicated Telegram channel.

It should be noted that Monq is not limited to integrations with Zabbix, MS SCOM, Prometheus, Ntopng and Nagios. If necessary, it is possible to develop your own integration modules and take data or events from somewhere else.

Nikolay Ganyushkin


ceo&founder monqlab - AIOps data platform for log analysis, monitoring and automation. MS of Nuclear Physics. MBA Skolkovo.
