Team members in front of a massive screen showing data analyses, with the slogan “Anomaly Detection” below.
Image illustrating anomaly detection (DALL-E, March 2024)

Anomaly detection with Azure Stream Analytics (Microsoft): Part I

A predictive analytics use case

1. Introduction

While the development of artificial intelligence (AI) continues to keep us all on our toes (most recently with OpenAI’s new text-to-video model Sora and Alibaba’s Emote Portrait Alive, or EMO for short), the main challenge for organizations is to find use cases for generative and predictive AI that add value to their business and whose results are trustworthy, reliable, and accurate.

With this in mind, this two-part mini-series briefly introduces anomaly detection based on Microsoft’s Azure Stream Analytics. It is an interesting use case from the field of predictive analytics, where historical data is used to predict future events or behavior.

In this first part, we get to know the Azure Stream Analytics platform:

  • What is it?
  • What are its key features and benefits?

We then present some sample anomaly detection use cases where Stream Analytics could be applied.

In the second part, we analyze fraudulent call data with Azure Stream Analytics and visualize results in a Power BI dashboard.

2. What is Azure Stream Analytics and what are its key features / benefits?

A man is running, surrounded by glowing lights and in front of a background consisting of data-analytical elements (graphs, etc.).
Creative illustration of Azure Stream Analytics (Bing Image Creator, March 2024)

Microsoft’s Azure Stream Analytics is a real-time data processing service with the following main features and benefits:

2.a. Serverless, scalable, and complex event processing service

Stream Analytics is a fully managed complex event processing service on Azure, Microsoft’s cloud computing platform, so there is no hardware or infrastructure to deploy and no operating system or software to update.

2.b. Ease of use

Getting started with Azure Stream Analytics is easy. It only takes a few clicks to create an end-to-end streaming data pipeline.

2.c. Cost effectiveness

Azure Stream Analytics provides a cost-effective solution for real-time data processing because

  • It’s a fully managed job service: This means you don’t have to spend time managing and maintaining clusters. Instead, you can focus on your analytical tasks.
  • It’s billed at the job level: This results in low up-front costs (with only one streaming unit) while providing scalability as your workload grows.

2.d. Support of different types of input sources on Azure

  • Azure Event Hubs and Azure Internet of Things (IoT) Hub, which efficiently handle the collection of millions of events per second and serve as high-throughput pub-sub (publish-subscribe) event ingestors.
    In short, both are cloud-native services which enable low-latency data streaming from any source to any destination.
  • Azure Blob Storage: A binary large object (Blob) storage solution for the cloud which is optimized for storing massive amounts of unstructured data.
  • Azure Data Lake Storage Gen2, which is a set of capabilities dedicated to big data analytics, built on Azure Blob Storage, for managing large-scale data lakes.

2.e. Connection to those input sources

You can create a Stream Analytics job that can connect to

  • Azure Event Hubs and Azure IoT Hub for streaming data collection.
  • Azure Blob Storage or Data Lake Storage Gen2 for historical data collection.

2.f. Edge computing: Direct processing of data on IoT devices

Azure Stream Analytics is available in the Azure IoT Edge runtime. This allows you to process data directly on the IoT devices where the data originates, e.g. predictive maintenance sensors or supply chain tracking devices.

2.g. Guarantee of exactly-once event processing

This means that for a given set of input data, the system will consistently produce the same results even if jobs are restarted or multiple jobs are run in parallel on the same input data.
This guarantee is essential for repeatability and consistency of results.
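
One way to picture this guarantee is a job that saves its read position together with its intermediate state. The following Python sketch is only an illustration under assumptions, not a description of how Azure Stream Analytics works internally: the `Checkpoint` class, the `run_job` function, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    """Hypothetical checkpoint: next offset to read plus the state so far."""
    next_offset: int = 0
    running_total: float = 0.0


def run_job(events, checkpoint, stop_after=None):
    """Sum the events, starting from the given checkpoint.

    `stop_after` simulates a crash after that many events in this run.
    Offset and state are replaced together, so a restart never counts
    an event twice and never skips one.
    """
    processed = 0
    while checkpoint.next_offset < len(events):
        if stop_after is not None and processed >= stop_after:
            break  # simulated failure
        value = events[checkpoint.next_offset]
        checkpoint = Checkpoint(checkpoint.next_offset + 1,
                                checkpoint.running_total + value)
        processed += 1
    return checkpoint


events = [3.0, 1.5, 4.0, 2.5, 5.0]

uninterrupted = run_job(events, Checkpoint())
after_crash = run_job(events, Checkpoint(), stop_after=2)
resumed = run_job(events, after_crash)
```

In this sketch, the interrupted and resumed run ends in the same checkpoint as the uninterrupted one, which is the kind of repeatability the guarantee is about.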

2.h. Robust pipelines

Azure Stream Analytics can create robust pipelines for streaming data and analyze millions of events with millisecond latency.

2.i. Query language for real-time analytics

Users can create real-time analyses by means of an SQL-like language.
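
A typical query in this language groups events into time windows, for example with a clause such as `GROUP BY TumblingWindow(second, 10)`. To show what such an aggregation computes, here is a minimal Python sketch. It is not the Stream Analytics engine: the function name `tumbling_window_counts`, the event layout, and the handling of events that fall exactly on a window boundary are simplifying assumptions.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Count events per key in fixed, non-overlapping time windows.

    `events` is an iterable of (timestamp_seconds, key) pairs.
    Returns {(window_start, key): count}. In this sketch, an event on a
    boundary is assigned to the window that starts at that boundary.
    """
    counts = Counter()
    for timestamp, key in events:
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical events: (seconds since start, device id)
sample_events = [(1, "a"), (4, "b"), (9, "a"), (10, "a"), (12, "b"), (19, "b")]
window_counts = tumbling_window_counts(sample_events, window_seconds=10)
```

Each event lands in exactly one window, which is what distinguishes a tumbling window from overlapping window types.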

2.j. Real-time dashboards

It’s possible to build real-time dashboards using Power BI.

2.k. Conclusion

All these features make Azure Stream Analytics a versatile platform for building a streaming data pipeline that identifies patterns and relationships in data originating from various input sources such as applications, sensors, clickstreams, social media feeds, etc.

You can then use those patterns to trigger actions and initiate workflows, including

  • Generating alerts.
  • Providing data for reporting tools.
  • Persistently storing converted data for future use.

3. Anomaly detection use cases for Azure Stream Analytics

A dynamic chart with fluctuating data points representing real-time flows and highlighting anomalies as spikes or deviations from the norm.
Image illustrating anomaly detection (Bing Image Creator, March 2024)

Here are some sample use cases for anomaly detection where Azure Stream Analytics could be of use:

3.a. Anomaly detection in sensor data

Detect spikes, dips, and slow positive or negative changes in sensor data to identify unusual patterns in real-time data streams.
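
To make the idea of spikes and dips more concrete, here is a small Python sketch that compares each reading with the recent history of normal readings. Stream Analytics ships its own built-in anomaly detection functions, and this sketch does not reproduce them. It uses a simple rolling z-score instead, and the function name, the `history_size` and `threshold` parameters, and the sample readings are assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def detect_spikes_and_dips(values, history_size=5, threshold=3.0):
    """Flag readings that deviate strongly from recent normal readings.

    A reading is flagged when it lies more than `threshold` standard
    deviations from the mean of the last `history_size` unflagged
    readings. Flagged readings are not added to the history.
    """
    history = deque(maxlen=history_size)
    flags = []
    for value in values:
        is_anomaly = False
        if len(history) == history_size:
            mu = mean(history)
            sigma = pstdev(history)
            if sigma == 0:
                is_anomaly = value != mu
            else:
                is_anomaly = abs(value - mu) / sigma > threshold
        flags.append(is_anomaly)
        if not is_anomaly:
            history.append(value)
    return flags


# Hypothetical temperature readings with one spike and one dip
readings = [20.0, 20.5, 19.5, 20.2, 19.8, 20.1, 35.0, 20.0, 19.9, 5.0]
flags = detect_spikes_and_dips(readings)
anomalous_indices = [i for i, flagged in enumerate(flags) if flagged]
```

Keeping flagged readings out of the history is a design choice: otherwise a single spike would inflate the standard deviation and could hide the next anomaly. A sketch like this covers sudden spikes and dips only. Slow positive or negative changes need a different approach.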

3.b. Geo-spatial analytics for fleet management and driverless vehicles

Monitor vehicle locations, routes, and performance to

  • Optimize fleet operations.
  • Improve safety.
  • Enhance efficiency.
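
A common building block for this kind of monitoring is a geofence check: Is a vehicle still inside its permitted area? The following Python sketch is a simplified illustration, not part of Azure Stream Analytics. It assumes a circular geofence and uses the haversine formula for the distance. The function names, the approximate coordinates, and the radius are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def outside_geofence(position, center, radius_km):
    """Return True if `position` is farther than `radius_km` from `center`."""
    return haversine_km(*position, *center) > radius_km


# Approximate (latitude, longitude) pairs, used only as sample data
depot = (50.9375, 6.9603)         # Cologne
vehicle_near = (50.9400, 6.9700)  # a few hundred meters from the depot
vehicle_far = (50.7374, 7.0982)   # Bonn

alerts = [outside_geofence(v, depot, radius_km=10.0)
          for v in (vehicle_near, vehicle_far)]
```

In a streaming setting, a check like this would run on every incoming position event and raise an alert when a vehicle leaves its area.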

3.c. Remote monitoring and predictive maintenance of high-value assets

Monitor critical equipment, machinery, or infrastructure remotely to

  • Predict maintenance needs based on real-time data.
  • Reduce downtime.

3.d. Clickstream analytics to determine customer behavior

Analyze user interactions on websites or mobile apps to

  • Understand user preferences.
  • Optimize marketing strategies.
  • Personalize experiences.
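
Clickstream analyses often start by grouping each user’s clicks into sessions that end after a period of inactivity. The following Python sketch illustrates that idea under assumptions. It is not a Stream Analytics query, and the function name `sessionize`, the 300-second timeout, and the sample clicks are hypothetical.

```python
def sessionize(clicks, timeout_seconds):
    """Group clicks into per-user sessions.

    `clicks` is a list of (user, timestamp_seconds) pairs sorted by
    timestamp. A new session starts when the gap since the user's
    previous click exceeds `timeout_seconds`.
    Returns {user: [[timestamps of session 1], [session 2], ...]}.
    """
    sessions = {}
    for user, timestamp in clicks:
        user_sessions = sessions.setdefault(user, [])
        if user_sessions and timestamp - user_sessions[-1][-1] <= timeout_seconds:
            user_sessions[-1].append(timestamp)
        else:
            user_sessions.append([timestamp])
    return sessions


# Hypothetical clicks: (user id, seconds since start)
clicks = [("u1", 0), ("u2", 5), ("u1", 20), ("u1", 400), ("u2", 700)]
sessions = sessionize(clicks, timeout_seconds=300)
```

Metrics such as session length or clicks per session can then be derived from these groups.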

3.e. Real-time telemetry streams and logs from applications and IoT devices

Process and analyze data from various sources such as applications, IoT devices, etc. to use the insights for operational improvements, troubleshooting, or reporting.

In the second part of this mini-series, we analyze fraudulent call data with Azure Stream Analytics and visualize results using a Power BI dashboard.

As always, thanks for reading and, hopefully, see you again in the next post on Medium.

Author for WAITS Software und Prozessberatungsgesellschaft mbH, Cologne, Germany: Peter Bormann — March 2024.

WAITS Software- und Prozessberatungsgesellsch. mbH

www.waits-gmbh.de // Authors are different associates of the company: Consultants, Developers and Managers. Posting languages are German [DE] and English.