GCP IIoT: Visualization and Analytics in Looker for Oil & Gas Systems

Seva Miheev
Badal-io
Published in
11 min readSep 21, 2021
Charting data in a dashboard built in this demo.

Introduction

This is Part 2 of a series of posts where we demonstrate how to build an Industrial IoT (IIoT) data management and analytics platform on Google Cloud (GCP). In Part 1, we created a framework for first mile data collection, storage, real time analytics and alerting using FogLAMP, Dataflow and BigQuery. In this post we will show a set of ready to use Looker dashboards for visualizing and analyzing real time IIoT data with a specific focus on Oil & Gas use cases.

This post presents a foundation for a next generation IIoT analytics engine that leverages Looker’s Modern BI & Analytics and GCP’s scale, compute power, and advanced ML & Analytics tooling to provide a much cheaper, flexible and powerful solution than existing products.

Currently, most industrial companies have a decentralized data gathering process, where data is trapped in “data silos” and is gathered in an error prone manual process. Analysts are then tasked with manually creating reports and meticulously searching through the data to find issues in the system. Since reporting is usually done on a monthly basis, issues and imbalances are usually found late, and cannot be fixed in time. By moving to a real time, automated system companies can improve the accuracy of their data, detect issues as they happen, and eventually use advanced analytics and ML techniques to predict and proactively fix issues before they happen.

We will be focusing on the use cases of Oil & Gas pipeline operators in this post, and will be highlighting the following features through these use cases:

  • Dashboards for exploring asset inventory and hierarchy visualizations that allow users to click through a linked set of dashboards
  • Analytical and operational dashboards which stream live measurement data
  • Custom real-time calculations of shrinkage, over/short and measurement inaccuracy for facilities/assets based on individual field meter points
  • Framework for defining and detecting real time critical system events (“Event Frames”) along with drill down capabilities to find related events, do a root-cause analysis, and generate alerts in a dashboard that explores these anomalies
  • Self-serve report creation with automated calculations
  • Custom alerting

In our next blog posts we will show how to leverage a BigQuery ML ARIMA model to do anomaly detection and how to use GCP AI Notebooks to train custom models based on events generated by the frameworks.

What is Looker?

Before we dive into the features of our dashboards, we would like to give a brief overview of what Looker is. Looker is a cloud native business intelligence and analytics platform that simplifies the process of visualizing and analyzing data. It is browser-based and uses a modelling language (LookML) to create highly customizable calculations, visualizations and dashboards based on centrally governed logic. Looker natively connects to databases, allowing it to efficiently query real-time data directly from the source.

System Overview

We will start off with the System Overview dashboard, in which we can see a high level view of the system of field meters and facilities. This is a good place to start because it gives us a chance to elaborate on our data model.

Our data model for a system of facilities consists of assets (plants, pumps, pipelines, etc.) that have field meters at various points. The points where a resource — in this case, gas — is coming into the asset are designated as inlets, and the points where a resource is moving out of the asset are designated as outlets. There can be many or no inlets/outlets in an asset, depending on its function. IoT devices that take measurements are then linked to each field meter point.

At a glance, the dashboard shows the locations of the assets, and provides a general overview of their state and production. The bottom half of the dashboard summarizes the events that have been detected, and shows the devices and assets that are most problematic. We will dive deeper into events later on in this post.

Interactive Asset Exploration

The System Overview dashboard provides a starting point for exploring the parts that make up the system. The dashboards are linked, providing an interactive explorative experience. Users can immediately drill down into an asset or field meter, and see a full set of visualizations for its performance.

Say we want to check up on the facility that is producing the highest flow rate. All we need to do is click on the name to see a more detailed breakdown of the measurement data for that facility.

This brings us to the Asset Overview dashboard. Here, various measurement data for this facility is streamed from BigQuery. If we want to further explore the data, there is an option to change the time granularity and the timeframe that we are looking at. For instance, we can see that there is a divergence in the inlet and outlet flow when looking at the measurements chart at the hourly time granularity. If we want to zoom in to the discrepancy between the inlet and outlet in this facility, we can look at the date range of when it happened with each point representing a minute of data.

Alternatively, if we want to see the bigger picture for this facility, we can choose to look at the aggregate daily data for the month.

If we want to drill down into the data for a specific field meter point from here, we can do so by clicking on any point in the inlet or outlet graph.

Live Data Streaming

Each dashboard in this series is enabled for live data streaming. This is a feature that comes natively with Looker and gives us the ability to do real time analytics. In the backend, Looker refreshes its queries at the specified interval — as often as every second — to pull the most recent data from BigQuery.

This computing power is extended to calculations that are based on incoming data — such as shrinkage and over/short calculations. After receiving measurements from the devices, calculations are immediately run in BigQuery; the speed at which this occurs is significant, as Looker updates the charts for calculations as quickly as it updates the charts for raw measurements.

Real-Time Calculations

The set of calculations that we focused on was for the ‘meter proving’ (or ‘meter calibration’) process, which is used to determine the accuracy of flow readings done by field meters. If it is found that the measurement inaccuracy of a facility is high, a technician is sent to calibrate (or ‘prove’) the field meter(s) causing the issue.

To derive the measurement inaccuracy of a facility, the following calculations need to be performed:

First, the shrinkage — the physical loss of volume due to differences in molecular size — is found using the following formula:

Then, the over/short — the gross volume gain or loss for a facility — is found using the following formula:

Finally, the measurement inaccuracy (M/I) — the difference between the over/short and shrinkage — is calculated using the following formula:

The live data streaming feature gives us the ability to use the Field Meter and Asset dashboards for operational purposes such as monitoring asset performance. For instance, instead of waiting until month end to prepare the analysis to calculate shrinkage, the over/short and the measurement inaccuracy, all of these calculations for system balancing are done in real time and the field meters causing these issues are flagged. In other words, the response time to fix issues such as field meter inaccuracies becomes relatively instantaneous, resulting in a highly efficient process for meter proving.

Event Detection & Analysis

Events are important, unusual occurrences that happen in the system. Some of the events that are detected in our framework include:

  • low flow rate measurements
  • temperature changes or variances
  • device errors (which pop up when measurement data is missing)
  • device version changes

When these events are detected, we create a unique ID for each event along with properties such as start time, end time, severity, and various other attributes. This aggregate summary is referred to as an “Event Frame”.

Going back to the events section of the System Overview dashboard, we have a timeline of events for our devices, along with the severity of each event. Clicking on an event that interests us will give us the option to explore it in the Event Overview dashboard.

In the Event Overview dashboard, the type of event that we’re looking at is highlighted alongside all other events of the same type that have recently occurred. The measurements from other devices are displayed to give context to the event — showing how regularly it occurs and the relative duration of the event in comparison to what is the norm. Additionally, the measurement segments that have been identified as the selected event type are graphed to show the trends in magnitude, movement and duration.

The information generated by this dashboard provides an analytical snapshot of an event that can be used as actionable intelligence by support staff. The ability to compare attributes across events to generate insights into their causes can lead to significant improvements in performance and efficiency, and facilitates the transition from lagging indicators to leading indicators.

For example, if we’re interested in the first Low Flowrate event that we saw in the System Overview dashboard, we can click on the option to examine the event. Once in the Event Overview dashboard, we can see if there were any similar events that occurred around the same time; there was a Low Flowrate event that occurred in another device shortly after — this might be something to investigate. We can also see that these two events are not frequently occurring, and that aside from these anomalies, the flow rate across devices is steady at higher levels. Finally, we can compare the duration and magnitude of similar events to see that more often than not Low Flowrate events are short lived; this type of pattern is something that we can keep in mind when investigating.

Custom Alerting

Along with monitoring events, we can also set alerts based on customized criteria so that we can be notified when they occur in real time. Looker has built in functionality to send alert notification emails, and can be further customized to send other types of alerts.

Circling back to the events section of the System Overview dashboard, we just need to click on the “bell” icon in the tile that we want to monitor to set an alert. For instance, if we want to be notified when a high severity event occurs, we can specify that in the alerting options, and create the alert.

Let’s say that we’re also interested in Low Flowrate events. We would follow the same process here. First, we would click on the bell icon in the Events by Type visualization tile. Then, we would customize the alert by setting the condition to monitor for Low Flowrate events, and set the frequency to check for events every 15 minutes throughout the day.

Automated Reporting

The final use case that we would like to highlight is automated reporting. This report automatically calculates the production of our system for the selected period — along with some performance metrics — and compares it to the preceding period. The reports are highly customizable, and any specific date range can be selected for the report. In addition to this, the comparison period can be selected as either the preceding period, or as the same date range in the previous week, month or year. Automated reporting significantly cuts down on the time spent by analysts to put together data and run the calculations for a report, while adding flexibility to the analysis that can be generated.

Mobile Access

Looker provides a mobile app to view these dashboards, which is handy for technicians that may need to access data while in the field. The visuals are rendered to fit a phone or tablet screen.

Demo Implementation

In order for the provided turn-key code and dashboards to work, there are a few prerequisites:

  • A Looker instance
  • A BigQuery connection — our LookML references BigQuery syntax
  • The database schema needs to be organized in the same way as it was generated by the demo in Part 1

Our demo system of dashboards can be found on GitHub. It is turn-key and super easy to set up! After cloning the repository, the files can be dragged and dropped into any Looker instance.

There are only a few changes that need to be made:

  • The first is changing the name of the database connection in the facilities-demo.model file:
  • The second is changing the links that reference the badal.io Looker instance to the name of your instance (the reference to the board may also need to be adjusted). These can be easily found and changed using the Find & Replace in Project tool in the Develop sidebar:

And that’s it! That’s all it takes to set up a system of operational and analytical dashboards for industrial facilities!

Acknowledgements

We would like to thank the team at Looker and Google — particularly Rachel Kamienski, Tom Krywitsky and Cindy Spence — for providing support and generously lending their expertise.

--

--