Applying big data frameworks to Revenue Assurance

Tim Cross
Published in Energetiq
1 min read · Nov 12, 2018

Recently we’ve been looking into big data solutions to process and analyse data for revenue assurance.
This involves extracting actual data from multiple systems (NBV and Quantify) and comparing it to find discrepancies.
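As a rough illustration of the comparison step, here is a minimal sketch in Python. The record shape is an assumption made for the example: each system's extract is reduced to a dictionary keyed by a hypothetical (site_id, month) pair with a single numeric value. The real NBV and Quantify extracts will have their own fields and matching rules.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical key: (site_id, month). The real systems may key data differently.
Key = Tuple[str, str]
Finding = Tuple[Key, Optional[float], Optional[float]]


def find_discrepancies(
    system_a: Dict[Key, float],
    system_b: Dict[Key, float],
    tolerance: float = 0.01,
) -> List[Finding]:
    """Return keys whose values differ by more than `tolerance`,
    or that are present in only one of the two systems."""
    findings: List[Finding] = []
    for key in sorted(set(system_a) | set(system_b)):
        a = system_a.get(key)
        b = system_b.get(key)
        if a is None or b is None or abs(a - b) > tolerance:
            findings.append((key, a, b))
    return findings


# Made-up sample values, for illustration only.
nbv = {("S1", "2018-01"): 100.0, ("S1", "2018-02"): 120.0, ("S2", "2018-01"): 50.0}
quantify = {("S1", "2018-01"): 100.0, ("S1", "2018-02"): 125.0}

for finding in find_discrepancies(nbv, quantify):
    print(finding)
```

A record that exists in only one system is reported with None on the missing side, so missing data and mismatched values both surface as findings.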

On the AWS cloud, Amazon EMR (Elastic MapReduce) is a managed cluster platform that runs big data frameworks such as Apache Hadoop and Apache Spark.

The process is to load the data from the Quantify inspection period (two years' worth by default) into the EMR cluster.
Next, a data transformation groups the relevant data for each site together and stripes different sites across multiple workers for parallel processing.
Finally, with the site data in memory, the Quantify inspector issue detection business logic runs over the data to generate the adverse findings.
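The three steps above can be sketched in plain Python. This is not the Spark job itself: a thread pool stands in for the cluster workers, and the record fields (site_id, month, billed, metered) and the detection rule are hypothetical placeholders for the real inspector logic.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_site(records):
    """Step 2a: collect all records for each site together."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["site_id"]].append(rec)
    return dict(grouped)


def stripe_sites(site_ids, num_workers):
    """Step 2b: deal sites out round-robin, one list of sites per worker."""
    ordered = sorted(site_ids)
    return [ordered[i::num_workers] for i in range(num_workers)]


def detect_issues(site_id, site_records):
    """Step 3: hypothetical rule that flags months where billed != metered."""
    return [
        (site_id, rec["month"])
        for rec in site_records
        if rec["billed"] != rec["metered"]
    ]


def run_pipeline(records, num_workers=2):
    grouped = group_by_site(records)
    partitions = stripe_sites(grouped.keys(), num_workers)

    def run_partition(partition):
        findings = []
        for site_id in partition:
            findings.extend(detect_issues(site_id, grouped[site_id]))
        return findings

    # Each partition is processed independently, as a worker would do.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = pool.map(run_partition, partitions)
    return sorted(f for part in results for f in part)


# Step 1 (loading) is replaced here by a small made-up in-memory sample.
sample = [
    {"site_id": "S1", "month": "2018-01", "billed": 10, "metered": 10},
    {"site_id": "S2", "month": "2018-01", "billed": 7, "metered": 7},
    {"site_id": "S2", "month": "2018-02", "billed": 8, "metered": 9},
    {"site_id": "S3", "month": "2018-01", "billed": 5, "metered": 4},
]
print(run_pipeline(sample))
```

Because every record for a site lands on the same worker, the detection logic never has to look across workers, which is what lets the sites be processed in parallel.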

The following diagram illustrates this process, colour-coded by site/month combination to show the data flow.

Data flow

This architecture can handle a large volume of data efficiently. Compared to the traditional approach, in which the batch engine runs a heavy load of SQL queries against the database, the new architecture can process the data an order of magnitude faster.

The revenue assurance software is available for Quantify 2.1.
