What happens to your data when you make a call? 🤔

Have you ever wondered what happens to your data after making a call? What does your carrier do with that data? How do they know what type of phone plan to offer you? Or are these offers just random suggestions?

Mehdi Lebdi
SFU Professional Computer Science
8 min read · Mar 12, 2019


Content Contributors: Mirza Tauqeer Baig, Mehdi Nikkhah, Junaid Qazi

Telecommunications is a fast-paced industry that has evolved drastically over the decades, from the First Generation (1G) to the Fourth Generation (4G). We are about to witness the global launch of the Fifth Generation (5G) in the coming decade. 5G will need to be flexible enough to efficiently handle a wide range of devices, from very simple ones that send only small, infrequent bursts of data to advanced ones that send large amounts of data quickly. 5G is significant because it addresses higher capacity, higher data rates, lower end-to-end latency, massive device connectivity, and reduced cost. This raises the question: are current business support solutions (BSS) capable of handling such a massive number of devices and applications? Here we discuss online charging systems, which generate customer bills based on usage over the network. Our suggested approach is to integrate the layers involved in charging into the big data ecosystem.

Whenever you make a call, a record containing information about that call is generated. This record is called a Call Detail Record (CDR). So, what is a CDR?

A call detail record (CDR) holds information regarding calls made over a phone service. It documents how a particular phone number and/or user is behaving. A CDR report is produced to communicate when, where, and how calls are made, mainly for reporting and billing purposes. CDR data commonly contains:

  • Unique sequence number: The sequence number identifying the record
  • Calling party: The phone number of the caller
  • Called party: The phone number of the receiver
  • Billed number: The billing phone number that is charged for the call
  • Call duration: The duration of the call in minutes
  • Start time: The starting time of the call (date and time)
  • Call type: The type of call that was made (VoIP (Voice over Internet Protocol), voice, or raw data)
  • Call disposition: The result of the call (was the call connected?)

CDRs serve two main purposes:

  1. The information reported in call detail records is used to monitor phone activity in order to compute your bill at the end of each call.
  2. The information reported in call detail records is used to gain deeper insights into a company’s calling activity.
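As a rough illustration, the fields listed above could be modelled as a small record type. This is only a sketch: the class and attribute names are hypothetical and do not come from the project's repository, and the example values are borrowed from the sample record shown later in this post.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class CallType(Enum):
    """The three call types mentioned in the field list."""
    VOIP = "VOIP"
    VOICE = "VOICE"
    DATA = "DATA"


@dataclass(frozen=True)
class CallDetailRecord:
    """Hypothetical container for the commonly reported CDR fields."""
    sequence_number: int      # unique sequence number of the record
    calling_party: str        # phone number of the caller
    called_party: str         # phone number of the receiver
    billed_number: str        # number that is charged for the call
    duration_minutes: float   # call duration in minutes
    start_time: datetime      # date and time the call started
    call_type: CallType       # VoIP, voice, or raw data
    connected: bool           # call disposition: was the call connected?


# Example values are illustrative only.
example = CallDetailRecord(
    sequence_number=1,
    calling_party="4294283714",
    called_party="8967939604",
    billed_number="4294283714",
    duration_minutes=28.0,
    start_time=datetime(2018, 1, 15, 11, 8),
    call_type=CallType.DATA,
    connected=True,
)
```

Phone numbers are kept as strings here so that leading zeros or prefixes are not lost.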

How does your carrier compute your bill? And what is the engine behind billing calls?

CDRs are generated whenever a call is made from a device (e.g. a mobile phone). A call can involve different kinds of events, such as VoIP (Voice over Internet Protocol), voice, or raw data. Once a call is made, its CDR is passed to a mediation layer. The mediation layer is responsible for converting CDRs into a format that the rating/billing engine can understand. The rating/billing engine consists of a rating layer, responsible for rating calls and applying various offers, and a billing layer, responsible for creating the final bill. Through the data processing layer, we can generate powerful insights that marketers can only get from phone calls. For example, which provinces or regions drive the most calls? Who are the top calling customers for each region? Who are your top customers by sales? And what type of offers can you extend to them?

Problem Definition:

Real-time charging is one of the components of business support solutions (BSS) in the telecom industry. The goal of this post is to create a real-time charging system using data gathered from call detail records (CDRs). Using a big data ecosystem, the following solution is capable of charging events coming from the network. The model considers only the mandatory fields relevant for billing. The GitHub repository for the real-time charging system is linked at the end of this post.

Methodology:

The architecture proposed for this solution is a microservices architecture where each component can be packaged and deployed separately. There are four main components to this solution:

  1. Simulator: This layer is responsible for generating CDRs at various thresholds, which are configured in an external file (config.ini)
  2. Mediation Engine: This layer is responsible for data ingestion; CDRs are converted into a format that other layers can understand
  3. Rating/Billing Engine: This layer is responsible for applying offers and rating a call. The rated call is then billed according to the offer and usage
  4. Visualization: This layer is responsible for data analysis; top-N analyses are performed in Tableau

The above methodology is illustrated in the diagram below:

Real-time Charging System Solution Diagram

1. Simulator:

  • Loads a list of customers and a list of cell tower locations from two CSV files
  • Randomly pairs customers with cell tower locations and assigns each call a start and end date-time within a given interval
  • Pushes each generated message to a Kafka server at a rate within a configured range; Kafka maintains feeds of messages in topics
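The simulator steps above can be sketched roughly as follows. This is not the code from the repository: the function names, the in-memory stand-ins for the two CSV files, and the `send` callback (which stands in for whatever pushes a message to a Kafka topic) are all assumptions made for illustration. The meaning of the seventh field is not stated in this post, so it is emitted as a constant `0`, as in the sample record.

```python
import random
from datetime import datetime, timedelta

# Hypothetical stand-ins for the two CSV files; values come from the sample record.
CUSTOMERS = [("4294283714", "8967939604")]                   # (caller, receiver)
TOWERS = ["43.918900|-79.816100", "44.763900|-79.992800"]    # latitude|longitude
CALL_TYPES = ("VOICE", "VOIP", "DATA")


def fmt(ts):
    """Format a datetime like the timestamps in the sample record."""
    return ts.strftime("%Y-%m-%dT%H:%M:%S.000Z")


def make_cdr(rng, window_start, window_end):
    """Build one comma-separated CDR whose call lies inside the given window."""
    caller, receiver = rng.choice(CUSTOMERS)
    origin = rng.choice(TOWERS)
    destination = rng.choice(TOWERS)
    span = int((window_end - window_start).total_seconds())
    offset = rng.randrange(span)                       # seconds into the window
    start = window_start + timedelta(seconds=offset)
    end = start + timedelta(seconds=rng.randrange(1, span - offset + 1))
    # The seventh field is copied from the sample; its meaning is an open assumption.
    return ", ".join([fmt(start), fmt(end), origin, destination,
                      caller, receiver, "0", rng.choice(CALL_TYPES)])


def simulate(count, send, seed=0):
    """Generate `count` CDRs and hand each one to `send` (e.g. a Kafka publish call)."""
    rng = random.Random(seed)
    window_start = datetime(2018, 1, 15)
    window_end = window_start + timedelta(days=1)
    for _ in range(count):
        send(make_cdr(rng, window_start, window_end))


if __name__ == "__main__":
    simulate(3, print)
```

Passing the publishing function in as `send` keeps the generator independent of any particular Kafka client library, so it can be tried out with `print` or a list's `append`.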

Raw CDR generated by simulator and pushed in Kafka Topic:

2018-01-15T11:08:00.000Z, 2018-01-15T11:36:00.000Z, 43.918900|-79.816100, 44.763900|-79.992800, 4294283714, 8967939604, 0, DATA

Structure of the sample data at Simulator Layer:

The customer list has two columns for caller and receiver:

Sample customer list with caller and receiver numbers

The cell tower locations have two columns for ID and latitude/longitude values:

Sample cell tower locations with latitude | longitude

The message produced follows the given order:

Data format and structure of the message produced

Actual sample generated from Kafka Server after Simulation Layer is run:

Sample message generated from the Kafka server

2. Mediation Engine:

  • Consumes the Kafka stream and maps the data into a structure
  • Filters out calls that were dropped, to showcase the capabilities of the system
  • Creates a dataframe that is passed to the Rating module

Record after CDR is mapped into dataframe by mediation module:

2018-01-15T11:08:00.000Z, 2018-01-15T11:36:00.000Z, 43.918900|-79.816100, 44.763900|-79.992800, 4294283714, 8967939604, 0, DATA

  • The CDR layout depends on the router, so the CDRs are flattened using a Cisco guide as reference
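A minimal sketch of the mediation step, in plain Python rather than on a Kafka stream, might look like the following. The field names are hypothetical, and so is the dropped-call rule: this post does not say which field marks a dropped call, so the sketch simply assumes that a value of `1` in the seventh field means "dropped".

```python
from datetime import datetime

# Hypothetical names for the eight comma-separated fields of a raw CDR.
FIELDS = ["start_time", "end_time", "origin", "destination",
          "caller", "receiver", "flag", "call_type"]
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_cdr(line):
    """Map one raw CDR line onto a structured record (a dict)."""
    values = [value.strip() for value in line.split(",")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = dict(zip(FIELDS, values))
    for key in ("start_time", "end_time"):
        record[key] = datetime.strptime(record[key], TIMESTAMP_FORMAT)
    for key in ("origin", "destination"):
        latitude, longitude = record[key].split("|")
        record[key] = (float(latitude), float(longitude))
    record["flag"] = int(record["flag"])
    return record


def mediate(lines, dropped_flag=1):
    """Parse raw lines and filter out dropped calls.

    ASSUMPTION: flag == dropped_flag marks a dropped call; the post does not
    document this field, so adjust the rule to the real data.
    """
    return [record for record in map(parse_cdr, lines)
            if record["flag"] != dropped_flag]


raw = ("2018-01-15T11:08:00.000Z, 2018-01-15T11:36:00.000Z, "
       "43.918900|-79.816100, 44.763900|-79.992800, "
       "4294283714, 8967939604, 0, DATA")
print(mediate([raw])[0]["call_type"])
```

In the actual system the same mapping would be applied to each micro-batch of the stream, producing the dataframe handed to the rating module.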

3. Rating/Billing Engine:

  • Receives streaming dataframes from Mediation module
  • Loads static dataframes from customers and offers collection in MongoDB
  • Joins the two dataframes and applies business logic for calls that are rated and billed
  • The small static dataframes are broadcast by the framework itself

Final dataframe after applying business logic in rating module:

C4294283714, 2018-01-15T11:08:00.000Z, 2018-01-15T11:36:00.000Z, 43.918900|-79.816100, 44.763900|-79.992800, 4294283714, 8967939604, 0, DATA, 1680, 0.04, 67.2
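The last three values in this record are consistent with a duration in seconds (11:08 to 11:36 is 28 minutes, or 1680 seconds), a per-second rate of 0.04, and a charge of 1680 × 0.04 = 67.2. That reading is an inference from the sample, not something the post states. Under that assumption, the rating step can be sketched as below; the `OFFERS` lookup keyed by call type and the output field names are hypothetical stand-ins for the customers and offers collections.

```python
from datetime import datetime
from decimal import Decimal

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"

# Hypothetical offers table: rate per second, keyed by call type.
# The 0.04 figure is taken from the sample record above.
OFFERS = {"DATA": Decimal("0.04")}


def rate_call(record):
    """Return a copy of the record with duration, rate, and amount added."""
    start = datetime.strptime(record["start_time"], TIMESTAMP_FORMAT)
    end = datetime.strptime(record["end_time"], TIMESTAMP_FORMAT)
    seconds = int((end - start).total_seconds())
    rate = OFFERS[record["call_type"]]      # the "join" against the offers table
    return {**record, "duration_s": seconds, "rate": rate, "amount": seconds * rate}


rated = rate_call({
    "start_time": "2018-01-15T11:08:00.000Z",
    "end_time": "2018-01-15T11:36:00.000Z",
    "call_type": "DATA",
})
print(rated["duration_s"], rated["rate"], rated["amount"])  # 1680 0.04 67.20
```

`Decimal` is used instead of floating point so that monetary amounts do not pick up binary rounding errors.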

Spark-Listener/Charging System listener:

An important aspect of this architecture is the Spark-Listener. Properties are loaded from config.ini, which resides in the directory given by the environment variable $APP_HOME. The configuration is then passed to a controller, which:

  • Starts streaming process
  • Detects type of device
  • Maps raw events to structured data frames
  • Invokes mediation process
  • Invokes rating process
  • Checks data in HDFS
  • Pushes data to the configured database, which for this architecture is the MongoDB cloud server (mLab)
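Loading the properties could look roughly like this, using Python's standard `configparser`. It is a sketch under assumptions: the function name is hypothetical, and the `[kafka]` section and `topic` key in the usage example are invented for illustration, since the actual contents of config.ini are not shown in this post.

```python
import configparser
import os
from pathlib import Path


def load_config(env=os.environ):
    """Read config.ini from the directory named by the APP_HOME variable."""
    app_home = env.get("APP_HOME")
    if not app_home:
        raise RuntimeError("APP_HOME is not set")
    path = Path(app_home) / "config.ini"
    parser = configparser.ConfigParser()
    # read() returns the list of files it managed to parse; empty means not found.
    if not parser.read(path):
        raise FileNotFoundError(str(path))
    return parser


if __name__ == "__main__":
    config = load_config()
    # Hypothetical section and key, shown only to illustrate access.
    print(config.get("kafka", "topic", fallback="<not configured>"))
```

Accepting the environment as a parameter makes the loader easy to exercise without touching the real process environment.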

This solution also includes fault tolerance using an HDFS (Hadoop Distributed File System) checkpoint. Error mitigation is important in real-time systems. To guard against a failure or unintentional shutdown while a batch of the stream is being written to mLab, a checkpoint is created to make the system fault tolerant. The checkpoint feature saves all progress information to the checkpoint location.
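The idea behind checkpointing can be illustrated with a deliberately simplified, plain-Python sketch. This is not how Spark implements checkpoints and it does not use HDFS; the function names and the JSON file layout are assumptions. It only shows the principle: record progress after each batch is written, and on restart skip whatever was already committed.

```python
import json
import os
from pathlib import Path


def read_checkpoint(path):
    """Return the id of the last committed batch, or -1 if none exists."""
    try:
        return json.loads(Path(path).read_text())["last_batch"]
    except FileNotFoundError:
        return -1


def write_checkpoint(path, batch_id):
    """Persist progress by writing a temp file and renaming it over the old one."""
    temp_path = f"{path}.tmp"
    Path(temp_path).write_text(json.dumps({"last_batch": batch_id}))
    os.replace(temp_path, path)


def run(batches, sink, checkpoint_path):
    """Write each batch to `sink`, resuming after the last committed batch.

    If a crash happens after sink() succeeds but before the checkpoint is
    written, that batch is replayed on restart, so the sink should tolerate
    duplicates (at-least-once delivery).
    """
    last_committed = read_checkpoint(checkpoint_path)
    for batch_id, batch in enumerate(batches):
        if batch_id <= last_committed:
            continue  # already written before the failure
        sink(batch)
        write_checkpoint(checkpoint_path, batch_id)
```

Here `sink` stands in for the write to the database; rerunning `run` with the same batches after a failure continues from the first uncommitted batch.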


4. Visualization:

  • Loads data pulled from mLab (Cloud database service that hosts MongoDB databases)
  • Imports data to Tableau

Sample data loaded into Tableau

The final data includes the following features to be further explored in Tableau:

Sample CSV data representing features needed for analysis

Diagrams are generated for insights in Tableau:

Real-time analytics of events was done in Tableau, generating graphs that provide insights for telecom companies. For instance, the graphs below show that most customers are located in Ontario, with British Columbia in second place, and they also give insights into the top-paying customers and their locations.

Canada-wide map of customer distribution side by side with Top-N customers by sales
Top-N customers by sales for the province of Ontario
Call traced from original Cell Tower in Ontario to destination Cell Tower in Alberta
Total bill per Province side by side with Top Calling customers
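The top-N analyses in this post are performed in Tableau. Purely as an illustration of the underlying aggregation, the same "top-N customers by sales per province" view can be computed in a few lines of Python. The function name and the input layout are assumptions, and the rows in the usage example are made-up values, not data from the project.

```python
from collections import defaultdict
from decimal import Decimal


def top_n_by_sales(rated_calls, n):
    """Return the n highest-billed customers for each province.

    `rated_calls` is an iterable of (customer_id, province, amount) tuples.
    """
    totals = defaultdict(Decimal)            # (province, customer) -> total billed
    for customer, province, amount in rated_calls:
        totals[(province, customer)] += Decimal(amount)

    by_province = defaultdict(list)
    for (province, customer), total in totals.items():
        by_province[province].append((customer, total))

    return {
        province: sorted(rows, key=lambda row: row[1], reverse=True)[:n]
        for province, rows in by_province.items()
    }


# Made-up rows for illustration only.
calls = [
    ("C1", "Ontario", "67.20"),
    ("C2", "Ontario", "10.00"),
    ("C1", "Ontario", "5.00"),
    ("C3", "British Columbia", "3.50"),
]
print(top_n_by_sales(calls, 1))
```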

Complete Tableau dashboard resulting from the Visualization layer:

Tableau dashboard resulting from sample call detail records

→ The above dashboard is accessible on Tableau Public

Conclusion:

The information reported in call detail records is used to monitor phone activity in order to compute your bill at the end of each call. The charging system presented is an end-to-end solution for the telecom industry that can also accommodate new devices with different rate types, for example autonomous cars, where billing is based on startLocation/endLocation as well as startTime/endTime. The proposed charging solution is capable of handling a large number of devices and is easily scalable. For the implementation, several technologies had to be combined to work in sync: Kafka, Spark, HDFS, MongoDB, and Tableau. CDR reporting provides strong insights and supports businesses in creating powerful marketing strategies. These insights include allocating resources to areas where calls are concentrated, as well as improving customer offers.

GitHub repository for the real-time charging system: https://github.com/MehdiLebdi/Charging_system

Mehdi Lebdi
SFU Professional Computer Science

Aspiring Data Scientist / Msc in Computer Science, Big Data Specialization — Simon Fraser University