How to dynamically allocate resources to cellular networks using AI

Darshil Modi · Published in Analytics Vidhya · 6 min read · Jul 26, 2021

Wireless communication has become an integral part of our day-to-day lives. Whether it's voice calls or group meetings over Zoom, we rarely get through without encountering glitches or lags. Telecom operators' complaint lines always answer with a busy tone, which only adds to the frustration. Mobile phones work on radio signals and thus require a lot of resources for telecommunication to function smoothly. Let's first understand how resources are currently allocated to cellular base stations.

Current Resource Allocation System

You must have observed that whenever you are near a huge public gathering such as a concert or sporting event, or on special occasions like New Year's Eve or Christmas night, your network connectivity goes down completely and you unwillingly go into ghost mode. This is because resource allocation in cellular networks is handled statically, based on the average number of devices and their utilisation at a particular base station. So whenever there is a drastic surge in network utilisation, it can lead to a network outage, and you lose connectivity. Due to this static allocation, and under constant pressure to maintain high-quality network connectivity and ensure smooth functioning, most base stations are over-provisioned with radio resources.

A typical cellular network architecture

How AI can do the magic

To overcome this, we can leverage the power of artificial intelligence to make the resource allocation process dynamic. Machine learning algorithms, trained on large volumes of historical data, can detect network anomalies and trigger additional resources as and when needed. In simple terms, whenever there is a sudden spike in the number of devices connected to a particular base station, the AI model detects it and the system automatically allocates more resources to that base station, ensuring smooth functioning. Dynamic resource allocation also reduces energy consumption and cost, making the network more efficient and affordable.

Let's dive deeper

Let's take a look at a practical demonstration and understand how this can be achieved. For illustration purposes, I have used a Kaggle competition dataset, which can be found here.

About The Dataset

The dataset was obtained from a real LTE deployment. Over two weeks, different metrics were gathered every 15 minutes from a set of 10 base stations, each having a different number of cells. The dataset is provided as a CSV file, where each row corresponds to a sample obtained from one particular cell at a certain time. Each sample contains the following features:

  • Time: hour of the day (in the format hh:mm) when the sample was generated.
  • CellName: text string used to uniquely identify the cell that generated the current sample. CellName is in the form xαLTE, where x identifies the base station and α the cell within that base station.
  • PRBUsageUL and PRBUsageDL: level of resource utilization in that cell measured as the portion of Physical Radio Blocks (PRB) that were in use (%) in the previous 15 minutes. Uplink (UL) and downlink (DL) are measured separately.
  • meanThrDL and meanThrUL: average carried traffic (in Mbps) during the past 15 minutes. Uplink (UL) and downlink (DL) are measured separately.
  • maxThrDL and maxThrUL: maximum carried traffic (in Mbps) measured in the last 15 minutes. Uplink (UL) and downlink (DL) are measured separately.
  • meanUEDL and meanUEUL: average number of user equipment (UE) devices that were simultaneously active during the last 15 minutes. Uplink (UL) and downlink (DL) are measured separately.
  • maxUEDL and maxUEUL: maximum number of user equipment (UE) devices that were simultaneously active during the last 15 minutes. Uplink (UL) and downlink (DL) are measured separately.
  • maxUE_UL+DL: maximum number of user equipment (UE) devices that were active simultaneously in the last 15 minutes, regardless of UL and DL.
  • Unusual: labels for supervised learning. A value of 0 indicates that the sample corresponds to normal operation; a value of 1 identifies unusual behavior.

We will use "Unusual" as our target column, which we will predict, and all the other columns as features that help us detect anomalies. We also have to perform a few pre-processing steps, such as converting object columns to floats, since ML algorithms only understand numbers.
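A rough pre-processing sketch in Python follows, assuming the training CSV has been saved as train.csv (the actual competition file name may differ) and using the column names from the description above:

```python
import pandas as pd

# Load the dataset (the file name is an assumption; adjust it to
# match the CSV provided by the Kaggle competition).
df = pd.read_csv("train.csv")

# Convert the hh:mm Time string into minutes since midnight,
# so the model receives a numeric feature.
hours_minutes = df["Time"].str.split(":", expand=True).astype(int)
df["Time"] = hours_minutes[0] * 60 + hours_minutes[1]

# CellName is a text identifier; encode it as integer categories.
df["CellName"] = df["CellName"].astype("category").cat.codes

# Coerce any remaining object columns to numbers (unparsable values
# become NaN, which XGBoost can handle natively).
for col in df.columns.drop("Unusual"):
    df[col] = pd.to_numeric(df[col], errors="coerce")

X = df.drop(columns=["Unusual"])
y = df["Unusual"]
```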

After defining the feature and target columns, we will train the model. I have used XGBClassifier, a boosting algorithm. There are many reasons why XGBoost is widely used in machine learning competitions and in production. Its main advantages are listed below, followed by a minimal training sketch:

  • It is designed to handle missing features, which reduces our data-cleaning effort.
  • It supports regularisation and performs parallel processing, which makes it faster and less computationally demanding.
  • It works well with small to medium datasets, as it uses decision trees as base learners.
  • It is a boosting algorithm, so it combines many weak learners into a strong learner. Hence it is referred to as ensemble learning.
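As a sketch, the classifier can be instantiated so that these properties are visible in its parameters. The specific values below are illustrative assumptions, not the configuration from the original experiment:

```python
from xgboost import XGBClassifier

# Illustrative hyperparameters only; the article does not specify
# the exact configuration used.
model = XGBClassifier(
    n_estimators=200,       # number of boosted trees (weak learners)
    max_depth=6,            # depth of each decision-tree base learner
    reg_lambda=1.0,         # L2 regularisation on leaf weights
    n_jobs=-1,              # parallel tree construction on all cores
    eval_metric="logloss",  # explicit metric avoids version warnings
)
# Missing values (NaN) need no imputation: XGBoost learns a default
# branch direction for them during training.
```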

I split the data into training (90%) and testing (10%) sets. This means we used around 33,000 rows to train the model and kept about 3,600 rows for testing. The data is randomly shuffled before splitting so that the model can be evaluated without bias.
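A minimal sketch of the split, training, and evaluation, assuming X, y, and model from the snippets above:

```python
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# 90/10 split; shuffling is on by default, giving an unbiased test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=42
)

model.fit(X_train, y_train)

y_pred = model.predict(X_test)
print(f"Accuracy on test data: {accuracy_score(y_test, y_pred):.3f}")
```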

And BOOM!

Accuracy on test data

We achieved 99.1% accuracy! This means that, based on the given features, roughly 99 times out of 100 our model correctly detected that there was a sudden spike in network usage and that more resources needed to be allocated to keep the cellular network functioning smoothly.

Future Enhancements

With a sample of just 36,000 rows, we achieved this level of accuracy. Things will certainly get more complex when we use every aspect of the data, including different time periods, events, and locations, and making the model generalise well would require various modifications. However, this surely suggests that with proper data and an appropriate machine learning pipeline, we can dynamically allocate resources to cellular networks, making them energy efficient and ensuring robust performance even in times of surge.


AI enthusiast. Trying to catch up with the tide! GitHub: @darshil3011, LinkedIn: https://www.linkedin.com/in/darshil3011. Supporting www.thinkinbytes.in