Using Data Science to detect unhappy customers before they leave.

Introduction

Nowadays, many companies have switched their business model from a one-time fee to a monthly or annual subscription. Customers can cancel their subscription at any time or, in some cases, downgrade to a free tier. Companies, on the other hand, want to keep their customers at the paid level.

Customers who are about to leave usually show signs of it beforehand. These signs differ from one service to another; for a telephone company, for example, departing customers usually call support more frequently, submit complaints, or rarely use the service. These are indicators that the company will lose the customer soon! …
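The indicators listed above can be turned into a simple rule-based check. The following is only a minimal sketch: the `CustomerActivity` record, the `churn_signals` helper, and every threshold are hypothetical choices made for illustration, not values taken from the article or from any real telecom dataset.

```python
from dataclasses import dataclass


@dataclass
class CustomerActivity:
    """Hypothetical per-customer summary for one billing period."""
    customer_id: str
    support_calls: int
    complaints: int
    usage_minutes: float


def churn_signals(customer, max_calls=3, max_complaints=0, min_usage=60.0):
    """Return the warning signs a customer shows (thresholds are assumptions)."""
    signals = []
    if customer.support_calls > max_calls:
        signals.append("frequent support calls")
    if customer.complaints > max_complaints:
        signals.append("submitted complaints")
    if customer.usage_minutes < min_usage:
        signals.append("rarely uses the service")
    return signals


customers = [
    CustomerActivity("A", support_calls=1, complaints=0, usage_minutes=420.0),
    CustomerActivity("B", support_calls=6, complaints=2, usage_minutes=15.0),
]

# Keep only the customers that show at least one warning sign.
at_risk = {c.customer_id: churn_signals(c) for c in customers if churn_signals(c)}
print(at_risk)
```

In practice, such hand-written rules would more likely serve as input features for a trained model than as the final decision.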


Water is the most precious resource on earth; all living organisms depend on it to live, and it covers about two-thirds of our planet's surface. Despite its importance, there is a shortage of fresh water in most of the world's cities. Hence, conserving water is a strategic choice for almost everyone.

To put a water conservation plan in place, we must know how much water each sector consumes (industry, agriculture, domestic, …). In this study, we analyzed a dataset for a sample city, found on the Kaggle website. The city is Sonora, Mexico, a medium-sized city.

In this study, we would like to answer the following…


Time series are an important form of indexed data, found in stock data, climate datasets, and many other time-dependent data. Because they are recorded over time, time series are prone to missing points caused by problems in reading or recording the data.

To apply machine learning models effectively, the time series has to be continuous, as most ML models are not designed to handle missing values. Hence, rows with missing data should be either dropped or filled with appropriate values.

In time-independent (non-time-series) data, a common practice is to fill the gaps with the mean or median value of the field. However, this does not work for time series. To see why, consider a temperature dataset: the temperature in February is very far from the temperature in July. The same applies to a sales dataset that has some seasons with high sales and others with low or regular sales. …
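The temperature argument can be checked with a small sketch. The monthly values below are invented for illustration, and linear interpolation is used only as one plausible time-aware alternative to the mean; it is not necessarily the method the article goes on to recommend. The helper names `fill_with_mean` and `fill_linear` are hypothetical.

```python
from statistics import mean

# Hypothetical monthly mean temperatures (°C), January to December.
# None marks a missing reading (here: July).
temps = [5.0, 7.0, 11.0, 16.0, 21.0, 26.0, None, 27.0, 23.0, 17.0, 11.0, 6.0]


def fill_with_mean(series):
    """Replace every gap with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    average = mean(observed)
    return [average if v is None else v for v in series]


def fill_linear(series):
    """Fill interior gaps by linear interpolation between observed neighbours.

    Gaps before the first or after the last observation are left as None.
    """
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            weight = (i - left) / span
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


mean_filled = fill_with_mean(temps)
linear_filled = fill_linear(temps)

print(round(mean_filled[6], 2))  # 15.45: a spring-like value placed in mid-summer
print(linear_filled[6])          # 26.5: consistent with June (26.0) and August (27.0)
```

The mean-filled value sits more than ten degrees below its neighbours, which is exactly the distortion described above; the interpolated value follows the seasonal trend.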

Dr Mohammad El-Nesr
