The ever-increasing trend that is the world of Data Analytics

Anahita Bilimoria
Published in Keubik Technology
May 10, 2019 · 7 min read

Data has been preserved since time immemorial.

Ledgers have turned into Excel sheets, manual entries have turned into logins, and the amount of data people accumulate every day keeps growing rapidly. The techniques for storing data have improved significantly over the years, driven by the need to preserve evidence and document the past, but the intent has always been clear: to help when a future situation demands it. The fear of a bad outcome has pushed us to go further and use data to inform our decisions. This has created vast opportunity not just in technology but in every aspect of human development. The study of data has opened doors to Artificial Intelligence, Machine Learning and Business Analytics. Grouping these domains under one umbrella has become a necessity, because they depend on each other and, above all, on Data Analytics.

Studying patterns and finding a common variable has been behind a number of human achievements, dating back to Alan Turing's work on breaking the Enigma code. Analytics takes this method further and applies it to every business decision a company makes. Studying a spatial image of the Arctic every few months paints a disappointing picture of how badly global warming is affecting the Earth. Simply marking a map highlights every attack ISIS has carried out in Paris over the last five years. There are a number of ways to carry out these studies, some of which are listed below:

  • Regression: A technique used to analyse the relationship between a dependent variable (the outcome) and one or more independent variables that may affect it. There are various types of regression, Linear and Logistic to name a few.
  • Refining: Refining involves cleaning data, removing redundant data, lessening variability of data and trying to turn it into one integrated source.
  • Clustering: A technique for grouping data points so that the points within a cluster are more similar to each other than to points outside it. It is used to further apply certain attributes to variables having similar characteristics.
  • Predictive Analysis: Predictive analysis uses past patterns and behaviour to anticipate future outcomes and inform decisions.
  • Sentiment Analysis: Sentiment analysis can be achieved in a number of ways, generally through keyword analysis (occurrences of particular phrases, words or forms of sentiment). It is used to make decisions about the sustainability, growth and reception of a product or service.
  • Visualization: Visual representation of the data/ results over a period of time is called visualization. It is used to determine the peak moments of the analysed data and structure significant observations.
  • Segmentation: It involves dividing a database into groups of observations that are similar in specific ways. However, segmentation works best with characteristics that are almost binary. It is a tightly-segregating technique, whereas clustering is a loosely-segregating technique.
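
To make the first technique in the list concrete, here is a minimal sketch of simple linear regression using ordinary least squares in plain Python. The function name `fit_line` and the numbers are made up for illustration (advertising spend as the independent variable, units sold as the dependent one); they do not come from any dataset discussed in this article.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative, made-up observations.
spend = [1.0, 2.0, 3.0, 4.0, 5.0]   # independent variable
units = [3.0, 5.0, 7.0, 9.0, 11.0]  # dependent variable

slope, intercept = fit_line(spend, units)

# Use the fitted line to estimate the outcome for an unseen input.
predicted_units = slope * 6.0 + intercept
print(f"slope={slope}, intercept={intercept}, prediction at 6.0={predicted_units}")
```

The fitted line describes how the dependent variable moves with the independent one, and the same line can then be used for a simple prediction, which is where regression overlaps with predictive analysis.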

Giving data a structure, a skeleton, is what makes it most useful in Analytics. Merely storing raw data is a baby step in an ongoing process of making informed decisions. A good analogy for analysing data is the human mind: the older we grow, the more refined our databases get. Similarly, analysing data is a step-by-step process with a shifting finish line. It involves four steps:

  • Descriptive Analysis (What happened)
  • Diagnostic Analysis (Why did it happen)
  • Predictive Analysis (What is likely to happen)
  • Prescriptive Analysis (How do we achieve the outcome we want)

Now that we have an overview of the process of arriving at insights, let's look at the following study and see how these steps are used.

In a study carried out by Liam Morgan in April 2019, data captured by the World Health Organisation from 1985 to 2015 was studied and organised to calculate the Global Suicide Trend.

Insights

  • The peak suicide rate was 15.3 deaths per 100k in 1995
  • It decreased steadily, to 11.5 per 100k in 2015 (~25% decrease)
  • Rates are only now returning to their pre-90s levels
  • Limited data in the 1980s, so it’s hard to say if the rate then was truly representative of the global population
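
As a rough sketch of how a yearly rate and the ~25% figure above can be computed: pool the deaths and the population for each year, scale to 100k people, then take the relative change between two years. The row layout (`year`, `suicides`, `population`) is an assumption about the data, not the study's actual code, and the rows are made up, chosen only so that they land on the two rates quoted above.

```python
from collections import defaultdict


def rate_per_100k(records):
    """Return {year: suicides per 100k people}, pooling all rows for a year."""
    deaths = defaultdict(int)
    people = defaultdict(int)
    for row in records:
        deaths[row["year"]] += row["suicides"]
        people[row["year"]] += row["population"]
    return {year: deaths[year] / people[year] * 100_000 for year in deaths}


def percent_change(old, new):
    """Relative change from old to new, in percent (negative means a decrease)."""
    return (new - old) / old * 100


# Hypothetical rows (e.g. one per country or demographic), not real data.
records = [
    {"year": 1995, "suicides": 150, "population": 1_000_000},
    {"year": 1995, "suicides": 156, "population": 1_000_000},
    {"year": 2015, "suicides": 110, "population": 1_000_000},
    {"year": 2015, "suicides": 120, "population": 1_000_000},
]

rates = rate_per_100k(records)
change = percent_change(rates[1995], rates[2015])
print(f"1995: {rates[1995]:.1f}, 2015: {rates[2015]:.1f}, change: {change:.1f}%")
```

Applied to the quoted figures, a drop from 15.3 to 11.5 per 100k is a change of about -24.8%, which is where the "~25% decrease" comes from.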

By Continent

Insights

  • European rate highest overall, but has steadily decreased ~40% since 1995
  • The European rate for 2015 was similar to that of Asia & Oceania
  • The trendline for Africa reflects poor data quality: just 3 countries have provided data
  • The trends for Oceania & the Americas are more concerning

By Sex

Insights

  • Globally, the rate of suicide for men has been ~3.5x higher
  • Both male & female suicide rates peaked in 1995, declining since
  • This ratio of 3.5:1 (male:female) has remained relatively constant since the mid-90s
  • However, during the ’80s this ratio was as low as 2.7:1 (male:female)

By Age

Insights

  • Globally, the likelihood of suicide increases with age
  • Since 1995, the suicide rate for everyone aged >= 15 has been linearly decreasing
  • The suicide rate of those aged 75+ has dropped by more than 50% since 1990
  • The suicide rate in the ‘5–14’ category remains roughly static and small (< 1 per 100k per year)

Gender differences, by Continent

Insights

  • European men were at the highest risk between 1985–2015, at ~ 30 suicides (per 100k, per year)
  • Asia had the smallest overrepresentation of male suicide — the rate was ~2.5x as high for men
  • Comparatively, Europe’s rate was ~3.9x as high for men

As a country gets richer, does its suicide rate decrease?

It depends on the country — for almost every country, there is a high correlation between year and GDP per capita, i.e. as time goes on, GDP per capita linearly increases.

Looking within a country and asking “does an increase in wealth (per person) have an effect on the suicide rate” is pretty similar to asking “does a country’s suicide rate increase as time progresses”.

This was answered earlier: it depends on the country! In some countries the rate is increasing with time, in most it is decreasing.

Instead, we ask a slightly different question below.

Do richer countries have a higher rate of suicide?

Instead of looking at trends within countries, let’s take every country and calculate its mean GDP (per capita) across all the years in which data is available. Then let’s measure how this relates to the country’s suicide rate across all those years.

The end result is one data point per country, intended to give a general idea of the wealth of a country and its suicide rate.

There is a weak but significant positive linear relationship: richer countries are associated with higher rates of suicide, although the association is weak, as the accompanying graph shows.
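
The approach described above, one data point per country followed by a measure of linear association, can be sketched as follows. This is an illustration under assumptions, not the study's code: the field names are hypothetical, the three countries and their numbers are invented, and the per-country suicide rate is taken here as a simple mean of yearly rates. The Pearson correlation coefficient is used as the measure of linear relationship.

```python
from collections import defaultdict
from math import sqrt


def country_means(rows):
    """Average GDP per capita and suicide rate over all years, per country."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["country"]].append((row["gdp_per_capita"], row["rate"]))
    means = {}
    for country, pairs in grouped.items():
        gdps = [gdp for gdp, _ in pairs]
        rates = [rate for _, rate in pairs]
        means[country] = (sum(gdps) / len(gdps), sum(rates) / len(rates))
    return means


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented rows: rate is suicides per 100k in that country and year.
rows = [
    {"country": "A", "year": 2000, "gdp_per_capita": 1000, "rate": 8.0},
    {"country": "A", "year": 2001, "gdp_per_capita": 3000, "rate": 10.0},
    {"country": "B", "year": 2000, "gdp_per_capita": 20000, "rate": 12.0},
    {"country": "B", "year": 2001, "gdp_per_capita": 22000, "rate": 14.0},
    {"country": "C", "year": 2000, "gdp_per_capita": 40000, "rate": 10.0},
    {"country": "C", "year": 2001, "gdp_per_capita": 42000, "rate": 12.0},
]

means = country_means(rows)
gdp_values = [gdp for gdp, _ in means.values()]
rate_values = [rate for _, rate in means.values()]
r = pearson(gdp_values, rate_values)
print(f"correlation between mean GDP per capita and mean rate: {r:.2f}")
```

A coefficient near 0 indicates little linear association and a value near 1 a strong positive one. Whether a given value is statistically significant is a separate test that this sketch does not perform.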

The 5% highest risk instances in history

We define a demographic as a year in a particular country, for some combination of sex & age. For example, ‘United Kingdom, 2010, Female, 15–24’ would be a single demographic, i.e. one point on the jitter plot below.

In order for a demographic to be in the top 5% for historic suicide rates, it would require a suicide rate exceeding 50.7 (per 100k) in that year.
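
A threshold like this is the 95th percentile of the rates across all demographics. Below is a minimal sketch using the nearest-rank percentile definition, which is an assumption; the original study may use a different quantile method. The demographics are invented, with rates 1.0 to 20.0, so that the result is easy to check by hand.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    # Multiply before dividing so that whole-number ranks stay exact.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Invented demographics: one entry per (country, year, sex, age) combination.
demographics = [
    {"country": "Exampleland", "year": 1995 + i, "sex": "male",
     "age": "75+", "rate": float(i + 1)}
    for i in range(20)
]

threshold = percentile_nearest_rank([d["rate"] for d in demographics], 95)

# A demographic is 'high risk' when its rate strictly exceeds the threshold.
high_risk = [d for d in demographics if d["rate"] > threshold]
print(f"threshold={threshold}, high-risk demographics={len(high_risk)}")
```

With 20 invented demographics, the threshold is the 19th smallest rate, and exactly one demographic (5% of them) exceeds it.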

Insights

  • 44.5% of these ‘high risk’ instances occurred between 1996 and 2005
  • 53.5% were in the 75+ age category
  • 96.9% were a male demographic
  • Of the 3.1% (42 instances) that were for women, 41 of the 42 were in the 75+ demographic
  • The highest suicide rate for a demographic in any year is 225 (per 100k) — that’s 0.225% of the entire demographic committing suicide in 1 year
  • Two of the most consistently at-risk demographics seem to be men in South Korea & Hungary

These graphs show how asking specific questions draws out specific answers, which can then be turned into decisions and corrective actions.

All in all, we can conclude that the rate at which data is churned into information, and further into knowledge, could be the basis of another breakthrough in human advancement. As the tools that analyse data and create insights for us get easier to use, we can arrive at more accurate decisions, not only for our organisations but also for our way of living. All thanks to man’s vintage habit of hoarding records.

Dataset and Liam’s insight references: https://www.kaggle.com/lmorgan95/r-suicide-rates-in-depth-stats-insights?utm_medium=email&utm_source=intercom&utm_campaign=datanotes-2019
