Change is the only constant in the Data Science domain

Abhishek Anupam
Sep 11, 2022

Having gained some first-hand experience and watched the Data Science domain spread like wildfire, mature and become a worldwide buzzword over the last decade, let me shed some light on this momentous advancement.

Mathematics and Statistics form the basis of the Artificial Intelligence and Machine Learning domain. As subjects, they have been around for centuries, with all the depth and range of applications we see today. But the other cogs in the wheel of data science were missing, and their absence kept the field from growing this fast. Those cogs are availability of data, data storage capacity and computational power.

In the last 3 decades, all 3 of these capabilities have exploded. With technological advancements:

  • It became easy to capture data from the real world through machines (sensors), user interfaces and devices like computers and mobile phones
  • It became possible, and progressively easier, to transmit data over the internet. This gave birth to IoT (Internet of Things) and now IoE (Internet of Everything, which adds streaming human data to the mix)
  • It became far cheaper to store large volumes of data, thanks to enormous data centers and cloud storage. Now anyone can store gigabytes of data at no significant cost
  • To make things even more conducive, the rise in computational power made it possible to tune storage performance by implementing complex data structures like hash maps, LSM trees, B-trees and many more
  • Needless to say, computational power combined with storage capacity brought in Big Data technologies and set the stage for the grand play of AI & ML
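To make the point about storage data structures a little more concrete, here is a minimal sketch of why a hash map helps. It assumes a toy append-only log of key-value records held in memory. The record layout and the function names (`scan_lookup`, `build_index`, `indexed_lookup`) are my own illustrations, not taken from any particular database.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical append-only log of (key, value) records; purely illustrative.
Record = Tuple[str, str]


def scan_lookup(log: List[Record], key: str) -> Optional[str]:
    """Return the latest value for key by scanning every record (linear time)."""
    result = None
    for k, v in log:
        if k == key:
            result = v  # keep overwriting so the most recent write wins
    return result


def build_index(log: List[Record]) -> Dict[str, int]:
    """Build a hash map from each key to the position of its latest record."""
    index: Dict[str, int] = {}
    for pos, (k, _) in enumerate(log):
        index[k] = pos  # later positions replace earlier ones
    return index


def indexed_lookup(log: List[Record], index: Dict[str, int], key: str) -> Optional[str]:
    """Return the latest value for key using the hash index (one dict lookup)."""
    pos = index.get(key)
    return None if pos is None else log[pos][1]


# Example data: two sensors, with sensor-1 reporting twice.
log = [("sensor-1", "20.5"), ("sensor-2", "18.0"), ("sensor-1", "21.1")]
index = build_index(log)
latest = indexed_lookup(log, index, "sensor-1")
```

Both lookups return the same answer, but the indexed version does not need to walk the whole log on every read. Real storage engines are far more involved; this only shows the basic idea.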

Another factor holding back the data science practice was the availability of the right mix of talent at technology companies. Even now, the industry is short of people who understand the mathematics and are equally deft at leveraging technology to apply that mathematics to real-world scenarios.

Everything I have listed above took time to develop, and along the way we have seen technologies arrive and give way to newer ones. Let me highlight the contribution of the internet in this scheme of things. With its help, people were able to form communities, which led to the open-source revolution that now forms the backbone of the AI & ML domain. It opened the ground for the best brains across the world to innovate and put their work on the table for everyone to use and build on.

Just a couple of decades back, gathering the tools and people needed to store and process data required significant fixed and ongoing investments. Only established firms like banks and consumer goods majors could manage to set up analytics processes in-house. This led to the success of SAS as a popular analytics and data science tool. It continues to be used extensively by banks, insurance companies and some consumer goods companies. But because it was a well-guarded licensed product, there was room for more innovation to make such capabilities available to everyone. The open-source movement enabled the rise of R and Python, which let both academia (R is still the favorite of statisticians, researchers and related folks like actuaries) and smaller groups experiment with statistical models. Python is currently used by large enterprises to build anything and everything around AI & ML.

The different algorithms (from linear regression to deep learning) also turned mainstream in a similar pattern, largely following the trail of advancement in computational power. In the beginning, business applications of data science were limited to traditional models like regression and decision trees; gradually it became possible to apply more tree-based algorithms. There were more and more use cases where decision-makers deemed these algorithms fit and useful. Now the same is happening with computer vision and deep learning algorithms, and on high-velocity, high-veracity datasets at that. Even conservative and highly regulated industries like Healthcare are warming up to applying AI & ML to solve business problems.

I feel we are yet to reach the tipping point. The real disruption is round the corner, waiting on some on-the-ground problems to be solved, such as a lack of:

  • High-quality talent
  • Risk appetite from P&L owners
  • Belief among business decision-makers in these effective, though opaque, technologies

We need explainable AI, and people at all ranks who understand it, for it to achieve its full potential. I am sure this will happen within a decade, and I feel lucky to be part of this domain at such an interesting juncture.

Abhishek Anupam

Tech enthusiast. Seasoned in leading Data Science projects across Insurance, e-commerce and Healthcare