Questions, Always and Anywhere, and Answers, Today and Tomorrow

Suraj Regmi
Probability and Statistics Stories
3 min read · Oct 13, 2019
Photo by Emily Morter on Unsplash

There are questions, always and anywhere. In a world always ready to embrace more efficient ways of doing things, change is a never-ending process. The "always" never dies and is never content, just as the sky does not end. Similarly, in any field, in any place, and in any circumstance, questions are asked, anywhere, summoning the most creative of answers. In a world brimming with never-ending and ubiquitous questions, data is the new answer.

Photo by Luke Chesser on Unsplash

By equating the human experience with data patterns, Dataism undermines our main source of authority and meaning, and heralds a tremendous religious revolution, the like of which has not been seen since the eighteenth century. In the days of Locke, Hume and Voltaire humanists argued that ‘God is a product of the human imagination’. Dataism now gives humanists a taste of their own medicine, and tells them: ‘Yes, God is a product of the human imagination, but human imagination in turn is the product of biochemical algorithms.’ In the eighteenth century, humanism sidelined God by shifting from a deo-centric to a homo-centric world view. In the twenty-first century, Dataism may sideline humans by shifting from a homo-centric to a data-centric view. — Yuval Noah Harari, Homo Deus: A History of Tomorrow

Questions are good. But how do we answer the questions that are really difficult to answer, in terms of both infrastructure and time? Yes, data helps in finding the answers, but how do we choose the most efficient way? For most practical problems requiring data and data analysis, the most challenging part is figuring out how data can help find the solution and then engineering a feasible data solution. Suppose the poverty rate of Nepal becomes important for one of my tasks and I want to calculate it. The naive way is to ask every person in Nepal some questions and compute the poverty rate from their answers. That solution raises the question of feasibility: how costly will it be, and how much time will it take?

One solution to this problem is sampling. We can take a sample of the total population and compute sample statistics. Then, using those statistics, we can estimate the population parameters (like the population poverty rate). Easy as it may sound, sampling has its own limitations, assumptions, and complications. The samples we take should be random in some way. There are many sampling methods, such as simple random sampling, systematic sampling, stratified sampling, convenience sampling, and cluster sampling. We should choose the sampling method so that the sample is representative of the population. When choosing the method, feasibility, cost, and representativeness should all be taken into consideration.
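To make the idea concrete, here is a minimal sketch of simple random sampling in Python. Everything in it is an assumption made for illustration: the population is synthetic, the 25% poverty rate is a made-up number and not a real figure for Nepal, and the helper name `estimate_rate` is hypothetical. It draws a random sample without replacement, computes the sample proportion, and attaches an approximate 95% margin of error based on the normal approximation.

```python
import math
import random


def estimate_rate(population, sample_size, seed=0):
    """Estimate a proportion from a simple random sample (hypothetical helper)."""
    rng = random.Random(seed)
    # Simple random sampling: every person is equally likely to be chosen,
    # and nobody is chosen twice (sampling without replacement).
    sample = rng.sample(population, sample_size)
    p_hat = sum(sample) / sample_size
    # Approximate 95% margin of error using the normal approximation.
    margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / sample_size)
    return p_hat, margin


# Synthetic population: 1 = below the poverty line, 0 = not.
# The 25% rate is invented purely for this illustration.
pop_rng = random.Random(42)
population = [1 if pop_rng.random() < 0.25 else 0 for _ in range(100_000)]

true_rate = sum(population) / len(population)
p_hat, margin = estimate_rate(population, sample_size=1_000)

print(f"true rate (normally unknown): {true_rate:.3f}")
print(f"estimate from 1,000 people:   {p_hat:.3f} +/- {margin:.3f}")
```

With a sample of 1,000 out of 100,000 people, the estimate usually lands within a few percentage points of the true rate, which is the trade-off sampling offers: a small, quantifiable error in exchange for a large saving in cost and time. A real survey would also have to deal with non-response and with building a sampling frame, which this sketch ignores.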

Modern solutions that use big data sources such as social media data, telecom data, night light data, and satellite map data have gained traction with the rise of state-of-the-art techniques like machine learning and deep learning. However, the lack of universality of these solutions limits their straightforward use and testing in the field. Still, it is exciting to see the ongoing research on using such modern solutions to answer difficult questions. Should universal solutions arise with the advancement of AI and other technologies, we could get answers far more easily and frequently than we do today.

Human beings have solved a lot of problems on the way from the Stone Age to the era of information and artificial intelligence. Our mind, able to think, create, communicate, and analyze stories, has done wonders. More wonders will unravel tomorrow, with Questions, Always and Anywhere, and Answers, Today and Tomorrow.

This post is part of the probability and statistics series, so it will be followed by another post, “Sampling Distribution of the Sample Mean — Playing with UN World Population Data”, and many more. Stay tuned!

Suraj Regmi
Probability and Statistics Stories

Data Scientist at Blue Cross and Blue Shield, MS CS from UAH. The views and content here are my own and not those of my employers.