Data Science and Blockchain are Opposite Philosophies

Venkata Pagadala
Feb 10, 2019


If we can call Data Science a science, why not call it a philosophy as well? I will defer the debate on what can be called a science or a philosophy to another time. Right now I want to throw some light on how these two areas of work, data science and blockchain, are more opposite than alike.

Someone mentioned to me recently that uncertainty, randomness and probability belong to Statistics, a branch of Mathematics.

It got me curious enough to look into the matter. I believe these things belong more to Physics than to Mathematics. It is not because Mathematics and Computer Science can’t even produce uncertainty and randomness. Nor is it because Probability can’t even be talked about without describing a physical experiment, unlike most other branches of mathematics, which deal only with abstract entities.

It is because Physics is opposite to Mathematics in a most fundamental way. I know you can’t wait to say that Physics uses Mathematics as a modelling language and is not opposite to it. Continue reading.

The mathematics of cuts

The oppositeness arises in the subject they inspect. Physics observes and tries to understand the physical world, which is made of a continuum. Mathematics observes and tries to understand the imaginary cuts in that continuum. These cuts are called numbers, geometrical objects, variables, equalities and so on.

Accepting the continuum as an uncut entity is relatively recent in mathematics, which still lacks the tools and language to do away with the cuts entirely. This is also due to its refusal to accept infinitesimals as first-class citizens of mathematics.

Cuts are opposite to what they cut into. A cut exists only because there is something to cut. That means there is always an uncut continuity between the cuts. Thus cuts never fill the continuity.

That statement goes against some foundational topics in Mathematics, starting with real analysis. Historically, Mathematics has refused to recognise the need to model the physical reality of the uncut continuum. Dedekind, Cantor, Hilbert and several others built a foundation based on cuts, one that is devoid of continuum and uncertainty.

Naturally, nature has mocked the model by throwing multiple infinities at them. The Continuum Hypothesis was added to the list of great problems in Mathematics. It asks whether there are more infinities between the first two infinities.

Lines are not made of points

People were told that a line is made of points. How many of them? Hmm... 2^∞, which is supposed to mean a lot more than infinite, where infinite itself is larger than everything. The meanings of multiplicity, order, equality, and being more than or less than are poorly defined in these contexts. But still the diagonal argument of Cantor hinges on “how many” or “one-to-one correspondence”.

Since Mathematics shoves all uncertainty into the units, it can work with clean numbers that are made of fine cuts and points. Rational and irrational cuts were defined with conditional statements. A half of something is obtained by splitting a thing into two parts such that both parts are equal. A square root of something is an intermediate point such that it is equidistant from both ends in an exponential space. These cuts are just as imaginary as the bounds are. Some cuts are called rational, and are considered more precise than the other cuts, called irrational. This is because cuts in the exponential space (algebraic irrationals) do not quite align with cuts in the linear space (rationals). I believe the irrationality is mutual or relative. The rational cuts are just as irrational to an observer from the exponential space.

Thus Mathematics largely shied away from the reality of the physical world and chased the non-existent cuts in the continuum (and called them by strange names). This was fine to some extent, for a modelling language of the physical world.

However, this was taken too far, to the point of denying the continuum, the very subject it was supposed to model.

The physics of the continuum

Physics, on the other hand, is concerned with interpreting and describing the physical world, using patterns and models. The Standard Model is being groomed to be the ultimate model of nature.

However, physics has its own shackles. The quantum world challenges the very evolution of human logic and reasoning. This was made worse by using mathematics, a modelling language that doesn’t recognise the uncut continuum. Uncertainties abound.

Are things distinguishable? Does a cause precede its effect? Are spatial locations just probabilities? Does something exist if unobserved? What makes something collapse to reality? What is information? Is the clock running the time or time running the clock? If nothing moves in the Universe, does time still exist? Questions arise faster than they are answered.

But still, physics steers clear of “a line is made of points” kinds of philosophies and just gets its work done with numbers of sufficient precision. The irrationality of numbers never mattered, because physics was more concerned with what lies between the cuts than with the cuts themselves.

Back to blockchain and data science

Cryptography, the foundation layer for blockchain, is an exact science. It is an incarnation of the mathematics of cuts, certainties and numbers. There is no room for approximations, generalisations or patterns. The hashes match or they do not. There are no similar hashes or relatively close hashes. Data science is exactly the opposite: approximations, similarities, patterns and generalisations. Data points can be close together, clustered, or far apart. Content can be compared to training data sets.
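To make the contrast concrete, here is a minimal Python sketch, not taken from any particular blockchain. The two transaction strings and the three data points are invented for illustration. It assumes SHA-256 as the hash function and plain Euclidean distance as the similarity measure; real systems may use others.

```python
import hashlib
import math


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as hexadecimal text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


def euclidean_distance(a, b) -> float:
    """Straight-line distance between two points of the same dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical records that differ by a single character.
original = "pay Alice 10 coins"
tampered = "pay Alice 11 coins"

hash_original = sha256_hex(original)
hash_tampered = sha256_hex(tampered)

# Cryptography only asks an exact question: do the digests match or not?
hashes_match = hash_original == hash_tampered
# A tiny change in the input is not reflected as a "tiny" change in the digest.
bits_changed = differing_bits(hash_original, hash_tampered)

# Data science asks a graded question: how close are these points?
point_a = (1.0, 2.0)
point_b = (1.1, 2.1)   # a near neighbour of point_a
point_c = (9.0, 9.0)   # far away from point_a

near = euclidean_distance(point_a, point_b)
far = euclidean_distance(point_a, point_c)

print("hashes match:", hashes_match)
print("bits changed out of 256:", bits_changed)
print("near distance:", near, "far distance:", far)
```

The hash comparison gives only a yes or a no, while the distances give a usable notion of "closer" and "farther". That is the opposition described above, in miniature.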

In other words, data science is like physics, about constructing the missing continuum using the data points. Blockchain is similar to finding the rational cuts in the continuum. The cuts need to exactly match.

There are similarities too. Of course, both blockchain and data science descend from a common ancestor. Who is that great grand ancestor? Who else? The ultimate mystery of the universe — the randomness and uncertainty.

Randomness and uncertainty

I get overwhelmed with thoughts at the very mention of these two things. These two (are they really two?) are the root of existence. When mathematics and physics dig deeper to find out about the things that rule the realms of irrational numbers and quantum worlds, they are greeted by the hosts: uncertainty and randomness.

Uncertainty is truth, nature, extent and existence. A measure is a partial collapse of uncertainty to a reality, or a state of certainty called knowledge. Objects of complete certainty, such as points, numbers and geometrical objects, including Euclidean or Riemannian space, are incarnations of certainty or knowledge. Deterministic knowledge is good for modelling reality only as long as the un-modelled uncertainty is acknowledged by the model.

The cut or point doesn’t cause any harm as long as we allow it to have its own fuzzy, non-zero, anonymous existence in its natural habitat, the continuum. If we force it to a zero size or an explicit existence, it attacks back, like a cat forced into a corner. It shows off its omnipotent incarnations as meaningless multiple infinities.

The touch of randomness

Randomness touches blockchain and data science in, again, opposite ways. It is central to cryptography in a constructive way. More entropy (randomness) is desirable for creating better cryptographic objects.
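As a rough illustration of why entropy matters for cryptographic objects, the following Python sketch draws a key from the operating system's secure random source with the standard `secrets` module and compares it with a deliberately weak key. The function name `shannon_entropy_bits_per_byte` and the 32-byte key size are assumptions made for this example. The estimate is only a crude indicator: a short sample cannot prove that a key is random.

```python
import math
import secrets
from collections import Counter


def shannon_entropy_bits_per_byte(data: bytes) -> float:
    """Empirical Shannon entropy of the byte frequencies in `data`."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# A key drawn from the OS cryptographic random source (hypothetical size).
strong_key = secrets.token_bytes(32)

# A deliberately poor key: one byte value repeated, so nothing to guess.
weak_key = b"a" * 32

print("strong key estimate:", shannon_entropy_bits_per_byte(strong_key))
print("weak key estimate:  ", shannon_entropy_bits_per_byte(weak_key))
```

The weak key scores zero because every byte is the same, which makes it trivially predictable. For the builder of a cryptographic object, unpredictability is the raw material.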

But randomness is the reason why a scenario exists for statistics and data science to interpret. More randomness would mean a lack of patterns and ineffective data models. Machine learning and content recognition are the result of similarities in data.

Romancing the unpredictable probabilities and patterns, data science has its own rules of thumb, superstitions and a mystic land of high dimensionality which one should stay away from.
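The opposite effect on data science can be sketched in a few lines of Python. The example below is an assumption-laden toy: it invents a straight-line relationship `y = 2x`, adds Gaussian noise of a chosen size, and measures how much of the pattern survives using a hand-written Pearson correlation. The names `pearson` and `correlation_under_noise`, the sample size and the noise levels are all made up for illustration.

```python
import math
import random


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_under_noise(noise_sigma: float, seed: int = 0) -> float:
    """Correlation between x and y = 2x + noise for a given noise level."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    xs = [float(i) for i in range(200)]
    ys = [2.0 * x + rng.gauss(0.0, noise_sigma) for x in xs]
    return pearson(xs, ys)


r_clean = correlation_under_noise(0.0)      # the pattern with no randomness
r_heavy = correlation_under_noise(2000.0)   # the pattern drowned in randomness

print("correlation without noise:", r_clean)
print("correlation with heavy noise:", r_heavy)
```

With no noise the relationship is recovered almost perfectly. With heavy noise the same underlying line is still there, but the data no longer reveals it, so a model fitted to it has little to learn.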

Marrying data science with blockchain

Marriages are made in heaven, and probably broken in hellish circumstances. Being opposites is what qualifies the two to play complementary roles, as components of a full picture.

In the real world, blockchain and data science address two different, non-competing concerns. One safeguards data from invalid changes, and the other tries to understand the data. The data to be understood exists only because it evolved through valid changes. The validation of the changes is meaningful only because the data needs to be understood and used.
