A Mathematical Philosophy

Ideas on a rigorous foundation for AI thought.

TraderCat
The Startup
Feb 11, 2020


From the formative theories of Aristotle and Confucius to the revolutionary ideas of Kant and Marx, the study of philosophy has served as the underlying structure of society. After all, answers to the fundamental questions about the human experience have a profound influence on the way we live.

With the recent hype surrounding machine learning and the gradual rise of artificial intelligence, a crucial question emerges: how exactly will philosophy be understood by the AI of the future? What kind of ego will they develop, what vein of ethics will they select, and how will this affect us?

These unknowns fascinate our imaginations, with films like The Matrix capturing fears of an impending dystopia. Though such an outcome is unlikely, much effort is already going towards addressing this concern. For instance, the non-profit OpenAI, as part of its mission statement, works towards ensuring that AI will be safe for and beneficial to humanity. What this means is hard to articulate: after all, the ethics and morals of human civilization are the culmination of eons of natural selection, trillions of diverse lives, and a myriad of historical events. At this point, it is difficult to do much more than simply mold AI incentives to fit our human philosophy.

While such an approach may be a pragmatic short-term solution, as time passes and AI becomes more complex, a deeper solution is required. AI must be able to develop its own philosophy from first principles, and these first principles must be unquestionable. We have no guarantee that our future AI overlords would take Mill’s greatest happiness principle for granted; however, we can guarantee that they will not find three nonzero perfect cubes that sum to zero.

Instead of the traditional approach to philosophy, then, the way forward might be a mathematical approach, by developing a rigorous language for philosophy. Perhaps we may even be able to prove a theorem that the AI of the future will come to a conclusion that is “safe and beneficial.”

This essay will not attempt to create such a framework (of course), though it will introduce some thoughts that hopefully will kindle rich discussion, and with it, insightful ideas.

Extracting the Discrete from the Continuous

A couple of weeks ago, one of my longtime friends was telling me about his decision to study phase transitions for his PhD. While he was concerned with the mathematical puzzles underlying statistical mechanics, the first thought that came into my mind was that everything is a phase transition. When ice melts into water, two different states, solid and liquid, somehow blend. Take an ice cube dropped in a cup of water, for instance. From a distance, it is easy to distinguish between the two, but zoom in a bit and it becomes fuzzier: at what point do the H₂O molecules become loose enough to “melt”?

Here are some more thought questions:

  • When exactly do a sperm and an egg that join together into an embryo become a new living entity?
  • When exactly does a light that is supplied electricity start emitting enough light to count as turned on?

The ambiguity of transition is ubiquitous. For most things, we tend to make crude but practical approximations: an entity is either living or dead, a light is either on or off. One reason is simplicity: the world is too complex for us to enumerate every single possibility. There are, after all, some 10⁸⁰ atoms in the observable Universe. Another reason is comfort: the notion of a continuity between states is not one that we, as humans, deal with well, since anything could then be validated by a slippery slope argument. And these approximations work perfectly fine most of the time: whether this H₂O molecule is water or ice is not relevant.

Before we dismiss these phase transitions as unimportant, however, let us investigate them a bit further. Going back to the ice and water example, we could parametrize the state at any given point by the number of hydrogen bonds between the molecules. An H₂O molecule in ice has 4 bonds, and an H₂O molecule in water has roughly 3.4 bonds. Moving from the ice cube to the water surrounding it, the graph of the bond:molecule ratio might look something like:

Phase transition of ice melting to water.

The graph is usually flat, signifying an easily-categorized state. There is, however, a thin region where the graph is steep, signifying a phase transition. In this format, the melting process is easy to understand: as the hydrogen bonds break from increased temperature, the curve slowly moves left, and after some time, what was once ice becomes water.
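The shape described above can be sketched in a few lines of Python. This is only an illustration: the endpoint values of 4 and 3.4 bonds come from the text, but the tanh profile, the `center`, `width`, and `tolerance` parameters, and the function names are assumptions chosen for simplicity, not a physical model of melting.

```python
import math

ICE_BONDS = 4.0    # bonds per molecule deep inside the ice (from the text)
WATER_BONDS = 3.4  # bonds per molecule deep inside the water (from the text)


def bonds_per_molecule(x, center=0.0, width=1.0):
    """Hypothetical smooth profile along a line from ice (x << center)
    to water (x >> center). The tanh shape is an assumption."""
    mid = (ICE_BONDS + WATER_BONDS) / 2
    half_gap = (ICE_BONDS - WATER_BONDS) / 2
    return mid - half_gap * math.tanh((x - center) / width)


def label(x, center=0.0, width=1.0, tolerance=0.05):
    """Extract a discrete label from the continuous profile.
    Points within `tolerance` of a flat region get that region's name;
    everything else is reported as the transition zone."""
    b = bonds_per_molecule(x, center, width)
    if ICE_BONDS - b < tolerance:
        return "ice"
    if b - WATER_BONDS < tolerance:
        return "water"
    return "transition"


for x in (-10, 0, 10):
    print(x, round(bonds_per_molecule(x), 3), label(x))
```

In this sketch, melting corresponds to lowering `center`, which moves the steep region left, and a smaller `width` makes the transition zone thinner.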

Everything is a phase transition.

One important type of approximation is that of language. Between every pair of words (good and bad, man and woman, orange and black) there is a phase transition. In some situations, finding the correct word to match a given meaning could be as clear as day. In others, no single word seems to quite fit the bill, instead requiring a longer explanation (or the invention of a new word).

Words to approximate “goodness.”

While traditional philosophical methods have a propensity towards discrete approximations, modern mathematical machinery should allow us to not only work directly with continuous models of the world, but develop a scheme for extracting the appropriate discrete approximations for any given situation.

Evaluating Scenarios

Most questions require some evaluation of some scenario, whether in examining an action or comparing two outcomes. In classical philosophy, most of this evaluation is qualitative, leaving it up to the reader’s internal models to agree or disagree with the statement, e.g. “war is bad in that it begets more evil than it kills.” While qualitative arguments are perfectly fine, there are many scenarios in which quantitative reasoning excels, such as when dealing with continuity.

One area where such evaluation is used is income inequality. Ethics postulates that the rich should give to the poor. A good metric for evaluating wealth distribution might be the share of total income that the rich would need to give the poor until no more giving is due, a.k.a. the Hoover index; its better-known relative, the Gini coefficient, instead averages the absolute gap between every pair of incomes. Another metric is the Theil index, which uses a formula commonly seen in physics and information theory. These numbers associated with the different distribution scenarios provide a concrete method of comparison: one that provides a bit more nuance in some areas and a bit less in others than qualitative evaluations like “there are people in this town who are starving.”

The two mathematical properties.

Both the Gini coefficient and the Theil index are based on basic mathematical properties of the underlying distribution: one on the L₁ norm and the other on entropy. As a result, these metrics arise naturally.
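To make the two metrics concrete, here is a minimal Python sketch using their standard textbook definitions: the Gini coefficient as the mean absolute difference between all pairs of incomes divided by twice the mean, and the Theil T index as an entropy-style average of (x/μ)·ln(x/μ). The function names and the sample incomes are illustrative assumptions, and the code assumes a non-empty list of non-negative incomes with a positive total.

```python
import math


def gini(incomes):
    """Gini coefficient: sum of |a - b| over all ordered pairs,
    normalized by 2 * n^2 * mean (an L1-type quantity)."""
    n = len(incomes)
    mean = sum(incomes) / n
    total_abs_diff = sum(abs(a - b) for a in incomes for b in incomes)
    return total_abs_diff / (2 * n * n * mean)


def theil(incomes):
    """Theil T index: average of (x/mean) * ln(x/mean).
    Zero incomes contribute 0, the limit of r * ln(r) as r -> 0."""
    n = len(incomes)
    mean = sum(incomes) / n
    return sum((x / mean) * math.log(x / mean) for x in incomes if x > 0) / n


equal = [1, 1, 1, 1]    # everyone has the same income
extreme = [0, 0, 0, 4]  # one person holds everything

print(gini(equal), theil(equal))      # both are 0 for perfect equality
print(gini(extreme), theil(extreme))  # Gini is 0.75; Theil is ln(4)
```

For n people, the Theil index of the most unequal distribution is ln(n), which is why the second example gives ln(4); the Gini coefficient of the same distribution is (n − 1)/n.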

This “organic” nature of mathematical metrics could prove to be very powerful in keeping AI stable. Take the light being emitted from stars as an example. Recall from chemistry class that photons are emitted from atoms when an electron jumps down an energy level; this occurs randomly, and the wavelength of the photon is fixed by the size of the jump. However, the spectrum of the light emitted by the star is, to a good approximation, a continuous curve whose shape depends only on temperature!

Emission spectra of the Sun and of Hydrogen.
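
The temperature-only curve mentioned above is, to a first approximation, the black-body spectrum given by Planck's law. It is stated here for reference, with the caveat that real stellar spectra deviate from it (absorption lines, for example), so it should be read as an idealization rather than an exact description of the Sun. The symbols are the usual ones: λ is wavelength, T is temperature, h is Planck's constant, c is the speed of light, and k_B is Boltzmann's constant.

```latex
B_\lambda(\lambda, T) = \frac{2 h c^{2}}{\lambda^{5}} \,
  \frac{1}{\exp\!\left(\dfrac{h c}{\lambda k_B T}\right) - 1}
```

Temperature is the only free parameter: the individual atomic transitions that produced the photons do not appear anywhere in the formula.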

In the case of income inequality as an ethical problem, perhaps there is some optimal distribution. Both pure communism and pure capitalism are unsavory; there is some trade-off between equality and ambition that, if formulated into a differential equation, could yield a solution.

Of course, we are nowhere near a quantitative model of the world, but if we do get close, we may find a natural equilibrium for philosophical questions. And even if rigor to the standards of mathematics may not be possible, rigor to the standards of physics could be.

A Parting Thought

It would be nice to have some sort of “visualization” for all things, much like in our ice and water example from earlier. With the world being so large and complex, a simple graph cannot capture everything we need. To remedy this issue, we could interpret the simple ice and water graph as a projection of a higher-dimensional structure, concerning only the relevant features. This hints at a potential mathematical representation of philosophy: one very large smooth manifold that, depending on the question at hand, we can take appropriate cuts from.

With a mathematical representation, we could apply quantitative evaluation methods, such as those used in income inequality, to other problems, including ones that are traditionally addressed with qualitative methods. We have been doing just this for many slices of this “universal” model; machine learning is a prime example. The big question is: how can we, as when piecing together 3-D protein structures from images in cryo-EM, obtain a more complete rendition of this “universal” model from the many small slices that we have? Perhaps the solution, in the spirit of reinforcement learning, is to build from the bottom up. But perhaps the key is to develop it from the top down.
