The Science of Counting

Mark Temple-Raston
9 min read · Jun 30, 2023


M. Temple-Raston, PhD, Precision Alpha

Counting is a special class of measurement in which discrete events, units, or states are counted exactly, as natural numbers. In any context, business or life, counting is always exact and independent of the person or system doing the counting.

When faced with uncertainty, probabilities are calculated by defining and counting “states”: for example, a pair of binary states called “up” and “down”, or a set of bins whose integer counts define a probability distribution. By counting states, the probability of future measurements can be calculated, enabling informed forecasts.
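As a concrete illustration (a minimal sketch, not from the article), here is how counted states translate into an exact empirical probability distribution in Python:

```python
import numpy as np

# Counted events falling into discrete bins (e.g., "down" and "up" states).
# Counting is exact: these are natural numbers, with no error bars.
counts = np.array([38, 52])           # n_down, n_up

# The empirical probability distribution follows directly from the counts.
probabilities = counts / counts.sum()
print(probabilities)                  # [0.4222... 0.5777...]
```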

Probabilities can often be calculated exactly. For physicists and economists, probabilities are embedded in the expected value; once the expected value is worked out analytically, the probabilities can be peeled off. The mathematical machinery used to calculate expected values is the same machinery used in machine learning. E.T. Jaynes (2003) provides a systematic, comprehensive, and very readable blueprint for what he calls a “scientific reasoning robot”, which today we would call machine learning. Jaynes offers valuable insights into how probability theory and the logic of science apply to many practical business situations, including manufacturing, inventory, transport, and communication.

To better understand machine learning, our focus will be on the science of counting. The science of counting is familiar to undergraduates in physics because it is the foundation for electronics. Both the theory of electricity (circa 1850) and chemistry rest on the science of counting, and both contributed significantly to the second industrial revolution of the late nineteenth century.

In the next section, we delve into the science of counting by imposing counting constraints on machine learning for general time-series measurements. Plotting the input and output time-series for real-world data (financial markets) exposes the richness of the time-series data and its non-equilibrium nature. Notably, plots of financial market time-series reveal dissipative structures: stable, repetitive patterns observed in non-equilibrium systems. Dissipative structures possess self-regulating characteristics that contribute to their stability and dynamics.

Section three develops forecasting for the science of counting.

Section four discusses contemporary operational challenges for conventional machine learning that the Science of Counting can resolve.

Presented as web services, the analytical challenges of the science of counting and machine learning can be abstracted away. Science-of-counting web services provide model-free tools for data scientists and business analysts, focusing on understanding the reasons behind the dynamics of their time-series rather than on model building.

The Science of Counting

Machine Learning, in general, solves a variational problem: maximize entropy subject to constraints. Science itself is based on a very modest “faith” statement: I believe what I measure. Without belief in the truth of the measurements taken, and in their intelligibility, science is meaningless.

By using counting measurements as the constraints on maximum entropy, Machine Learning and Science are elegantly united.
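To make the union concrete, here is a minimal sketch of the variational problem for a two-state system with n+ counts of one state and n− of the other in N measurements; the article does not spell this calculation out, so take it as the standard maximum-entropy machinery:

```latex
% Maximize the entropy
%   S = -p_+ \ln p_+ - p_- \ln p_-
% subject to normalization and the counting constraint that the expected
% "up" frequency equal the measured one:
%   p_+ + p_- = 1, \qquad \langle \mathbf{1}_{+} \rangle = p_+ = n_+/N .
% The stationary point of the Lagrangian has the exponential (Gibbs) form,
% and the multipliers are fixed by the counts:
p_\pm = \frac{e^{\lambda_\pm}}{e^{\lambda_+} + e^{\lambda_-}}
\quad\Longrightarrow\quad
p_+ = \frac{n_+}{N}, \qquad p_- = \frac{n_-}{N}.
```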

In general, numerical estimates, regressions, simulations, or mathematical bounds can be employed to estimate a probability. However, the simplest non-trivial example of scientific machine learning is exactly solvable: the science of counting.

In the science of counting, the input data (events, states, or units) in a time-series are enumerated exactly, so no error bars are present on the input data (see Figure 1a below). For a two-state example, record the number of heads and tails over a sequence of coin tosses, n+ and n−, respectively. By keeping track of heads and tails on each toss, a path through the grid of all possible paths is determined. Notice that only the intersection points of the grid are important; there is nothing of value between them. Hence there are no error bars on the measurement data: humans can count exactly.

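A minimal sketch of this bookkeeping in Python (the grid path is just the running totals of heads and tails; the names are illustrative):

```python
import numpy as np

# A measured sequence of tosses: +1 for heads ("up"), -1 for tails ("down").
tosses = np.array([1, 1, -1, 1, -1, -1, 1, 1])

# Running counts n+ and n- define a path through the grid of intersection
# points; only these integer lattice points carry information.
n_plus = np.cumsum(tosses == 1)
n_minus = np.cumsum(tosses == -1)
path = list(zip(n_plus, n_minus))     # exact, error-bar-free input data
print(path)
```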

Machine Learning that implements “I believe what I measure”, applied to counting measurements, produces a unique expression for the energy. The energy is a derived measurement. In Figure 1b, the path of heads and tails measured in Figure 1a is used to evaluate the energy function. Because the input has no error bars, the calculated energy also has no error bars! The energy does not take integer values like the input data, but it is nonetheless free of error bars. The science of counting therefore yields precise scientific measurements that can be plotted for better understanding.
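The article does not state the closed form of the energy, but under the usual maximum-entropy/Boltzmann identification E = −ln p (in units where k_B·T = 1), the derived energy along the measured path can be sketched as follows; the convention itself is an assumption:

```python
import numpy as np

def path_energy(tosses):
    """Energy-like derived measurement along a measured path of +/-1 outcomes.

    Assumes the Boltzmann identification E = -ln p with k_B*T = 1, where p is
    the running empirical probability of the state actually observed. This is
    an illustrative convention, not the article's exact expression.
    """
    tosses = np.asarray(tosses)
    n_up = np.cumsum(tosses == 1)
    n_total = np.arange(1, len(tosses) + 1)
    p_up = n_up / n_total
    p_observed = np.where(tosses == 1, p_up, 1.0 - p_up)
    return -np.log(p_observed)        # exact: no error bars on the input counts
```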

Figure 1. Science-of-counting measurement data is used to calculate important scientific information without error bars.

Unlike conventional AI, there is no model, nor even room to introduce parameters. The science of counting is the mathematics that implements our counting measurements as truth.

There are surprising implications. As we shall now show, the science of counting contradicts the modern dogma that science and emotions are in conflict, each undermining the other. Emotions, in fact, are integral to science and complement mechanics; emotional displacements are already present in the science and do not need to be added artificially. Emotions (more precisely, a rollup of all emotions) are calculated exactly from the input time-series data, generalizing sentiment analysis to a spectrum of like/dislike values (or pleasing/displeasing, bullish/bearish, and so on, depending on context).

To see emotions more clearly, we change the coordinates in Figure 1b to make the plots more expressive. In the two-state system in Figures 1a and 1b, equilibrium occurs when the probability of heads equals the probability of tails. In equilibrium, all states are equally likely.

By subtracting off the equilibrium energy from the total energy, we are left with the displacement energy from equilibrium.
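Continuing the sketch above, under the same assumed convention: equilibrium for the two-state system is p = 1/2, so the equilibrium energy is −ln(1/2) = ln 2, and the displacement energy is the difference:

```python
import numpy as np

E = path_energy(tosses)               # derived energy from the sketch above
E_eq = np.log(2.0)                    # energy at equilibrium, p = 1/2
delta_E = E - E_eq                    # ΔE = 0 exactly at equilibrium
```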

The displacement energy is the net energy entering or exiting the system. When the displacement energy is zero (∆E = 0, the x-axis in Figure 1b), no energy enters or exits the system, the total energy is constant (conserved), and the system is mechanical. We can also say that the system is “objective”: measurements must be the same for every observer, regardless of time, so no net energy is allowed into or out of the system on account of any observer. Objectivity implies mechanics, and mechanics implies objectivity.

When the displacement energy does not vanish (∆E ≠ 0), the dynamics is neither mechanical nor objective. What is not objective is, by definition, subjective. The total energy therefore has two natural components: a mechanical, objective component and a non-mechanical, subjective component.

When the subjective displacement energy is due to human behavior alone, the displacement energy can be called (group) emotions. The rollup of all emotions (the displacement energy) for a group is measured exactly. Note that the original French meaning of emotions was collective, as in “the emotions of the City”.

Our tour of what the science of counting brings to analysis can be confirmed through scientific induction with real time-series data. The plots in Figure 2 present derived science-of-counting measurements for financial market time-series. The input data consists of six months of closing prices for Pepsi (PEP); see Figure 2a. The plots display only the last three months. As with coin tosses, the measurements for PEP are the numbers of up days and down days.
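A minimal sketch of the counting measurement for price data (the file name and column are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical input: a CSV with a "Close" column of PEP daily closing prices.
closes = pd.read_csv("pep_closes.csv")["Close"].to_numpy()

# An up day counts as +1, a down day as -1; flat days are dropped here.
moves = np.sign(np.diff(closes))
moves = moves[moves != 0]

n_up = int(np.sum(moves == 1))
n_down = int(np.sum(moves == -1))
p_up = n_up / (n_up + n_down)         # exact, exactly as with coin tosses
```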

Fig. 2a plots the closing-price time-series for Pepsi (PEP); six months of history is used, but only the last three months are plotted. Fig. 2b shows the displacement energy from mechanical equilibrium; energy is seen to flow into and out of the system. Fig. 2c plots two sets of next-day probabilities: mechanical (short) and thermal (tall, after Crooks [2]).

Figure 2b shows the displacement energy from equilibrium. Clearly, the system plotted in Figure 2b is out of equilibrium most of the time (∆E ≠ 0). We see energy flowing into the system in November, then energy flowing out from the beginning of the year, with the outflow deepening.

In Figure 2c, two sets of next-day probabilities are plotted: mechanical (short) and thermal (tall, after Crooks [2]). Our derived measurements for financial market time-series on both the NYSE and NASDAQ indicate that markets are rarely in equilibrium, and are usually far from it. Away from equilibrium, thermal probabilities routinely exceed 0.65 and account for large price movements.

Finally, in Figure 2d below, two important derived measurements are plotted for General Electric (GE): the free energy (blue) and the temperature (red). The free energy is the part of the energy available to do price work, and, as expected, the temperature increases as price work is done. The plot of free energy and temperature in Figure 2d also marks the temperature at thermal equilibrium (rectangle and arrow). Non-equilibrium oscillations around thermal equilibrium are observed in Figure 2d; this is a dissipative structure [3], or heat engine.
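The article does not define these quantities explicitly; as a hedged sketch, the standard thermodynamic relations it appears to invoke are:

```latex
% Assumed standard relations (not quoted from the article):
% the free energy is the part of the energy available to do work,
F = E - T\,S ,
% where, for the two-state counting system, the entropy is
S = -\,p \ln p - (1 - p)\ln(1 - p),
% with p the probability of an up day. F is depleted as price work is
% done and the temperature T rises, consistent with Figure 2d.
```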

Non-equilibrium thermal oscillations in financial markets are commonplace. In January 2022, a systematic analysis of both the NYSE and NASDAQ found that approximately one-third of the symbols on each exchange presented overt non-equilibrium thermal oscillations.

Forecasting

We summarize what we have learned from the analysis so far that could improve forecasting:

  • An exact probability should lead to better forecasts than an estimated one,
  • Dissipative structures have elevated thermal probabilities (calculated exactly) that can dominate the system dynamics, and their stability improves forecasts,
  • The temperature of the reservoir, T_R, in which the system operates should improve a forecast relative to one that discards or neglects that information.

Field testing indicates that the thermal probability is largest when T_R is set to the correct value, so T_R is an important input for accurate forecasting. The previous section showed how to measure T_R.

With a value for the reservoir temperature T_R, the exact learned probability can be brought to bear: the exact probability is fed to a state machine to produce an output state, the generated state is added to the historical record, and the process repeats. This produces an individual forecast for any finite horizon; an ensemble forecast is created by generating multiple individual forecasts, as sketched below.
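A minimal sketch of this loop in Python. The function next_up_probability stands in for the science-of-counting probability at reservoir temperature T_R; its exact form is not given here, so it is an assumption:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def individual_forecast(history, T_R, horizon, next_up_probability):
    """One forecast path: compute the exact probability from the record so
    far, feed it to the state machine, append the generated state, repeat."""
    path = list(history)
    out = []
    for _ in range(horizon):
        p_up = next_up_probability(path, T_R)      # hypothetical learner
        state = 1 if rng.random() < p_up else -1   # the state-machine step
        path.append(state)
        out.append(state)
    return out

def ensemble_forecast(history, T_R, horizon, next_up_probability, n_paths=200):
    """Ensemble: many individual forecasts grown from the same record."""
    return [individual_forecast(history, T_R, horizon, next_up_probability)
            for _ in range(n_paths)]
```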

In Figure 3 below, the science of counting produces an individual forecast for six months of ZIM closing prices. We plot the last 30 days of history and then the individual forecast out to a horizon of 30 days (60 days in total). The dotted vertical line marks where the individual forecast begins. The state machine, driven by the probability computed by the science of counting, behaves consistently as we move from past to future.

Figure 3. Individual forecast for both thermal and non-thermal probabilities, temperature, and free energy. The expected return is based on the start date.

Operational Benefits

The consistency and transparency of counting, on which businesses rely, are lost in conventional formulations of machine learning.

The science of counting returns transparency to machine learning by focusing on what we know best: counting measurements.

With the deductive transparency of the mathematics, conventional problems with machine learning vanish. Notably, in the science of counting there are:

  • No model parameters to estimate,
  • No model biases to be concerned about,
  • No iterative model development processes to design, implement and accelerate,

simply because the science of counting is model-free. Table 1 summarizes the key benefits realized from the science of counting.

Table 1. Comparison of conventional machine learning with the science of counting.

There is still bias to defend against, however; the bias is now entirely “data bias”. While data biases do not arise for closing prices, they may well arise in other time-series data, introduced unintentionally through the data-acquisition process. In short, the tasks that remain for data scientists and analysts using the science of counting resemble the work of the experimental scientist more than that of the theoretician (model builder).

Bibliography

[1] E.T. Jaynes, Probability Theory: The Logic of Science, Cambridge University Press (2003).

[2] G.E. Crooks, Phys. Rev. E 60, 2721–2726 (1999).

[3] D. Kondepudi, I. Prigogine, Modern Thermodynamics: From Heat Engines to Dissipative Structures, Wiley (2015).


Mark Temple-Raston

Founder, CIO and Chief Data Scientist for Decision Machine and Precision Insight. 20+ years on Wall Street. PhD, particle physics, Cambridge University.