Thinking Before Eating

Developing a More Rigorous Heuristic

Sean McClure

Published in NonTrivial, Mar 28, 2023

Check out the NonTrivial Podcast to listen to a high-level discussion on this topic.

Quality and Decision Making

Quality is important in our lives. We want to consume and create quality things. On the consumption side we want to read good books, watch engaging entertainment and meet interesting people. On the creation side we want to write good articles, make reliable software and give memorable presentations.

When it comes to creating things we have more control over the quality of what we produce. Creating works as an open system, where the inflow of information allows us to decrease the entropy of what we build over time. We can fold-in learnings to improve the quality of what we make. In open systems quantity has an equivalence with quality since the more we do the better our work becomes.

But when it comes to consuming we don’t get to control the quality. We have to assess the quality upfront, say at purchase time. Things we consume are created as a closed system, replicated with no inflow of information. Whatever level of quality was decided on by the producer is what we get.

Closed systems see their entropy either remain the same or increase. Since replication locks in an object’s ingredients at sale time, one might assume replicated objects fit the former scenario: that they don’t increase in entropy. But if we compare high quality to low quality things, we notice the entropy of the low quality object had to be increased in order to satisfy the replication process. So there is a sense of increased entropy in things made for scale.

To understand this intuitively we can think of mass production. Mass production cannot work by merely taking the local craftsman’s creation and making more of it. There are concessions that must be made in order to make the assembly line possible. As I will argue in this article, those concessions relate to increasing an object’s entropy, and in turn decreasing its quality.

We all have an intuitive sense of quality when it comes to the things we consume. The local craftsman who makes coffee tables can source local wood and use high quality glue and nails. There is a level of dedication we know will not be found in something made for the masses.

But such binary rules (shop locally or buy cheap) for quality are not that useful. Everyone has a requisite level of quality they are willing to pay for. For some, the local craftsman charges too much to justify the purchase of a table that will only be used once in a while. I want food to be healthy, but life needs to be lived, and I’m not about to swear off the odd mass produced snack.

This means we cannot define quality in dualistic terms. There is some level of quality that is deemed acceptable. Knowing where to strike that line is the challenge. We need a definition of quality that helps with decision making. We need a definition that corresponds to a heuristic we can use in the complex, nontrivial grey area that is our real world.

Challenges in Defining Quality

Defining quality is not easy. It generally means an object that will go a long time without breaking. This definition works even for things like writing, entertainment and people, as long as we’re thinking in informational terms. If someone can read an entire essay without the main argument breaking then it would be deemed quality work. A movie with a continuous plot is better than one that doesn’t “add up”, and a person who never deviates from their system of values might be considered a higher quality person than one who does (e.g. code switching).

Even food works under the breaking definition, since quality food is food that doesn’t diverge from the experience and/or health effect it was made for. Again, we’re not talking about physical breaking (obviously food breaks physically when we chew it); informationally, quality food does not “break” from its intended purpose.

The deeper notion at play here is the ability of an object to solve the problem for which it was intended. We can thus think of an object that gets created as a solution that maps to the problem it was created to solve.

**Figure 1** *A solution maps to a problem.*

One way to capture the notion of an object solving its appropriate problem is through category membership. If something breaks then it is no longer a member of its original category (a broken table is no longer a table). As long as the table provides a flat surface to work on it belongs in the table category. As long as the cracker provides a portable method to consume cereal grain it belongs in the category cracker.

But category membership doesn’t help in decision making since at purchase time both high quality and low quality versions of an object belong to the same category. At purchase time, a knockoff purse is a purse just as much as a Blue Crocodile Hermes Birkin Handbag.

Price might be a decent signal of quality, but it’s hardly reliable. The information content of price includes far more than the material used. It includes the psychology that underlies supply and demand.

What makes assessing quality challenging is that at decision time we don’t know when something will bounce out of its category, and yet that’s the only piece of information that really matters.

A More Robust Definition (intuition)

This leads me to what I believe is the only definition of quality that is robust to the examples I have given. A high quality object is one that is more likely to maintain its category membership over time. While this might sound no different than the category membership definition above, what makes it usable is the probabilistic aspect of the definition. We now are thinking in terms of the odds of something deviating from its category, which potentially connects it to a heuristic.

I say heuristic, rather than “analysis” or “prediction”, since under complexity it makes more sense to think of probabilistic reasoning in terms of the inborn human ability to assess difficult situations. I am not interested in taking a bunch of data and trying to come up with failure times across real world objects. That’s too unrealistic. We are talking about making decisions in real life, in the moment.

But how can we assess the likelihood of something remaining in its category?

This is only possible if there is a notion of degradation in the mapping between solution and problem. There has to be some sense of partiality to the connection between a solution and its problem. In other words, a low quality object should be expected to have a weaker mapping between itself and the problem it solves.

It’s important to note here that degradation does not bounce an object out of its category. A solution with a degraded mapping still belongs to its category, but has an increased likelihood of bouncing out by virtue of a “weaker” link between solution and problem. The knockoff purse should be considered to have a degraded mapping between itself and its problem, despite still being called a purse.

But what causes an object to have a degraded mapping between solution and problem? It is not something that can be detected at the surface, since again, both high quality and low quality things are members of the same category at purchase time. It must be something internal.

This brings us back to the concessions that must be made when an object goes from the local craftsman’s workbench to the assembly line. Consider again the mass production of the coffee table. A mass produced table doesn’t just solve the “table problem”, which is to provide a flat surface to work on. It also solves a number of other problems related to mass production. A mass produced table must be easily transported. This means it must be easily packed into tight spaces, be relatively lightweight and easily reassembled. A mass produced cracker isn’t just solving the problem of providing a crisp wafer for the convenient consumption of cereal grain. It’s also solving the problem of having a long shelf life, remaining intact during transportation and having consistency in flavor.

The following figure shows the concept of additional problems a mass produced object must satisfy, with the cost being a degraded mapping between solution and problem:

**Figure 2** The concept of degraded mapping between solution and problem, caused by a solution solving more than its original problem.

Comparing the above scenario to high quality counterparts, we know that high quality things only solve the problem for which they were intended. A high quality table does not need to be transported easily or reassembled countless times. Your grandmother’s homemade crackers don’t need to have a long shelf life or consistent flavor.

But if the category doesn’t change during a loss in mapping, what is it that changes? The degradation in mapping must be due to a change in the internal description of the solution category (since the top-level category remains invariant).

We can understand the internal description of a category by simply referring to the definition of category. A category is a division of things regarded as having shared characteristics. In other words, categories are things we create by abstraction. We notice shared properties among superficially different objects and group them under the same category.

It is those objects that get grouped into a category that represent the category’s internal description. We can thus think of the internal description of our solution category as being made of lower level constituents, which are some group of objects.

**Figure 3** The solution category, formed by some lower-level description.

If we are talking about coffee tables, the internal description is either that of the local craftsman:

DC = {wood, nails, glue}

…or that of the global furniture company:

DG = {sawdust, wood veneer, cardboard, nails, glue}

Thus, we can consider the internal description of an object as being represented by a set. I will call this an object’s description set.

I said that the degradation in mapping between solution and problem must be because of a change in the internal description of the solution category. That change must account for the ability to solve additional problems, and this means there must be additional configurations available to the system.

In the description sets above, the local craftsman’s coffee table can only solve one problem (if we don’t let our imagination wander too much). But the global furniture company’s description set does far more than just provide a flat surface. It can also be easily packed into tight spaces, is relatively lightweight and can be easily reassembled. This is only possible by virtue of the object being able to internally have more configurations. Just think of more configurations as having more internal variety to meet additional problems. For example, if the coffee table needs to provide a flat surface and be easily reassembled then the table must be more than just a solid piece of oak. The table would need parts that are commensurate with reassembly.

In order for there to be more possible configurations in a description set it must be more heterogeneous than a description set with fewer possible configurations. Heterogeneity is the state of being diverse in content; more diversity leads to more possible configurations. So we can say that the heterogeneity of a description set must increase in order to solve the additional problems that accompany the concessions made for mass production.

Heterogeneity and its relation to the number of possible configurations connects an object’s description back to the concept of entropy, which I alluded to at the beginning. Entropy measures the extent to which the probability of the system is spread out over different possible microstates, for a set of macroscopic variables. The more microstates available the greater the entropy.
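To make the link between the number of available microstates and entropy concrete, here is a minimal sketch using Shannon entropy, H = −Σ p·log₂(p). The function names and the toy distributions are my own illustration, and spreading probability evenly across microstates is an assumption made only to keep the arithmetic simple.

```python
import math

def shannon_entropy(probabilities):
    """Shannon entropy in bits: H = -sum(p * log2(p)) over nonzero p."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def uniform(n):
    """Probability spread evenly over n microstates (an assumption)."""
    return [1.0 / n] * n

# More available microstates means higher entropy.
few_states = shannon_entropy(uniform(2))
many_states = shannon_entropy(uniform(8))

# Probability concentrated on one microstate means lower entropy
# than an even spread over the same four microstates.
concentrated = shannon_entropy([0.97, 0.01, 0.01, 0.01])
spread_out = shannon_entropy(uniform(4))

print(few_states, many_states, concentrated, spread_out)
```

The even spread over eight microstates carries more entropy than the spread over two, and a distribution piled onto one microstate carries less than an even spread over the same four.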

This means a description set that is more heterogeneous, and thus higher in entropy, has more ways it can achieve the same high-level category. This is why the mass produced coffee table can solve additional problems. It can maintain the table category while “leaking” its description into additional problems. An increase in entropy in an object’s description set allows it to “sneak” in additional purposes, while still under the guise of its original category.

**Figure 4** Increased entropy in a category’s description makes it possible for additional problems to be solved.

Compare a simpler Pork Leek Dumpling to the mass produced version:

**Figure 5** Comparing a homemade dumpling to a mass-produced dumpling, in terms of their internal descriptions.

The higher quality dumpling, which imparts the right flavor, texture and nutrition (to those who know dumplings), has simple ingredients. But to scale the dumpling a company must make concessions: the mass produced version needs a more heterogeneous mix of ingredients to contain the entropic capacity to solve mass production problems (while still being called a dumpling).

So far there seems to be a fairly simple heuristic coming out of this discussion. The fewer ingredients something contains, the higher its quality. This is often true for food, and can also be argued for things like coffee tables, books, entertainment and software. Books that don’t layer on the rhetorical flourish, entertainment that gets to the point and people who have only a few friends or core values could be considered higher quality. The same goes for articles that aren’t pedantic, software that minimizes “spaghetti code” and presentations that do more with less. Having fewer ingredients does lend itself to a more direct mapping between solution and problem.

But there are cases where this is not necessarily true. If my Grandma decides to make shortbread cookies using white flour, margarine, refined sugar and non free-range eggs, it could be argued these are less healthy than cookies made with almond flour, honey, free range eggs, grass-fed butter, vanilla, coconut oil, and Himalayan sea salt. In other words, just because there are fewer ingredients does not mean something is higher quality.

It would be good to have a more rigorous understanding of why fewer ingredients do often mean better quality, and why and when this might not be the case. We want a way to think about the ingredients of the things we consume, be it coffee tables, books, crackers or people (as in befriending them): something deeper than just “less is more” that allows us to strike a mark within the grey area that lies at the intersection of quality and real world decision making.

Degradation as Low Categorical Invariance

Our quality heuristic should satisfy the following conditions:

generally defend the notion that fewer ingredients often mean higher quality;
handle situations where more ingredients could in fact be the better option; and
hint at the likelihood of an object bouncing out of its category.

To satisfy these 3 conditions I suggest the best way to think about quality is in terms of categorical invariance. Categorical invariance is the degree to which the inner description of an object remains in its category under perturbation. Perturbation is any deviation of a system or process from its regular or normal state. If an object has a better chance of remaining in its category despite some transformation on the system, it is more categorically invariant, and thus more robust over time. A low quality object, while still having the same category as a high quality object, should have lower categorical invariance since it would be more susceptible to breaking out of its category due to an external influence (e.g. stressor).

The nice thing about categorical invariance is we can connect it to the concept of entropy, and thus to the level of heterogeneity in the description set of an object. This means we have something real we can anchor our heuristic on, by assessing the heterogeneity of an object’s description.

Let’s see how this might play out mathematically.

A Mathematical Treatment

To ground the intuition of the previous section in mathematical terms we need to unify the concepts of partial functions, categorical invariance and entropy. Partial functions will be discussed in the context of a restriction structure imposed on an otherwise total function between categories. Categorical invariance will capture the notion of likelihood of failure (i.e. degraded mapping due to poor category membership), and finally entropy will connect everything to a physical reality that works with a heuristic.

To begin, a partial map between categories is a partial function. A partial function is one that is only defined for part of a category’s domain. Thus a partial function from solution to problem f:S -> P is much like a total function from solution to problem, except that f(x) may not be defined for every object x in S.

**Figure 6** The difference between a total function and a partial function. A partial function provides a mathematical representation of a “degraded mapping” between solution and problem.
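As a rough illustration of the total versus partial distinction, the sketch below models a mapping as a Python dictionary and treats a missing key as "undefined". The feature names, the problem labels and the `coverage` helper are hypothetical and chosen only for this example.

```python
from typing import Optional

# Hypothetical elements of the solution S (the domain).
SOLUTION = ["top", "legs", "joints"]

# Total function: every element of S maps to some part of the problem P.
total_map = {"top": "flat surface", "legs": "height", "joints": "stability"}

# Partial function: same domain, but f("joints") is undefined.
partial_map = {"top": "flat surface", "legs": "height"}

def evaluate(mapping: dict, x: str) -> Optional[str]:
    """Return f(x), or None where f is undefined."""
    return mapping.get(x)

def coverage(mapping: dict, domain: list) -> float:
    """Fraction of the domain on which the mapping is defined."""
    return sum(1 for x in domain if x in mapping) / len(domain)

print(coverage(total_map, SOLUTION), coverage(partial_map, SOLUTION))
print(evaluate(partial_map, "joints"))
```

Here `coverage` is just one possible way to put a number on how "degraded" a mapping is; the article itself does not define such a measure.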

In categorical terms we can say that the description set of an object (like a table) is the description set of a category (table), and has a restriction structure imposed on it via the phenomenon of degradation. If we consider the mapping between solution and problem to be a “degraded functor”, we can formalize the degradation in terms of a span (as per a restriction category):

To ground such abstract formalisms in physical reality we ask what physical cause underlies the creation of a subset of an otherwise full description set of a category. This amounts to asking “where does the Bₓ in figure 6 go?” or “what causes it to change?”

It’s important to remember that when we say something belongs to a category we are also saying that it solves a specific problem. If we remove enough lines from the total function in Figure 6 we will eventually remove the object from its original category, because it will no longer fit the definition of solving its problem. We are after the middle ground, where an object is still a member of its category, but has a higher likelihood of breaking out of its category. For some intuition, think of a rope with only 3 strands versus one with 4 strands. They are both members of the category rope, but the 3 strand rope has a higher probability of breaking.

This brings us to categorical invariance. An object with high categorical invariance has a higher probability of remaining in its category, while one with lower categorical invariance has a lower probability of remaining in its category. We can model categorical invariance in a fashion similar to Vigo’s method, where he uses the same term “categorical invariance” to capture what might be considered a more general notion of information (compared to Shannon’s information theory), particularly as it applies to learning¹.

As such, let us represent a category’s invariance in terms of so-called “partial invariants.” Partial invariants are symmetries that exist when a group of elements undergoes a transformation along one or more dimensions.

Using Vigo’s example, if we have a triangle that is black and small, a circle that is black and small, and a circle that is white and large, and apply a shape transformation, we will be left with a perturbed set and some partial invariance, shown by the symmetry in the following figure (both the original set and perturbed set have 2 objects unchanged):

***Figure 7*** *The concept of partial invariance when comparing an original and perturbed set. Symmetry shown in dotted outline.*

In the above example, categorical invariance is a measure of the partial invariance of the category with respect to the shape dimension.
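The shape transformation in this example can be sketched in a few lines. The tuple encoding and the `flip_shape` helper are my own assumptions about how to represent the three objects; only the objects themselves come from the example.

```python
# Each object is (shape, color, size), following the example in the text.
category = {
    ("triangle", "black", "small"),
    ("circle", "black", "small"),
    ("circle", "white", "large"),
}

def flip_shape(obj):
    """Perturb one object along the shape dimension only."""
    shape, color, size = obj
    new_shape = "circle" if shape == "triangle" else "triangle"
    return (new_shape, color, size)

# Apply the transformation to every object, then see what survives.
perturbed = {flip_shape(obj) for obj in category}
unchanged = category & perturbed

print(sorted(unchanged))
print(len(unchanged), "of", len(category), "objects remain in the category")
```

The two small black shapes swap places under the transformation, so they stay inside the category, while the large white circle becomes a large white triangle and falls out.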

We can represent these partial invariants as a vector of discrete partial derivatives:

…where the double lines indicate where a change has occurred in the category with respect to a change in each of its dimensions, as per:

p in (2) is the number of objects in the category. Note that the partial derivative transforms each object with respect to its ith dimension, and will evaluate to 0 if the object is still in the category after the transformation (i.e. no change), and evaluates to 1 otherwise (changed). In other words, the expressions inside the double lines give the proportion of objects that have not changed in the category.

The overall categorical invariance ϕ of a category can then be expressed as:

The specific form of equation (3) relates to considering the Euclidean distance between structural or logical manifolds. See Vigo¹ for details.
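For readers who want the expressions written out, the following is a reconstruction of (1) through (3) based on the description above and on my understanding of Vigo's formulation. The symbols F (the category), C(F) (its set of objects), D (the number of dimensions) and Λ (the vector of partial invariants) are notational assumptions, so check them against Vigo¹ before relying on them.

```latex
% (1) vector of partial invariants, one entry per dimension
\Lambda(F) = \left(
  \left\lVert \frac{\hat{\partial} F}{\hat{\partial} x_1} \right\rVert,
  \dots,
  \left\lVert \frac{\hat{\partial} F}{\hat{\partial} x_D} \right\rVert
\right) \tag{1}

% (2) proportion of the p objects that stay in the category
%     when dimension i is perturbed
\left\lVert \frac{\hat{\partial} F}{\hat{\partial} x_i} \right\rVert
  = 1 - \frac{1}{p} \sum_{\vec{x}_j \in C(F)}
    \left| \frac{\partial F(\vec{x}_j)}{\partial x_i} \right| \tag{2}

% (3) overall categorical invariance as a Euclidean norm
\phi(F) = \left[ \sum_{i=1}^{D}
  \left\lVert \frac{\hat{\partial} F}{\hat{\partial} x_i} \right\rVert^{2}
\right]^{1/2} \tag{3}
```

Each inner partial derivative in (2) is 0 when the perturbed object is still in the category and 1 otherwise, which is why the expression inside the double lines is the proportion of unchanged objects.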

At this juncture we have 1) formalized the notion of a degraded mapping between solution and problem as an imposed restriction caused by a loss of connection between sets, and 2) referred to a mathematical expression for the concept of categorical invariance. Our last step is to relate these ideas to entropy. Remember, we want a physically real anchor with which to operate our quality heuristic.

Entropy is closely related to complexity. For example, within the context of algorithmic information theory we can consider the descriptive complexity of our category’s description set in terms of Kolmogorov complexity. The Kolmogorov complexity of a string is the length of the shortest program that outputs that string; colloquially, the smallest description one can make of the string.
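Kolmogorov complexity itself is not computable, but the length of a compressed encoding is a common rough stand-in for it (an upper bound, up to the overhead of the compressor). The sketch below uses Python's standard `zlib` module for that purpose; the two byte strings are toy inputs of my own choosing, and this is an illustration of the idea rather than a method the article prescribes.

```python
import zlib

def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (a crude complexity proxy)."""
    return len(zlib.compress(data, 9))

# Two inputs of equal raw length (128 bytes each).
homogeneous = b"ab" * 64             # highly repetitive
heterogeneous = bytes(range(128))    # every byte is different

small = compressed_size(homogeneous)
large = compressed_size(heterogeneous)

print(len(homogeneous), small)
print(len(heterogeneous), large)
```

The repetitive input shrinks to a fraction of its raw length, while the input with no repeated content cannot be shortened in the same way, mirroring the idea that a cohesive description has a shorter "smallest description" than a heterogeneous one.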

Vigo’s “structural complexity” Ψ is an expression that captures this basic behavior¹:

Since the numerator p is the number of objects in the category, and the denominator is the categorical invariance ϕ from (3), we can see that a category’s structural complexity Ψ is directly proportional to the description set’s cardinality, and inversely proportional to the category’s degree of invariance. Thus, the number of items in a category’s description will increase a category’s structural complexity, but this will be diminished by the degree of categorical invariance of the category.
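Putting the pieces together, here is a minimal sketch that computes the partial invariants, ϕ and Ψ for the three-object example from Figure 7. The 0/1 encoding of each dimension is an arbitrary labeling of my own, and the form Ψ = p / (ϕ + 1) is an assumption: the text only says that p is the numerator and ϕ is in the denominator, and I add 1 so that a category with zero invariance does not divide by zero.

```python
import math

# Binary dimensions: index 0 = shape, 1 = color, 2 = size.
# Assumed labels: triangle=0/circle=1, black=0/white=1, small=0/large=1.
category = {(0, 0, 0), (1, 0, 0), (1, 1, 1)}

def perturb(obj, i):
    """Flip the value of dimension i for a single object."""
    return tuple(1 - v if k == i else v for k, v in enumerate(obj))

def partial_invariant(cat, i):
    """Proportion of objects still in the category after perturbing dimension i."""
    changed = sum(1 for obj in cat if perturb(obj, i) not in cat)
    return 1 - changed / len(cat)

def invariance(cat, dims):
    """phi: Euclidean norm of the vector of partial invariants."""
    return math.sqrt(sum(partial_invariant(cat, i) ** 2 for i in range(dims)))

def structural_complexity(cat, dims):
    """psi = p / (phi + 1); the +1 is an assumed guard against division by zero."""
    return len(cat) / (invariance(cat, dims) + 1)

manifold = [partial_invariant(category, i) for i in range(3)]
phi = invariance(category, 3)
psi = structural_complexity(category, 3)

print(manifold, phi, psi)
```

Only the shape dimension contributes any invariance here (two of three objects survive), so ϕ comes out at roughly 2/3 and Ψ at roughly 1.8 under the assumed formula.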

We can now relate our mathematical edifice to entropy. Compare again the description sets for the local craftsman and global furniture company:

The description set of the global furniture company (DG) has higher cardinality, leading to higher structural complexity Ψ. The Kolmogorov complexity of DG is higher since it requires a longer description to summarize (cannot be compressed to a length as short as DC). Of course cardinality alone cannot produce an increase in complexity, unless the increased size of the description set is concomitant with increased heterogeneity.

To understand heterogeneity as it relates to a category’s description we have to keep in mind how categories are made. We create categories by grouping instances across one or more dimensions. In Figure 7, partial invariants could be expected because there was a level of cohesiveness (homogeneity) in the original description set. This is why certain transformations are unable to change the overall description; cohesiveness is a kind of redundancy that ensures symmetry under transformation.

Looking at the description set for the global furniture company (DG) we see a lack of cohesiveness in the description. It is more heterogeneous, not because there are more elements in the set (although this increases the chances a set will be more heterogeneous), but because it is perceptually more difficult to compress DG as much as DC.

Let’s be clear about what’s happening here. Notice in Figure 7 that the small black shapes have more in common with each other than the large white circle. This cohesiveness is what makes it compressible. When it comes to real world objects the dimensions need to relate to the problem being solved. Nails and glue are similar if we consider “bondability” a dimension. These are both used to keep things stuck together. While being different ingredients, nails and glue represent a kind of redundancy. These can be expected to act as partial invariants under perturbation.

We can even do a quick tally. How much can we compress DC? I would say to a cardinality (size) of 2. Why? Because glue and nails can be considered 1 element in terms of bondability, which leaves a 2 element set {wood, nails-glue}. How much can we compress DG? I would say to a size of 4. We already have nails-glue as 1 element, but sawdust, wood veneer and cardboard have distinct purposes (solid surface, structural support and wood appearance).
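The quick tally above can be sketched as grouping ingredients by the role they play. The role labels below, and the pairing of each factory ingredient with a particular role, are my own illustrative assumptions; the heuristic itself is meant to be done in your head, not in code.

```python
# Ingredient -> role it plays in solving the "table problem" (assumed labels).
CRAFTSMAN = {
    "wood": "surface and structure",
    "nails": "bonding",
    "glue": "bonding",
}
FACTORY = {
    "sawdust": "solid surface",
    "cardboard": "structural support",
    "wood veneer": "wood appearance",
    "nails": "bonding",
    "glue": "bonding",
}

def compress(description):
    """Group ingredients that share a role; each group counts as one element."""
    groups = {}
    for ingredient, role in description.items():
        groups.setdefault(role, []).append(ingredient)
    return groups

def compressed_size(description):
    """Cardinality of the description set after merging redundant ingredients."""
    return len(compress(description))

print(compress(CRAFTSMAN), compressed_size(CRAFTSMAN))
print(compress(FACTORY), compressed_size(FACTORY))
```

Under these labels the craftsman's set compresses from 3 elements to 2 and the factory's from 5 to 4, matching the tally in the text. Changing which dimension you group on changes the result, which is why the chosen dimensions need to relate to the problem being solved.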

This works as a heuristic because it isn’t some slow analysis, it’s a quick (“system 1”) type of thinking applied to real world objects. I can quickly review the ingredients of a coffee table and make an assessment as to its categorical invariance, and more broadly its structural complexity.

Again, it’s important to note that it was an object’s heterogeneity that led to poorer compressibility. Let’s take the counterexample of a shortbread cookie with fewer ingredients but also lower quality. You love your Grandma’s baking, but it might not always be the best thing to eat. Quality doesn’t mean homemade, it means something that maps to its purpose with good efficacy.

If Grandma’s cookie only has 3 ingredients compared to a mass produced “health cookie”, which has 4 ingredients, we might assume Grandma’s is higher quality.

But let’s use our heuristic:

***Figure 8*** *Example of attempting to compress an ingredient list based on redundancy, when defined in terms of cohesion across one or more dimensions.*

In this scenario we can argue that the cookie with a larger list of ingredients is in fact higher quality, since its description set can be compressed. In Figure 8, flour, sugar and butter play distinct roles in Grandma’s cookie, and don’t appear to share anything in common. But the cookie on the right has 3 ingredients that all have to do with flavor. Along the flavor dimension they represent a chance for compressibility.

Note that in the case of food, “quality” could refer to “healthy” since we expect food to impart nutritional value rather than support easy baking or aesthetics.

In the previous section I noted that we want our quality heuristic to satisfy a set of conditions related to the number of ingredients and the likelihood of an object bouncing out of its category, and to do so in relation to entropy. We also wanted to ensure that both obvious and surprising cases were covered. By grounding these ideas on a formula like structural complexity (4), or something similar, I argue that we achieve this.

Making Decisions

So what can we take from this in terms of real life decision making? As mentioned towards the beginning, it’s not useful to make dualistic decisions around quality. The requisite quality I want for a table may not be the local craftsman’s. There might be times when a mass produced snack bar is my only option in a rush. What really matters is the ability to assess the quality of something that we had no say in creating, using whatever information is available.

That information comes to us in terms of an object’s ingredients. We can usually determine either directly, or with a little research, what goes into the things we consume. I can find information on table manufacturing. By law, food and beverage companies must tell us what’s in the products we purchase. We can visually inspect the number of pieces on that garment or that spice rack. I can think twice about buying a book with a ton of footnotes compared to a clean volume that limits its jargon. A movie that jam-packs A-list stars can warn me of producers solving more problems than just trying to make a good movie. We can assess people by their number of acquaintances or the size of their to-do lists. A casual review of one’s codebase can hint at some serious code quality issues upfront.

We can improve our use of heuristics by developing a deeper appreciation for what it is we’re doing when we make decisions naturally in life. Keep in mind, it’s not about the size of the ingredient list, it’s about the heterogeneity. Humans are naturally good at finding connections between things. This is how we make sense of our world. We create abstractions by spotting patches of cohesive meaning between superficially disparate things.

Getting better at spotting the cohesiveness in a list of ingredients is something we can all do, but it’s also something we can develop as a skill. Did you immediately think of nails and glue as having something in common?

I hope the bit of rigor I introduced in this article brings some increased awareness to what quality is and how we assess it.