# Should Span Be Regular?

In my last post, I talked about the concept of Westie types (yes, I am trying to make this happen), what Regular means, and which of them are Regular. I went through an explanation for why means for `span` is not Regular and potentially why it should be. After lots of resulting conversations with several people (thanks Zach Laine, Nicole Mazzuca, Eric Niebler, John Shaw, Tim Song), I thought it was necessary to write a follow up with more details and more argument.

In Fundamentals of Generic Programming, Stepanov introduces four axioms to describe the equivalence between copy construction/assignment and the equality operator required for the Regular concept. However, the definition in that paper was insufficiently precise — it was too handwavy, in a way that made it difficult to carefully reason about different types. I saw lots of objections to either the axioms, the resulting arguments I made from them, or both. Eric Niebler filled in the gaps for me in the following way:

• Define the value of an object as what its `operator==` compares. For instance, the value of a pointer is its address. The value of a reference is the value of the object it refers to. The value of an `int` is… well, its value.
• In the fourth axiom
`T a = c; T b = c; zap(a); assert(b == c && a != b);`
We are defining `zap()` to be some operation that changes the value of the object. This axiom must hold for all `zap()`s, and the function itself does not have to live within the constraints of the C++ world.

This model makes it possible to clearly evaluate all types. For `string_view`, for instance, the value is its underlying range of `char`s. Even though it only provides `const` access to those, we can still say we have some `zap()` that changes one of those `char`s — and such a `zap()` would trigger the `assert` (because `b` would change along with `a`, and so `a != b` would fail). Thus, `string_view` is not Regular. By the same argument, `span` is not Regular.

More interestingly, we can evaluate a type whose comparison does not take into account the complete object. For instance:

Is this Regular or not? The value of a `Foo` is just its `x` member (this is what its comparison actually checks). It’s easy to see that the first three of Stepanov’s axioms hold. Does the fourth? A `zap()` that changes the value of the `x` member would cause `a != b` to be true (because we’d subsequently compare the now-different `x`s) and it would in no way affect either `b` or `c`, so `b == c` would continue to hold. Hence, `Foo` is Regular. Even though `y` isn’t compared! We’ll get back to this one later.

I’m going to take a brief aside and talk about references. References are not Regular, for at least the obvious reason that they aren’t default constructible. But even if that weren’t a requirement, references would fail the 4th axiom — if we had three references to the same `int`, for instance, and we change the value of one (which is to say, changing the underlying `int`'s value), we would change the value of all three of them. This fails the “independence” part of the assert.

Matt Calabrese gave a great talk at CppNow 2017 entitled “An Approach to Dealing with Reference Types in the Generic Programming Paradigm” — which, given the title, is obviously super relevant to this topic. In the middle of the talk, he introduces how to do assignment for `variant` assuming Regular types. And then, at 51:55 in the talk, he introduces a request to have a `variant` that contains a reference type. The example goes on to show that if we just simply use the types’ provided semantics (references are assignable — it’s just deep assignment):

We could run into trouble when we do this:

The problem here is the semantics of this program are extremely path dependent. When the `user_input()` is `false`, we put the variant in an empty state — this is fine. When the `user_input()` switches to `true` the first time, we construct a new reference. Also fine. But when it’s `true` the second time, we assign through to the initial reference. Completely different behavior dependent on what state we happened to be in before. This makes for code that’s extremely difficult to reason about.

This, to me, is a very compelling example. While the specific scenario is a little odd and contrived, the general flow is something that could easily be run into — and it makes for a good argument against a hypothetical proposal to allow reference types in `optional` and `variant` with deep assignment semantics, even though that’s what reference types’ assignment semantics are.

Back to `span`. I’ve heard from numerous people (including Matt) that `span` should be Regular (that is, its value should be its pointer and its length, not its underlying objects — this choice makes it satisfy all four axioms). However, what I have not yet heard is an example for the problems of an Irregular `span` that is as compelling as the above example for a deep-assigning `optional<T&>`.

My last post purported to have such an example — it was a counting sort which required the elements in the Range to be Regular:

And the post provided an example where this algorithm produces subtly unexpected behavior: if you provide a range of `span`s to different vectors, that nevertheless have the same value, the result of the sort is two `span`s to the same vector (whichever one comes first).

However, does this example really demonstrate a problem with `span` failing to be Regular?

Consider the `Foo` type from earlier in this post. `Foo` is a Regular type, as I showed, so we should have no problems. Let’s find out. Does it `counting_sort()`?

Do we end up with five distinct `Foo`s?

No. We end up with 5 copies of `Foo{1, 1}`.

So if this example is to be considered problematic for non-Regular `span`, it’s also problematic for Regular `Foo`. And likewise it would also be problematic for Regular `float`. If you exclude `NaN`, `float` is a Regular type, but positive and negative zero compare equal to each other — which is a distinction that this algorithm would erase.

As a result, I’m not sure that this algorithm actually demonstrates that there is a problem with `span` having deep comparison semantics. It merely demonstrates that if a type’s value is not entirely encompassing of all of its observable traits, some of those might be collapsed.

Consequently, I am left with no examples demonstrating a problem with `span` having deep comparison semantics.

Consider this post a request, in good faith, to everyone who thinks that `span`'s comparisons should be anything but deep (whether `span` should be Regular and have shallow comparison semantics or should not be comparable at all): Give me a compelling example that shows why this is bad.

By compelling example, I would prefer a negative example (that is, like Matt’s CppNow example, code that a programmer would expect to work but doesn’t). But I’d also be very interested in positive examples (that is, code that you could write if `span` were Regular, but can’t).

First one to do it gets a beer in Rapperswil!