The Foundations at the Core of C++ are Wrong — Part 6

Ian Joyner
9 min read · Jul 4, 2024

This is the sixth of a series of articles examining Bjarne Stroustrup’s writings about the core foundations in C++. The first part is here with links to the other parts:

Part 5

Origins of Class

“The C++ class concept has, in fact, proven itself as a powerful conceptual tool.” (1986 page 7) (1997 page 9)

I agree that the class is a very powerful tool, but the class is NOT a C++ concept: C++ is not the origin of the class, nor is the class unique to C++. Classes are the core of OO. Classes in C++ are bolted onto the side, rather than being the core concept. C++ did not invent this concept; it just does it badly, as more of an afterthought. That might be a benefit to C, but it is very bad for OO, undermining and compromising the whole concept. C++ started as C with Classes. Where did the idea of the class come from? It came from Simula 67 by Kristen Nygaard and O.J. Dahl. Read about the origins in part III of:

https://seriouscomputerist.atariverse.com/media/pdf/book/Structured%20Programming.pdf

The idea of inheritance also came from Simula 67 (67 is 1967, the year Simula was released). While some early implementations of Simula were not efficient, it was based on a real structured language, ALGOL. The inefficiencies of Simula were an implementation problem, not a language problem, although subsequent languages made improvements such as replacing the ‘inner’ statement, which calls an implementation in a subclass, with ‘super’, which calls from a subclass to the default in a parent class.

The structure of ALGOL with nested procedures suggested the class. C was not a real structured language: it compromised structured programming, just adopting the syntax to avoid assembler ‘coding’. In C++ the adoption of classes was also a compromise and has done much damage to OO.

Towards the end of ‘Thinking about Programming in C++’, and still in the third edition, Stroustrup claims: “Remember that much programming can be simply and clearly done using only primitive types, data structures, plain functions, and a few classes from a standard library. The whole apparatus involved in defining new types should not be used except when there is a real need.” (1986 page 8) (1997 page 16)

This is completely wrong. Defining types and how they operate is central to programming. Even BCPL, which had no abstract types, had machine-oriented types of word, byte, bit and a structure. The programmer had to map abstract types to these, numbers into words, characters into bytes, booleans into bytes or bits, etc. It was up to them.

B followed BCPL in not having abstract types, and Dennis Ritchie’s work was mainly adding simple abstract types, although even these were not that abstract, using the implementation-oriented float instead of the more abstract real. While a computer implementation of real cannot match the infinite set of reals (even between 0.0 and 1.0), we should still think of real as an abstract type with the same set of operations (large data set, few operators), and not as the underlying representation of floating point (or other representations of reals such as fixed point).

The most basic type, the boolean, was forgotten or considered too simple, requiring programmers to invent their own, or to accept false as being anything with a zero value (in its type) and true as being anything non-zero (this just saved Thompson typing ‘if x != 0’ instead of just ‘if x’, or rather ‘if (x)’ for ‘if (x != 0)’ in C).
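As a minimal sketch of the zero/non-zero convention described above, written in C++ (which inherits it from C), the two functions below are equivalent. The function names are hypothetical and only serve the illustration.

```cpp
#include <cassert>

// Hypothetical helper: a C-style condition, where any non-zero int counts as true.
bool c_style_truth(int x) {
    if (x) {            // shorthand for: if (x != 0)
        return true;
    }
    return false;
}

// The same test with the comparison written out explicitly.
bool explicit_truth(int x) {
    return x != 0;
}

// Hypothetical check that the two forms agree on zero, positive and negative values.
void check_equivalence() {
    assert(c_style_truth(0) == explicit_truth(0));
    assert(c_style_truth(7) == explicit_truth(7));
    assert(c_style_truth(-1) == explicit_truth(-1));
}
```

Note that the condition accepts any integer, so values that were never meant to be truth values can still be used as such.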

It is the job of language design to define the basic abstract types clearly and to provide clear ways to specify user-defined and extensible types. The user-defined type in OO is the class, defining the set of values in the type and the operations on those values (an algebra). The extensible aspect is inheritance, which may restrict values and/or add more operations. Smalltalk makes even basic types class-based objects. Eiffel has a slightly different view, mapping basic types to a machine’s efficient built-in types, but still having an abstract class view.

There should be no need for abstract types or modules other than the class. Yet C++ retains struct (which B removed from BCPL, but Ritchie put back in C), as well as C’s primitive modules in header files, lately something called modules, unions (polymorphic memory blocks), typedef, and namespaces for carving up C’s global space. This is why the class in C++ is a compromised concept, since programmers may use these other forms in preference, depending on their own ‘coding’ style. The class is watered down in C++ by these other forms that the class should have replaced. But then remember that if Stroustrup removed anything there would be howls of outrage.

The simplicity of OO with class = abstract type = module, subsuming all other forms is lost in C++. Class as abstract type and module really is a ‘powerful conceptual tool’, but that power is lost in C++.

Programming and computing are not pushing bits around in computer memory; they are dealing with abstract types and the operations on them to achieve an outcome. That is real computational thinking. OO systems should start with, and be arranged entirely around, classes, which (with genericity) are type generators. Classes define the values of a type and the operations on those values. This is also the definition of a mathematical algebra.

Although small example and exercise programs can be written without classes, and wrapping up a small program in a class might seem arduous, you don’t need to move much past 100 lines of program before multiple different concepts are in a single program, each of which can benefit from being in its own class. Stroustrup advises pushing types to the side as well, when types are the foundation of modern programming. Classes are type generators. It is not a one-to-one relation: a single class may generate many types when used with genericity.
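The idea of a class as a type generator can be sketched with a C++ template, which is the nearest C++ equivalent of genericity. The class name Stack and its operations are hypothetical, chosen only to illustrate one class text yielding several distinct types.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical generic class: one class text, many generated types.
template <typename T>
class Stack {
public:
    void push(const T& item) { items_.push_back(item); }

    // Removes and returns the most recently pushed item.
    T pop() {
        assert(!items_.empty());   // precondition: the stack is not empty
        T top = items_.back();
        items_.pop_back();
        return top;
    }

    bool empty() const { return items_.empty(); }

private:
    std::vector<T> items_;         // representation hidden behind the operations
};

// Each instantiation is a distinct type generated from the same class.
static_assert(!std::is_same<Stack<int>, Stack<std::string>>::value,
              "Stack<int> and Stack<std::string> are different types");
```

Here Stack<int> and Stack<std::string> share one definition of values and operations, yet the compiler treats them as separate, incompatible types.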

Types define compatibility between objects. It is not as simple as saying the objects must be exactly the same type: inheritance defines weak compatibility in parent classes and stronger compatibility in subclasses. That is, objects that can be classified under a parent class have less in common and so weaker compatibility, while objects based on a subclass have more in common but will not be compatible with objects of other subclasses.
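A small sketch may make the compatibility rule concrete. It assumes a hypothetical taxonomy (Shape, Circle, Square) that is not from the original text: any subclass object is accepted where the parent is expected, but sibling subclasses are not interchangeable, and the compiler checks both facts.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical taxonomy: Shape is the parent, Circle and Square are subclasses.
class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;   // the only commonality all shapes share
};

class Circle : public Shape {
public:
    explicit Circle(double radius) : radius_(radius) { assert(radius >= 0.0); }
    double area() const override { return 3.141592653589793 * radius_ * radius_; }
    double radius() const { return radius_; }   // shared by circles only
private:
    double radius_;
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) { assert(side >= 0.0); }
    double area() const override { return side_ * side_; }
    double side() const { return side_; }       // shared by squares only
private:
    double side_;
};

// Weak compatibility: any two shapes can be combined through the parent type.
double total_area(const Shape& a, const Shape& b) { return a.area() + b.area(); }

// Checked by the compiler: a subclass conforms to its parent...
static_assert(std::is_convertible<Circle*, Shape*>::value, "a Circle is a Shape");
// ...but not to a sibling subclass.
static_assert(!std::is_convertible<Circle*, Square*>::value, "a Circle is not a Square");
```

Calling total_area with a Square and a Circle compiles because both conform to Shape, while code that needs radius() must ask for a Circle specifically.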

OO is only bolted onto the side of C++, and programmers are even told to use classes and types only if really needed. This is why C++ has subverted OO far more effectively than any non-OO language could. It is the semblance of looking OO that is the danger. Just as Simula completely inverted the structure of ALGOL, OO turns software design on its head. Instead of the procedure being the fundamental structure, the class becomes the base. But in languages like C++, the class is only used as an option.

Some entities in a program will be class based and hence statements about commonality, compatibility, or non compatibility can be made, and furthermore checked by a compiler. However, entities not based on classes won’t be able to have such reasoning, either by programmers or automatically in the semantic analysis of a compiler or other tools. This makes C++ a weak language and particularly a weak OO language.

Inheritance is a curious case because we could have a strategy of not using inheritance until it is completely obvious that inheritance would give better structure. Inheritance seems a defining characteristic of OO, so many programmers are taught inheritance first. They then bend systems around inheritance, using it in inappropriate ways and not understanding the strength of the ‘is a’ relationship, which is often used where the weaker ‘has a’ or ‘contains’ relationship is required. This results in implementation inheritance rather than a type taxonomy, from which implementation reuse naturally arises. Such a strategy of restraint is not the same as putting inheritance into a language as an afterthought, as it is with C++.
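The difference between ‘is a’ and ‘has a’ can be sketched as follows. The classes Engine, MisusedCar and Car are hypothetical examples, not taken from any source discussed here.

```cpp
#include <cassert>
#include <type_traits>

class Engine {
public:
    void start() { running_ = true; }
    bool running() const { return running_; }
private:
    bool running_ = false;
};

// Misuse of 'is a': inheriting only to reuse Engine's implementation.
// "A car is an engine" is false, yet this compiles, and a MisusedCar
// can now be passed wherever an Engine is expected.
class MisusedCar : public Engine {};

// 'has a': the car contains an engine and exposes only car operations.
class Car {
public:
    void drive() { engine_.start(); }
    bool moving() const { return engine_.running(); }
private:
    Engine engine_;
};

static_assert(std::is_convertible<MisusedCar*, Engine*>::value,
              "inheritance makes the car substitutable for an engine");
static_assert(!std::is_convertible<Car*, Engine*>::value,
              "containment keeps the two types separate");
```

Both versions reuse the Engine code, but only the inheritance version makes a claim about the type taxonomy, and in this case the claim is wrong.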

The first things programmers are taught seem to be what they want to use. Programmers taught goto have difficulty with the higher abstractions of loops. Programmers taught C pointers have problems understanding the abstractions of references and links (to objects), why C pointers (to memory locations) are so problematic, and why pointers are not really part of computation, since they define locations and access paths, not data types. Inheritance taught early (since it seems to be a defining quality of OO) means programmers misuse it.

Smalltalk-72 did not have inheritance, and Alan Kay does not see inheritance as an essential property for OO, even though there is a common (but perhaps wrong) definition of OO as classes + inheritance. By Smalltalk-80 (the public release of Smalltalk), inheritance had been introduced. While inheritance is important in type taxonomy, it is not a mechanism that should be used for reuse of implementation code (that follows as a natural result). If I read Alan Kay correctly, OO is about the dynamic operation and interaction of objects via messages, not the static structuring and taxonomic organisation of classes and inheritance.

Classes are the core of OO, and not a bolt-on to be used only if we feel like it. C++ is a mix that keeps old procedure-oriented programming, with its scopes and global environments. In OO, scope is restricted to a class (statically) or an object (at run time), and there should be no global environment; keeping one is what makes C++ such a messy mixture. C++’s view of classes as being an optional extra is part of the reason why C++ has so effectively undermined OO.

Kay’s vision is for the interaction between objects alone, not the interactions of objects within a context that may change the nature of the interaction. Global environments have state that changes objects’ interactions, and this results in programs that are difficult to reason about and debug. Students who have only written small example programs as learning exercises find it hard to understand why there are prohibitions on globals, gotos, and pointers. It is made worse when we have languages like C++ that do little to reduce the use of these things, and then have advocates who defend such languages and deny what is bad about them.

Globals set up dependencies between distant areas of programs. Dependencies result in inflexible software. These are exactly the kinds of difficulty OO was trying to avoid, but C++ just went and put it right back in. Class and object scope is encapsulation. Globals break encapsulation. Globals are not encapsulated or protected, they are able to be changed from anywhere. While anything ‘global’ is powerful (not just in programming) it is the unrestricted and unconstrained nature that makes them difficult to reason with.
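A minimal sketch of the contrast, assuming a hypothetical bank balance as the state to protect: the global can be changed from anywhere, while the class only lets its own routines change the value and so can maintain an invariant.

```cpp
#include <cassert>

// Global state: any function anywhere can change it, so no invariant can be relied on.
int g_balance = 0;

// Hypothetical distant code; nothing prevents it from corrupting the global.
void unrelated_code() { g_balance = -1000; }

// Encapsulated state: only the routines of the class can change the balance.
class Account {
public:
    void deposit(int amount) {
        if (amount > 0) balance_ += amount;
    }

    // Refuses any withdrawal that would break the invariant.
    bool withdraw(int amount) {
        if (amount < 0 || amount > balance_) return false;
        balance_ -= amount;
        return true;
    }

    int balance() const { return balance_; }

private:
    int balance_ = 0;   // invariant: never negative
};
```

To reason about Account, a reader only needs the few lines of the class; to reason about g_balance, they need the whole program.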

Reasoning about software requires locality. We can control the local. Global corporations can move their tax base to low-taxing jurisdictions or countries where laws of fairness do not apply. Globals are difficult to control. The global climate crisis is difficult to control because we have too many rogue countries. Global warming is a global crisis and we are all dependent on it. That is the kind of thing we want to avoid in programming. We want our national boundaries to encapsulate us from what we consider bad in other countries. But nature does not respect national boundaries.

C++ namespaces are just a way of better organising globals, but not banishing globals as they should be. They are like country borders and nature does not recognise or respect borders. Birds and weeds can cross the border unimpeded. Global spaces are like uncontrolled borders. Encapsulation is to protect the contents of an object so that the routines in the object have control over that object. Globals are in no-man’s-land and are unprotected and uncontrolled.
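The point about namespaces can be illustrated with a short sketch. The namespace and function names are hypothetical; what matters is that qualifying the name does not restrict who may write to the variable.

```cpp
#include <cassert>

namespace bank {
    int balance = 0;    // still a global; only its name is qualified
}

namespace reports {
    // Code in a different namespace crosses the 'border' unimpeded.
    void tidy_up() { bank::balance = -1; }
}

// Hypothetical demonstration: the namespace offers no protection at all.
void demonstrate() {
    bank::balance = 100;
    reports::tidy_up();
    assert(bank::balance == -1);
}
```

The namespace avoids name clashes, which is useful, but it provides none of the control over state that a class gives to its own routines.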

For more on Globals:

https://archive.eiffel.com/doc/manuals/technology/bmarticles/joop/globals.html

Barbara Liskov noted that ALGOL had a good idea and a bad idea. The good idea was blocks (scope) and the bad idea was blocks (that is not a typo!). See the video in this post:

https://www.quora.com/Are-C-scopes-function-based-like-in-JavaScript-or-block-based/answer/Ian-Joyner-1

ALGOL has arbitrarily nested blocks to many levels deep. This feature of ALGOL led to classes in Simula 67. That is the origin of the idea of the class. As a simplification, C only has (what it calls) functions (which are named and called blocks, not true functions), and one huge global block (perhaps this simplification was made in B/CPL). Apart from the interaction of objects with external state in a dynamic system, globals bake in static dependencies between all parts of software, making software difficult to maintain and update, that is, inflexible. Modules are meant to remove that, and hence classes are modules (which is not what C++ classes are). But with C++ this is not strict, but left to programmers who might not feel like it. If they don’t understand the real benefits of classes, they won’t feel like it. As an optional add-on, the C++ class won’t help programmers understand the real benefits of classes.

Part 7
