The Foundations at the Core of C++ are Wrong — Part 2

Ianjoyner
9 min read · Jul 2, 2024

This is the second of a series of articles examining Bjarne Stroustrup’s writings about the core foundations in C++. The first part is here with links to the other parts:

https://medium.com/@ianjoyner/e1ea529bcf60

System Programming and Application Programming — completely different problems

Just because you can use something, doesn’t mean you should.

System languages can be used for any programming because they have all the general-purpose language facilities. But it is the extras in system programming (often dangerous and insecure) that mean system languages should not be used much above core machine requirements. System programming exists to handle the problems in the platform itself; the machine is the problem. Its purpose is to provide clean abstractions on which applications run.

When system programmers don’t understand abstraction, or which abstractions are required in different domains, poor systems will result. Application programming deals with real-world problem domains. Machine and platform details detract from that. This is why the C approach to wider programming is wrong, and C++ just exacerbates the problems of insecurity, difficulty in getting programs working, inflexibility, and lock in.

Why give application-level programmers facilities they should not use? It means the system must constantly check that a program is not doing something untoward. System programming, which is by its very nature low-level, unsafe, and insecure programming to control aspects of the machine, should be restricted.

System languages can be, but should not be, used for application programming.

A little anecdote. When I was young I taught a course in Burroughs system ALGOL to COBOL programmers at a customer site. They were very resistant and in the end did not learn much. I could not understand why. ALGOL was a much nicer and more modern language than COBOL. But Burroughs system ALGOL is a system language with extensions for events, interrupts, etc., that problem-domain programmers very rarely need if the system programmers have done their job right.

In retrospect I can see they were right and I was wrong. I had thought it was just the ‘language I’ve learnt’ effect. That was a part of it, but they had little need for system-programming facilities. COBOL (for all its verbosity, horror, and limited facilities for general programming beyond simple read-a-record, process, write-a-record paradigm) actually did what they needed. To do that you had to bend ALGOL a fair bit, rolling your own (particularly records in Burroughs ALGOL), even though the Burroughs extended form put in all the things that standard ALGOL 60 lacked. While ALGOL could do everything, it was not with the convenience of COBOL for business transaction processing. Burroughs ALGOL had many system features like event handling and interrupts (where are those in C?). But business programmers really didn’t need any of that neat stuff. So let’s repeat the above.

System languages should not be used for application programming. Details are exposed that should not be, and not just because applications don’t need them but because they are unsafe and insecure.

We now have far better languages than FORTRAN and COBOL for high-level programming. We can do much better than using system languages, especially primitive coding languages such as C and C++.

C++ did not embrace ‘modern’ or high-level approaches, but rather the primitive approach of C, just adding syntax. The structured approach of OO is not only compromised by any use of C-style structuring around globals; most programmers will also think of OO as something tacked on the side, using the odd class here and there rather than using classes to structure a system completely. That completely compromises OO. It might be an improvement on C to use a class occasionally, but it misses the real benefit: the structuring and organisation of entire programs. Furthermore, it hinders programmers from really understanding what OO is about.

While both system and application languages have much in common they have a different emphasis — system languages exposing low-level and machine-oriented features whereas general-, application-, or domain-oriented languages are oriented to the problem. Adding application or domain features to a system language (like structured syntax or OO) does not make a system language more suitable for those areas. It is the system additions that are inappropriate because they are the details that should have been abstracted.

Again, while system languages can be used for general programming, they should not be. Even areas that must interact more closely with hardware, such as embedded systems or (maybe) games, can do better with domain-oriented languages.

The fundamental principle here is ‘Separation of Concerns’.

Based on C — An Old, Primitive System Language

“C was chosen as the base language for C++ because it (1) is versatile, terse, and relatively low-level; (2) is adequate for most system programming tasks;…” (1986 page 4).

C is versatile because programming is versatile. All programming languages are versatile. To think one is more versatile than others is to not understand programming (this is related to Turing Completeness). C throws everything in. But that is not good for separation of concerns. As C.A.R. Hoare once remarked: “A language is characterized not only by what it permits programmers to specify, but even more so by what it does not allow.”

Alan Perlis (first recipient of the ACM Turing Award for computing): “A programming language is low level when its programs require attention to the irrelevant.” David Chisnall, quoting this definition in ‘C Is Not a Low-Level Language’, adds: “While, yes, this definition applies to C, it does not capture what people desire in a low-level language.” With C the versatility comes at the expense of ‘Separation of Concerns’.

We could say that assembler or machine language is versatile because computation is versatile, at least if you want to do anything computable. There is no machine with a magic instruction that can do something others can’t. Similarly, there is no language with some magic statement or instruction that makes it more computationally capable than other languages or machine code. But advocates of C and C++ often talk as if these languages must have something magic that makes them powerful where others aren’t. It is programming that is powerful, not any language. Languages are just window dressing helping programmers organise and express their thoughts better.

Yes, C is terse (perhaps that should be tersatile). Minimal is good. Economy of expression is good. But replacing keywords with symbols is irrelevant. We now have key symbols. It is just syntactic lexical substitution for something that is still necessary, like demarcating the start and end of blocks.

C is terse to the extent of sacrificing clarity. Using ‘=’ for assignment is wrong and confuses assignment with equality. This is just to save typing the extra ‘:’ of ‘:=’. Ken Thompson was obsessive about this since he was typing on really old clunky teletypes (find the photographs) where even holding a shift key was considerable effort. The result is a trap waiting for programmers to fall into, and many do. Good syntax means there are no traps to fall into. This is what good language design is about.
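The trap can be shown in a few lines. This is only a sketch: the function names are hypothetical and invented for illustration. The first function compiles as legal C++, although many compilers will warn about it when warnings are enabled.

```cpp
// Hypothetical helpers, invented for illustration only.

// Intended meaning: "is count zero?". The single '=' assigns 0 to count
// instead of comparing, so the condition is always false.
bool is_zero_buggy(int count) {
    if (count = 0) {   // assignment, not comparison
        return true;
    }
    return false;
}

// What the programmer meant: '==' compares and changes nothing.
bool is_zero(int count) {
    return count == 0;
}
```

Calling `is_zero_buggy(0)` yields false even though the argument is zero, which is exactly the kind of mistake a distinct assignment symbol such as ‘:=’ would rule out.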

C’s terseness gives rise to operators like ++, which are primitive, mapping to the auto-increment operations of DEC machines. That might sound good, and it is attractive until one understands (and that can take a while) that ++ is a side-effect operator allowing more than one update per statement, which can lead to all kinds of problems, including undefined behaviour. It seems some teachers love setting questions on what some really complex expression involving ++ would result in. That is not teaching programming; it is teaching how to avoid traps that should not be there.
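As a rough illustration of ‘more than one update per statement’, here is the same copy loop written twice. The function names are hypothetical; the undefined expression is left in a comment rather than compiled.

```cpp
#include <cstddef>

// Terse style: the loop body performs three updates in one statement
// (the element store plus two pointer increments).
void copy_terse(const int* src, int* dst, std::size_t n) {
    while (n-- > 0) {
        *dst++ = *src++;
    }
}

// Explicit style: one update per statement, so the order of effects
// is never in question.
void copy_explicit(const int* src, int* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; i = i + 1) {
        dst[i] = src[i];
    }
}

// Deliberately not compiled: i is modified twice with no sequencing
// between the two operands of '+', which is undefined behaviour.
//     int i = 1;
//     int r = i++ + ++i;
```

Both functions copy the same elements; the difference is only in how much the reader has to untangle per line.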

The difference between languages is how clear they are for expressing program semantics (that is, meaning, i.e., what the program does), how they support thinking and reasoning, how they support programmers, and how they support the whole project lifecycle. C was not designed specifically for that.

C is low level, but not just low level, it is primitive. In fact, for low-level operations it is lowest-common-denominator, hiding some (rather than abstracting) low-level semantics of platforms, exposing others. C is intended as a language for system programming, and C++ inherits that. System languages should not be used for general software, which now is 99% of programming (as it should be).

We want systems that do something useful, and just managing themselves is not productive. System programming is to provide an abstract and rational platform for application programming. System programming resolves (maps) requests for logical resources to physical resources. It should not be spread throughout application programs. This results in lock in — and C and C++ have achieved that for all the wrong reasons. Low-level and primitive programming results in lock in. Lock in results in reluctance to modify software since we can break working software. The meaning of ‘soft’ in software is compromised.

We have compilation because it is easy to translate from a high level to a low, more primitive level. Low levels lack semantics — it is difficult to work out what programs do. It is much more difficult to reverse engineer from machine or assembler code to a high-level representation, or to work out what an executable is doing from the machine code or a symbolic representation of it. This is why several high-level languages target C or C++ as intermediate code.

It is harder to reverse from C or C++ to a true high-level language. Using C or C++ as intermediate code does not mean that C or C++ are the basis of other languages or those languages are dependent on C or C++. Good languages do not depend on translational or operational semantics, that is expressing their meaning in terms of the target platform, be it machine code or C/C++ code. They are only using C/C++ as a common assembler.

Coding in assembler results in lock in to the processor architecture. Coding in C and C++ results in lock in to C-based platforms.

But what systems programming tasks is C not suited for? C is not suited for really low-level hardware-oriented stuff. For that C needs some assembler (which can be inline). We don’t need assembler in general at all. Assembler hides semantics — that is the reverse of abstraction. Abstraction is interpretation and meaning.

We should have small platform-specific languages tied to the semantics of the platform that directly express what the platform does. C’s primitive facilities hide low-level semantics. Only parts of the software that rely directly on the hardware semantics should be written in such languages, keeping higher-level software independent and flexible, avoiding lock in. We should not rely on memory mapping, raw addresses (pointers) and defines (to put in some primitive semantics). Sure, macros are powerful and fun, but they are only a cheap-and-nasty way of making a language extensible.
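Within C++ itself, the nearest approximation to this separation is to confine the hardware detail to one small component. The sketch below assumes an invented register layout (bit 0 drives a status light); every name in it is hypothetical, and it stands in for the platform-specific language argued for above rather than replacing it.

```cpp
#include <cstdint>

// Hypothetical interface: all that higher-level code is allowed to see.
class StatusLight {
public:
    virtual ~StatusLight() = default;
    virtual void set(bool on) = 0;
};

// The only component that knows about the hardware. The layout
// (bit 0 = light) is an assumption made up for this example.
class RegisterStatusLight : public StatusLight {
public:
    explicit RegisterStatusLight(volatile std::uint32_t* reg) : reg_(reg) {}

    void set(bool on) override {
        if (on) {
            *reg_ = *reg_ | 1u;    // set bit 0
        } else {
            *reg_ = *reg_ & ~1u;   // clear bit 0
        }
    }

private:
    volatile std::uint32_t* reg_;
};

// Application-level code: no addresses, no bit masks, no defines.
void report_result(StatusLight& light, bool success) {
    light.set(success);
}
```

On real hardware the pointer would be a device address; for testing, an ordinary variable can play the register, which is one practical benefit of keeping the raw address out of the application code.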

Remember Alan Perlis (first recipient of the ACM Turing Award for computing): “A programming language is low level when its programs require attention to the irrelevant.” And remember David Chisnall’s comment on that definition: “While, yes, this definition applies to C, it does not capture what people desire in a low-level language.”

Based on C — An Old, Command-and-Control Language

This is the even more important reason why C is a poor basis for an OO language: it is command and control. The procedural paradigm says start in a main program like ‘main’ in C. You explicitly call what you want, and execution proceeds this way, always actively calling what needs to be done.

Object-oriented programming inverts this style of programming. Instead of calling, you are called. This is a passive kind of programming. At first, it is hard to understand, but it is a profound difference in how software is developed. A programmer writes a module. That is called. You don’t care where from. This is the ‘don’t call us — we’ll call you’ paradigm.
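A minimal sketch of this inversion, with hypothetical class names invented for illustration: the programmer writes `Logger`, which never calls out and simply reacts when something calls it, while `EventSource` stands in for whatever part of the system owns the flow of control.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical caller: it decides when anything happens.
class EventSource {
public:
    using Handler = std::function<void(const std::string&)>;

    void subscribe(Handler handler) {
        handlers_.push_back(std::move(handler));
    }

    // Calls every registered handler; the handlers do not know who or why.
    void publish(const std::string& event) {
        for (const auto& handler : handlers_) {
            handler(event);
        }
    }

private:
    std::vector<Handler> handlers_;
};

// The programmer's module: passive, independent of where it is called from.
class Logger {
public:
    void on_event(const std::string& event) { seen_.push_back(event); }
    const std::vector<std::string>& seen() const { return seen_; }

private:
    std::vector<std::string> seen_;
};
```

A caller would register the module with something like `source.subscribe([&logger](const std::string& e) { logger.on_event(e); });` and from then on `Logger` is only ever called, never calling.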

Rather than centralised control, control is distributed. OO is a passive way of programming. Modules in programming mean we don’t care how we are called, or where we are called from. Our module should be independent of context. You just program the bit you have been given to do. You focus on that and get it correct. In return, your module should not depend on other modules. Interactions are clear through interfaces forming the connections between modules, not depending on anything external in a global environment.

Programmers like to be in control and thus the OO paradigm is against their instinct. This is likely why C++ is popular: it retains this command-and-control feeling. But it is wrong, and this damages OO at the core of the language. In the early stages, programming is taught as command-and-control. But this is for simple programming and the exercises a course can set. To learn large-scale complex software development, programmers must learn about modules, independence, and passive programming.

Much system programming is like this, providing functionality to other programs without knowing what those programs might be doing. ‘Write this file to disk’: the system does not care what the file is.

The connections between program components are looser and more independent. The programmer’s job is to write modules correctly and efficiently. A program is correct if all components are correct. In object-oriented programming the class is the module.

The correctness of a system rests on the correctness of all its components, from single statements and expressions down to subexpressions; this is the basis of Hoare Logic. If any component is incorrect, the system is incorrect, even if a bug might be latent for a long time. Bugs are often like landmines: be careful where you step. But we would rather the landmines weren’t there, lying patiently in wait for a victim. Proof of correctness or incorrectness is recursive. This is the basis of Design by Contract, which forms the basis of some OO languages, even more than classes.
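C++ has no built-in contract syntax comparable to languages designed around Design by Contract, so the sketch below approximates preconditions, postconditions, and a class invariant with `assert`. The `BoundedStack` class is hypothetical and only illustrates the idea that each component states what it requires and what it guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class, invented for illustration.
class BoundedStack {
public:
    explicit BoundedStack(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);              // precondition
    }

    bool empty() const { return items_.empty(); }
    bool full() const { return items_.size() == capacity_; }
    std::size_t size() const { return items_.size(); }

    void push(int value) {
        assert(!full());                   // precondition: caller's obligation
        const std::size_t old_size = size();
        items_.push_back(value);
        assert(size() == old_size + 1);    // postcondition: our guarantee
        assert(invariant());
    }

    int pop() {
        assert(!empty());                  // precondition
        const std::size_t old_size = size();
        const int value = items_.back();
        items_.pop_back();
        assert(size() == old_size - 1);    // postcondition
        assert(invariant());
        return value;
    }

private:
    // Class invariant: must hold after every public operation.
    bool invariant() const { return items_.size() <= capacity_; }

    std::size_t capacity_;
    std::vector<int> items_;
};
```

If every routine honours its contract, a caller can reason about `push` and `pop` without reading their bodies, which is the recursive style of correctness described above. Note that `assert` checks vanish when `NDEBUG` is defined, so this is a weaker mechanism than contracts built into a language.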

C++ retains the old command-and-control thinking from C. Programmers might feel comfortable with that, but it misses the point of OO. C++ really does little to help programmers to find the new way of thinking.

Part 3
