Programming Language Evaluation Criteria Part 1: Readability

ngwes
Nov 9, 2019


Icons by Freepik, FlatIcon

In order to understand the various constructs of a programming language and its capabilities, it is useful to know some evaluation criteria.

In this article I will present and comment on the main evaluation criteria for a programming language.
Be careful though: when choosing a language for a certain task, it is not enough to consider the criteria I explain here; you must also consider how well the language fits the specific task.
For example, Python is an excellent language for numerical computation or data analysis, but it is a poor fit for hard real-time systems. Conversely, C is excellent for programming microcontrollers, but it becomes cumbersome for complex numerical calculation.
Based on the objective to be achieved, we can identify the languages that can be used and how well each one supports the problem to be solved; the choice of which one to use (or start learning) can then be based on what I’m about to tell you.
Before we start, let me say that everything I tell you here is freely inspired by the book “Concepts of Programming Languages”, by R. W. Sebesta.

Before we begin discussing the concepts of programming languages, we must consider a few preliminaries. First, we explain some reasons why computer science students and professional software developers should study general concepts of language design and evaluation. (Robert W. Sebesta)

All languages have elements in common, which can be used to compare them. To make that comparison, you need criteria that apply to these common elements.

Fortunately, Sebesta describes some of them. Which ones?
• Readability
• Writability
• Reliability
• Cost

Each of these criteria is influenced by a number of language qualities, which are decided and implemented during the design of the language.
The design, therefore, determines the characteristics that make a language well suited to a specific task.

Icon by Gregor Cresnar, FlatIcon

Readability

The first and most obvious criterion is certainly readability.
Before the ‘70s, when computing resources and memory were scarce, programming was driven by efficiency. As a result, the design of programming languages reflected this priority.
Even today we can see this push for efficiency in the names of some C libraries and functions: names like “stdio” or “strcmp” are common. Although their meaning can be worked out, today we would see little reason for names like these and would likely criticize them as hard to read.

From the ‘70s onwards, we began to understand that one of the main components of software production was not the initial writing, but modification and error correction; in short, maintenance. This understanding was part of the emerging concept of the software life cycle: software development is divided into different phases. Among these phases, maintenance is the most critical.
The focus had shifted from efficiency to programmer productivity.

Ease of maintenance is strongly influenced by readability. Why? The person who maintains the software is often not the person who wrote it, so it is fundamental that they can understand how the software behaves and the reasons behind some choices.

Remember to consider readability in the application context. A language is easy to use when it is applied to the contexts it was designed for. For example, using a language designed for data analysis to build a DBMS will very likely make the code difficult to read.

Simplicity
Simplicity and readability certainly go together. A language with many constructs can be complex to understand, hard to learn and, above all, boring.

Why is this a problem? Because a programmer will typically learn only a subset of the language constructs.
Can this be okay? Let’s look at it with an example.
Suppose that Bob has learned and used a small set of constructs in his project; he has been working on it for the past 6 months and his product is a large amount of code spread over dozens of files. A year later Alice, who has learned a different subset of constructs than Bob, has to maintain Bob’s work, and she will have a lot of difficulty understanding what Bob had in mind. She will spend hours and hours on Stack Overflow, cursing Bob for being a bad programmer.

In simple words: this could happen when a language makes it possible to do the same thing in different ways.
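
As a small illustration, here is a sketch in Python (the function names are hypothetical and not taken from the book) with three equivalent ways of squaring every number in a list. Bob might know only the first one, and Alice only the last.

```python
# Three equivalent ways to square every number in a list.
# The function names are illustrative only.

def squares_loop(numbers):
    # Explicit loop with an accumulator list.
    result = []
    for n in numbers:
        result.append(n * n)
    return result

def squares_comprehension(numbers):
    # List comprehension.
    return [n * n for n in numbers]

def squares_map(numbers):
    # Higher-order function with a lambda.
    return list(map(lambda n: n * n, numbers))

data = [1, 2, 3]
print(squares_loop(data))           # [1, 4, 9]
print(squares_comprehension(data))  # [1, 4, 9]
print(squares_map(data))            # [1, 4, 9]
```

All three produce the same result, but a reader has to recognize three different constructs to see that.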

Another example of a simplicity problem is operator overloading. Although it is a very useful feature in a language, it can cause confusion if the reader does not know how the operator was overloaded.
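
To see how this can go wrong, consider this minimal Python sketch. The Playlist class is hypothetical and exists only for this example: its author decided that “+” should return the songs two playlists have in common, which is probably not what a reader expects.

```python
class Playlist:
    """Hypothetical class used only to illustrate operator overloading."""

    def __init__(self, songs):
        self.songs = list(songs)

    def __add__(self, other):
        # Surprising choice: "+" keeps only the songs present in both
        # playlists (an intersection), not the concatenation of the two.
        common = [s for s in self.songs if s in other.songs]
        return Playlist(common)

a = Playlist(["intro", "track1"])
b = Playlist(["track1", "outro"])
print((a + b).songs)  # ['track1']
```

Someone reading only the expression a + b cannot know what it does without opening the class definition.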

However, excessive simplicity can be a problem too. To understand why, let’s take an example with Lego bricks. We can make the most imaginative and complex constructions using only five basic blocks. However, we would certainly spend a lot of time completing a complex construction and would use a large number of blocks. If, instead, we had more complex pieces available, perhaps ready-made to represent the roof of a building or a wall, we would spend less time, the construction would be simpler, and we would use fewer pieces overall.

As another example, let’s look at assembly code.
Many modern processors use a RISC ISA. This means that, to perform a complex task, you need a combination of several instructions: often at least two memory access operations and one ALU (arithmetic logic unit) operation. One high-level instruction translates into several low-level instructions, as if we were splitting a large instruction into simpler ones.
Following the data and control flow of a program through a long sequence of tiny instructions can be complicated.

C code:
int square(int num) {
    return num * num;
}

Assembly Code RISC-V gcc 8.2.0
square(int):
        addi    sp,sp,-32
        sd      s0,24(sp)
        addi    s0,sp,32
        mv      a5,a0
        sw      a5,-20(s0)
        lw      a4,-20(s0)
        lw      a5,-20(s0)
        mulw    a5,a4,a5
        sext.w  a5,a5
        mv      a0,a5
        ld      s0,24(sp)
        addi    sp,sp,32
        jr      ra
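
The same effect can be observed without a C compiler. As a rough analogy only (Python bytecode targets a stack-based virtual machine, not a RISC processor), this sketch uses the standard dis module to list the low-level instructions behind the same one-line function. The exact instructions depend on the interpreter version, so no output is shown.

```python
import dis

def square(num):
    return num * num

# List the bytecode instructions the interpreter runs for square().
# The exact opcodes vary between Python versions.
instructions = list(dis.get_instructions(square))
for instr in instructions:
    print(instr.opname, instr.argrepr)
```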

So, too little simplicity makes a language hard to understand, while too much simplicity makes programs hard to read. Remember, virtue lies between the extremes.

Orthogonality
The definition of orthogonality is a little cumbersome: it is the ability to combine the primitive constructs of a language, in a small number of ways, to build data and control structures.
If every possible combination of primitives is allowed and makes sense, then we have total orthogonality.
A lack of orthogonality leads to exceptions in the rules of the language.

For example, suppose we could not define pointers to arrays in the C language; this would prevent us from building very useful data structures, even though both pointers and arrays exist in the language.
Or imagine being forced to use two different operators for addition, one for integers and one for floats. The sum is conceptually the same operation, but we could not use the same operator on different types (int or float).
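
To make the second case concrete, here is a sketch in Python that simulates such a language with two hypothetical functions, add_int and add_float (both invented for this example), next to Python’s real “+” operator, which accepts both types.

```python
def add_int(a, b):
    # Hypothetical integer-only "operator".
    if not (isinstance(a, int) and isinstance(b, int)):
        raise TypeError("add_int accepts only integers")
    return a + b

def add_float(a, b):
    # Hypothetical float-only "operator".
    if not (isinstance(a, float) and isinstance(b, float)):
        raise TypeError("add_float accepts only floats")
    return a + b

print(add_int(1, 2))        # 3
print(add_float(1.5, 2.0))  # 3.5

try:
    add_int(1, 2.0)         # mixing the types is an exception to remember
except TypeError as err:
    print("error:", err)

# Python's real "+" works the same way on both types, and on a mix of them.
print(1 + 2, 1.5 + 2.0, 1 + 2.0)  # 3 3.5 3.0
```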

You can see that a certain amount of orthogonality is fundamental for readability.
Even in this case, however, be careful. Too much orthogonality can become complex to manage. Complexity grows as the number of language primitives increases, because the number of possible combinations increases.
A small number of constructs combined with a limited use of orthogonality gives a good level of simplicity.
Simplicity is the foundation of readability.

Let’s look at the simplicity that orthogonality brings, using pure functional languages as an example.
These languages are “simple” because they can perform any task with a single construct, the function call, possibly nested inside other calls.
The computation proceeds through successive function calls applied to given parameters.
In imperative programming (think of it as the counterpart of functional programming), on the other hand, the computation is performed through variables, assignments and step-by-step algorithms. There is a difference in terms of simplicity.
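
The difference can be sketched in Python, which supports both styles. The two function names below are hypothetical; both compute the sum of the squares of a list of numbers.

```python
from functools import reduce

def sum_of_squares_imperative(numbers):
    # Imperative style: a variable updated by repeated assignment.
    total = 0
    for n in numbers:
        total = total + n * n
    return total

def sum_of_squares_functional(numbers):
    # Functional style: only nested function calls, no reassignment.
    return reduce(lambda acc, n: acc + n, map(lambda n: n * n, numbers), 0)

print(sum_of_squares_imperative([1, 2, 3]))  # 14
print(sum_of_squares_functional([1, 2, 3]))  # 14
```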

However, as you can guess by now, everything has a cost. The cost of this kind of simplicity is often a loss of efficiency.

Data Types
Adequate ways to define data types and data structures are also important for readability.
Imagine not having a Boolean type; you would be forced to use numeric values to stand for the logical values True and False.


bool isEnd = true;
int isEnd = 1;

The second form confuses the reader, who cannot tell whether the number ‘1’ is meant as a logical or a numerical value. The reader has to rely on good variable naming, a habit that not every programmer has.
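
A short Python sketch of the same idea follows; the variable and function names are made up for illustration. Both flags behave the same way in a condition, but only the Boolean one states its intent.

```python
# Flag stored as a number: is 1 a count, an index, or "true"?
is_end_as_int = 1

# Flag stored as a Boolean: the intent is explicit.
is_end = True

def describe(flag):
    # Any truthy value is treated as "finished".
    return "finished" if flag else "still running"

print(describe(is_end_as_int))  # finished
print(describe(is_end))         # finished
print(type(is_end_as_int).__name__, type(is_end).__name__)  # int bool
```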

Syntax
Readability is also influenced by syntax, that is, by how the elements of a language look. One example is the form of the language’s special words (while, class, for, …).

Compare the readability of the for loop in Python and in C:

Python:
for x in range(0, 3):
    print("hello world")
C:
for (int i = 0; i < 3; i++) {
    printf("hello world");
}

The Python version is more compact and descriptive, and this is one of the reasons why the language is often used by academics.

This example also shows a difference in how compound statements are formed: C (and its descendants) uses pairs of curly braces, while Python uses indentation. Which of the two is more readable? Opinions differ.

There are other languages that use keyword pairs to explicitly indicate the beginning and the end of a compound statement. For example, something like:


for x from 0 to 3:
    print("hello world");
end

This exact syntax does not belong to a specific language as far as I know, but similar keyword-delimited blocks appear in languages such as Ada, Lua and Ruby.

The structure is much closer to that of written language and, for this reason, more readable. In general, though, such readability places a greater burden on the compiler or interpreter. The further we move away from machine language, the harder it is to translate the code well.

Another important issue is how expressive a simple statement is.
For example, the name of grep in UNIX systems is obscure to anyone new to such systems. Only the most experienced users know that in the editor ed, the command /regular_expression/ searches for a line that matches a regular expression; they also know that preceding a command with g makes it a global command, applied to every matching line; and they know that putting a p after a command prints the result.
So g/regular_expression/p, that is grep, prints all the lines in a file that match the given regular expression.
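
The behavior hidden behind that name can be sketched in a few lines of Python with the standard re module. The grep function below is a simplified stand-in written for this article, not the real UNIX tool.

```python
import re

def grep(pattern, lines):
    """Return the lines that contain a match for pattern, like ed's g/re/p."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]

text = ["alpha beta", "gamma", "beta delta"]
for line in grep(r"bet.", text):
    print(line)
# alpha beta
# beta delta
```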

Another syntax problem is the possibility of using the special words of a language as variable names; where this is allowed, readability suffers.
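
Python is a handy place to see both sides of this. In the sketch below, can_assign_to is a hypothetical helper written for this example: it checks whether a name can be used as a variable by compiling a one-line assignment. Reserved keywords such as while are rejected, but built-in names such as list are not reserved and can be reused as variable names, which can confuse readers.

```python
import keyword

def can_assign_to(name):
    """Return True if the statement `name = 1` compiles as Python source."""
    try:
        compile(f"{name} = 1", "<example>", "exec")
        return True
    except SyntaxError:
        return False

print(keyword.iskeyword("while"))  # True
print(can_assign_to("while"))      # False
print(keyword.iskeyword("list"))   # False
print(can_assign_to("list"))       # True
```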

As you will have gathered from the title, this is only the first part of what I would like to tell you. In the next article I will talk about writability, reliability and cost.
If you are interested, read on …


ngwes

I’m a Computer Science engineer, in love with programming improvement books. In my free time, I’m a swimmer and a volleyball fan