On comprehensiveness of technical manuals

Dmitriy Kim
Aug 27, 2021

Like, what if I had a manual that includes a description of some primitive data types, say, integers, some of which are signed and some unsigned. They are named Sint32, Sint64, Uint32 and Uint64. Ok, I can say that, although there are a lot of assumptions, it’s informative. There’s an implicit assumption, based on the type names, that these are 32-bit and 64-bit types, signed and unsigned respectively. Is that enough information for me to correctly pass parameters to a function defined this way? Or is something missing? And if something is missing, where would I go to find that information?

Ok, let’s say I can deduce the format well enough based on those type names. There is still one nuance, though: how exactly are signed and unsigned variables laid out? The sign can be stored in one bit, or it can be stored in a whole byte; it depends.

Like, let’s say I program in a totally different language with different data types that are organized and structured differently. So I would need to make my types match down to the bytes and bits. In this context, I would need to know exactly how the signed integers are structurally different from unsigned integers. In other words, how is the sign stored: does it occupy an extra byte, or is it the high bit of the high byte of the variable?

By the way, in the latter case, it implies that signed types have a smaller maximum value than unsigned types of the same length. This reminds me that it would be cool to include the available range of values in the manual. Although I can make a lot of implicit assumptions, like, probably the high bit stores the sign and 32-bit unsigned integer variables can store numbers from 0 to 2 to the power of 32, minus 1, it’s better to be certain. After all, there are some exotic languages out there and various exotic methods to store and organize data.
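To make the sign question concrete, here is a rough Python sketch. It assumes (and this is only my assumption, the manual doesn’t say so) that the types are little-endian and that signed values use two’s complement. The helper names are made up for illustration.

```python
import struct

def encode_sint32(value):
    """Pack a value as 4 bytes, little-endian, two's complement (assumed layout)."""
    return struct.pack("<i", value)

def encode_uint32(value):
    """Pack a value as 4 bytes, little-endian, unsigned (assumed layout)."""
    return struct.pack("<I", value)

def sign_bit(raw):
    """Return the high bit of the high byte (the last byte in little-endian order)."""
    return raw[-1] >> 7

minus_one = encode_sint32(-1)          # all bits set under two's complement
plus_one = encode_sint32(1)
max_unsigned = encode_uint32(2**32 - 1)  # the very same bytes as minus_one

print(minus_one.hex(), sign_bit(minus_one))
print(plus_one.hex(), sign_bit(plus_one))
```

Note that the signed -1 and the unsigned maximum come out as identical bytes. That is exactly why the manual has to say how the bytes are to be interpreted: the bytes alone don’t tell you.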

Where would I get this information? I would probably take a look at the type definition itself. After all, let’s assume that most of the stuff is written in C, and C has fixed-width types such as int32_t and uint64_t (from <stdint.h>) whose length is unambiguous. In other words, there must be some definition of this type that can be traced back to some fundamental C type or whatever.

There are more serious problems in the manual as well. Say, there’s a type called a simple descriptor. I don’t know wtf a descriptor is. There are two aspects that have to be clarified. First of all, from the technical point of view, what is the structure (length, etc.) of this type? Like, a descriptor can be one byte long or it can be ten bytes long, or a hundred, it depends. In this case, there is nothing I know about this type from a technical point of view. So the first thing I need to clarify is how the descriptor type is structured: what’s its length, for example.

Where do I get this information? Simply, it’s defined somewhere in code, and there’s somebody who knows where and how it’s defined, so I ask where I need to dig, then I dig and unearth the truth.

The second problem with the descriptor type is related to conventions. If it’s a descriptor it would be useful to know what exactly it describes, and, more generally, where it’s used. For example, there is a type called HWND in the Windows API (and so in MFC) and it’s used to define variables containing window handles. Under the hood it’s just an opaque pointer-sized value, I mean, the API could just as well have used a plain integer or pointer type for a window handle. The special type HWND is defined to add more clarity. In other words, if I see a variable defined as HWND I know that it’s more likely than not a variable containing a window handle. Which makes my code more understandable.

Similarly, if I see some type called descriptor, I assume there are some conventions related to it. That this descriptor type is made to describe some specific variables, descriptors, related to some specific functionality. So I need to find out what exactly this type is for, what those descriptors describe, and what the conventions related to this type are. For example, maybe it’s only intended to store process handles or file handles, and defining variables of this type for other purposes would cause a lot of confusion among programmers who’ll have to deal with this code later.

Deeper in the polygons…

More complex types can be constructed using primitive types described above

The complex types include

  • Structures (struct)
  • variant types (union)
  • synonyms (typedef)
  • named constants (const)
  • fixed size array (array)
  • sequences of variable size (sequence)

First, there’s almost no valuable information; additionally, it’s utterly confusing. First of all, why is the constant declaration even on this list? Second, type descriptions contradict type declarations. Case in point: variant types (unions). What on earth could that be?

From what I know about programming (or what I’ve just read), union types are normally a sort of structure whose members share the same storage, so a variable holds one of several alternative types at a time. Ok, maybe the type name isn’t so confusing, after all.

But considering the multiple variations of type implementations in different languages, it would be cool to learn some specifics about this specific implementation.

For example, technical details: how is the data in the union stored, are variables aligned or not? Speaking of this particular type and the fact that it is implemented differently in different languages, I would like to know what exactly it means here. For example, how is it different from the structure (struct)? And, yes, whether the variables are aligned or not. Because, well, once again, I can write in a totally different language, and when I call a function requiring a variable of this type as a parameter, I’m compelled to manually compose a data structure to pass to this function. I would need to meticulously craft it, and I would be wondering: if there are some types of different lengths in a structure or union, are they packed tightly, or does each position of the structure have a fixed length? If they are tightly packed, how do I know where one variable in the structure ends and another begins?
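Here is a small sketch of the struct-versus-union difference, using Python’s ctypes, which follows the platform’s C layout rules. The two types and their field names are hypothetical, and whether the manual’s struct and union behave like their C namesakes is exactly the thing that needs confirming.

```python
import ctypes

class PairStruct(ctypes.Structure):
    # Hypothetical struct: both fields exist at the same time, one after another.
    _fields_ = [("small", ctypes.c_uint8), ("big", ctypes.c_uint32)]

class PairUnion(ctypes.Union):
    # Hypothetical union: both fields overlap and share the same storage.
    _fields_ = [("small", ctypes.c_uint8), ("big", ctypes.c_uint32)]

# Where each field starts, and how much memory each type takes overall.
print("struct:", PairStruct.small.offset, PairStruct.big.offset, ctypes.sizeof(PairStruct))
print("union: ", PairUnion.small.offset, PairUnion.big.offset, ctypes.sizeof(PairUnion))
```

On typical platforms the struct is bigger than the sum of its fields, because the compiler leaves a gap so that the 32-bit field starts on a 4-byte boundary. In the union both fields start at offset 0. A reader composing these bytes by hand needs both facts spelled out.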

Once again, we have the following complex types

  • Structures (struct)
  • variant types (union)
  • synonyms (typedef)
  • named constants (const)
  • fixed size array (array)
  • sequences of variable size (sequence)

By the way, another question: can complex types include variables of complex types (can I nest structures, for example, which would be cool), or can they only include variables of primitive types? It should be clarified.

Another question, how come that the constant declaration is among types — constant is not a type — or if it’s there for a reason, it requires some soul searching and investigation, etc. Anyway, what would I do, practically, in this case? Most likely the constant is here by mistake, it’s 99% probability. I just need to check that there’s, indeed, no such type defined as const (it would be insane if it were) and then just exclude it from this part of the manual.

Ok, speaking of composite types and trying to navigate this sea of confusion and trying to find some useful landmarks and guidelines.

Let’s fire out all the questions that come up:

  • How are structures different from unions?
  • How are sequences different from arrays?
  • Are sequences like linked lists or what?

Ok, I read “fixed-size arrays,” which means that the size of the array is defined in advance and cannot be changed later. Like, array(100) of integer or something like that.

By the way, speaking of which, I assume, based on my experience, that arrays can only contain values of one type: it can be an array of 32-bit signed integers or an array of 64-bit unsigned integers or whatever. I presume that’s most likely also the case here (it’s sort of a convention about arrays). But can I be a hundred percent sure?

Another question is whether this fixed-size condition is totally unbreakable. In other words, can I really, really not change the array size after I’ve defined it? Or can I? For example, in VBA you can declare an array and then resize it dynamically with ReDim (as long as it was declared without fixed bounds). VBA is a freak example, but nevertheless.

Speaking of “sequences of variable size,” I have a sort of a hunch that they are in fact dynamic arrays. Although I cannot be sure, and I cannot guess, rely on my intuition and so on, can I? I need facts and reassurances.
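To show why the distinction matters to someone building the bytes by hand, here is a sketch of one common convention: a fixed-size array is just its elements, while a variable-size sequence carries an element count in front. This layout is purely my assumption for illustration; the manual may define sequences in a completely different way (a linked list, for instance), and the function names are made up.

```python
import struct

def encode_array(values, size):
    """Fixed-size array of UInt32: the size belongs to the type, so no count is stored."""
    if len(values) != size:
        raise ValueError(f"expected exactly {size} elements, got {len(values)}")
    return struct.pack(f"<{size}I", *values)

def encode_sequence(values):
    """Variable-size sequence of UInt32: a 4-byte count, then the elements (assumed layout)."""
    return struct.pack("<I", len(values)) + struct.pack(f"<{len(values)}I", *values)

def decode_sequence(raw):
    """Read the count first, then that many UInt32 elements."""
    (count,) = struct.unpack_from("<I", raw, 0)
    return list(struct.unpack_from(f"<{count}I", raw, 4))

as_array = encode_array([1, 2, 3], 3)
as_sequence = encode_sequence([1, 2, 3])
print(len(as_array), len(as_sequence), decode_sequence(as_sequence))
```

Under this assumed layout the same three values take 12 bytes as an array and 16 bytes as a sequence. Guess wrong and every field after it is read from the wrong place.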

So to sum it up, what I think the manual needs here, first of all, is the descriptions of each type. Normally, for example, when I read about some type in a language new for me, say, array, what do I want to know, first of all?

First I want to know how I declare it, the syntax in other words. It can be Dim A(100) in VB or A: array[1..100] of Integer in Pascal; in any case, I need to know the specific syntax used in a specific language. Because syntaxes differ.

Similarly, I want to know how I describe operations syntactically: should I write A(1)=x to assign some value to the first element of the array? Or do I need to write it in a different way?

Speaking of which, it would be cool to have a working example of code containing all the aforementioned nuances. Where to find such code? Two options: either find it among the sources, where these types are likely used in some way, or write it yourself, which can be simpler, apart from the fact that I would also need to test it somewhere somehow.

But, nevertheless, let it be an example, say,

UInt32 A(100) // declaration of the array with size 100 containing the elements of primitive type UInt32

A(1) = 50 // assigning value 50 to the first element of the array

By the way, what happens if I try to assign a value bigger than the primitive type used in the array can hold? What if I just assign a variable of a different type? What will happen then? What if I get outside the array boundaries and assign a value to element 101, for instance? Will the system crash or is there some protection? What are the initial values in a newly created array? Can I expect something uniform like zeros, or is there some garbage, or, worst of all, are there zeros most of the time, which lulls my vigilance until, once in a while, there’s garbage (well, because zeros are not guaranteed, it’s just a coincidence)?
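Just to show that these questions have concrete answers that differ from system to system, here is how one particular implementation, Python’s ctypes arrays, behaves. This is not how the manual’s arrays behave (that is exactly what I don’t know); it’s a sketch of the kind of facts a notes section should state.

```python
import ctypes

UInt32Array100 = ctypes.c_uint32 * 100   # stand-in for "UInt32 A(100)"
a = UInt32Array100()                      # ctypes fills a new array with zeros

a[0] = 50                                 # ordinary assignment
a[1] = 2**32 + 7                          # too big for 32 bits: only the low 32 bits are kept

try:
    a[100] = 1                            # one element past the end
    out_of_range = None
except IndexError as exc:
    out_of_range = exc                    # ctypes refuses instead of corrupting memory

try:
    a[2] = "fifty"                        # a value of the wrong type
    wrong_type = None
except TypeError as exc:
    wrong_type = exc

print(a[0], a[1], a[99], type(out_of_range).__name__, type(wrong_type).__name__)
```

So here the answers are: zeros at creation, silent truncation on overflow, an error on a bad index, an error on a wrong type. A plain C array would answer at least two of those differently, which is the whole point.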

Here we come to the realization that we need a sort of description/notes section where we will describe the type’s behavior in unusual situations, as well as what you must not do under any circumstances. Where to get this information? I presume it will require a bit of interrogation of the people who came up with this type. Or rather the people who develop tests, they should know. If the testing process is well-organized, there would be specific tests that answer each of the questions above.

Ok, to sum up the previous deliberations: for complex types we need type descriptions, syntax (including all variations), examples, and notes explaining the borderline cases, limitations, and what a normal, sane person must never do with this type.

By the way, before we proceed, let’s get back for a moment to our primitive types, say SInt32. If I understand/remember it correctly it’s something from MFC and I already encountered cases where people are confused about what particularly this type means. No wonder.

Once again, if we don’t tell it specifically, all we can rely on are our assumptions. I can assume that S means signed, Int means integer, and 32 means 32 bit. So I deduce that it’s the signed integer of 32-bit length where the highest bit signifies the sign — in other words, whether it’s a positive or a negative number. But it can be anything else, just as well. What if the highest BYTE signifies the sign, not the highest bit? Or maybe it’s 32 BYTE type? Shit happens. So the point is, basically, I need to explicitly tell all that.

For example, the SInt32 type is the signed integer type with the length 32 bit, in which the highest bit signifies the sign. Something like that.

Ok, keep thinking about our composite types. Synonyms and unions came from C. Sequences — it can be anything basically. One of the definitions boils down to something like linked lists.

Once again, previously I asked myself a question — how structures are different from unions? It’s a valid question and many readers would probably be also thinking along those lines. Which makes me think that it would be a good thing to add in the type description (we have type descriptions at this point) that, so and so, the key things making unions different from structures are: <things>

Ok, wrapping things up

Speaking of primitive type descriptions.

1. I would add an explicit description such as SInt32 is a signed type with a length of 32 bits where the highest bit signifies the sign. Otherwise, the reader would have to guess it based on the type name.

Where do I get this information? There must be a declaration of this type based on more standard C types in the source code, and I can find it. Or I can find and ask a person who introduced this type. Or, in this particular case, I can look it up in MFC technical guide.

2. I would add the range of values a variable of the given type can hold, say, SInt8 can contain values from -128 to 127 (assuming two’s complement).
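The ranges themselves are easy to compute once the encoding is known. A minimal sketch, assuming two’s complement for signed types (an assumption on my part; other encodings give slightly different bounds, for example sign-and-magnitude stops at -127 for 8 bits). The function name is made up.

```python
def int_range(bits, signed):
    """Return (lowest, highest) for an integer of the given width, assuming two's complement."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1

# Type names as used in the discussion above; the widths are read off the names.
for name, bits, signed in [("SInt8", 8, True), ("SInt32", 32, True), ("UInt32", 32, False)]:
    low, high = int_range(bits, signed)
    print(f"{name}: {low} .. {high}")
```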

3. Maybe a brief note, what happens if I try to assign a value too big for this type:

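For example, if you try to assign a value bigger than 127 to an SInt8 variable, say, 10000, a plain C assignment will typically just chop off the high bits and silently store a different number. If the value is instead written through a pointer or a buffer of the wrong size, it can partly erase the content of variables next to it, and, generally, it can lead to a memory violation and a system crash (if, say, it’s a part of the kernel).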

How do I know this? If I know that this type is based on the standard C type and this is what happens when you try to assign an incorrect value in a C program, I can safely assume that it will happen with this custom type as well. Although depending on the system architecture, there may be some special protection against, say, memory violations, and this is something I need to find out. It’s likely in the system specifications or system core designers are likely to know about it.
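What happens to an oversized value really does depend on who does the storing. A sketch with two real Python facilities that give two different answers for an 8-bit signed slot: ctypes does no overflow checking and keeps the low bits, while the struct module refuses outright. Which of these (if either) matches the system in the manual is something only its authors can say.

```python
import ctypes
import struct

# ctypes: no overflow checking, the value is cut down to its low 8 bits.
truncated = ctypes.c_int8(10000).value    # 10000 is 0x2710, the low byte is 0x10
wrapped = ctypes.c_int8(200).value        # 200 does not fit either, it wraps to a negative number

# struct: the same request is rejected with an error.
try:
    struct.pack("<b", 10000)
    rejected = False
except struct.error:
    rejected = True

print(truncated, wrapped, rejected)
```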

4. The descriptor type needs a more detailed explanation

  • First, its technical characteristics — length, the range of accepted values, (the same as in 2)
  • Second, I need to know why a special type called descriptor was introduced, and respectively the context of its use. It was definitely defined for a reason, and there must be special cases when it’s supposed to be used.

For example, the descriptor type is used to store file handles.

This information can be procured from system developers who came up with this type and use it. For example, I may ask in what specific cases it’s used now, and what the users of the interface, for example, are supposed to use it for.

Composite types

5. First of all, what is the constant declaration doing among the type descriptions? Most likely it’s here by mistake. I would check if that’s the case and remove the const from the list.

6. Each type needs at least a brief description, explaining what essentially this type is. Otherwise, for example, I would be forever wondering what’s the difference between struct and union.

Where do I get this information? Either the developers who came up with the original composite type or if this type is identical to some more well-known conventional type — from the respective documentation.

7. Each composite type needs a description of its syntax: how a variable of this type is declared and used.

For example, an array can be declared in the following ways.

SInt32 A(10);

SInt32 A(1,10);

To assign a variable to an element of the array, you must use the following syntax

A(1) = 10;

8. I would add a working example, in which the described type is used.

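For example (in the same made-up syntax as above):

SInt32 log(UInt32 s) {
    SInt32 counter(10); // array of 10 counters, one per slot

    counter(s)++; // count one more event in slot s

    return counter(s); // return the updated count for slot s
}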

9. I would add a notes section describing various nuances related to dealing with this type. For example, the following questions can be clarified there.

  • Can I only use primitive types in a struct, or can I use complex types as well? For example, can I build nested structs?
  • Should all the elements of an array be of one primitive type or can it contain elements of different types?
  • Can an array with the initially defined size be expanded later on?
  • What happens if I try to address an element with an index beyond the array boundaries? For example, I try to address element 1000 in an array with a size of 10.
  • What happens if I assign to the array elements values of the wrong type? For example, I try to assign a 64-bit variable to an array consisting of 16-bit elements.

10. Depending on the context of the document, some technical details of the composite types may need deeper descriptions.

For example, the variables of the struct type are stored sequentially, each one aligned at a 64-bit boundary.

In other words, I make it clear that each position in the struct occupies a fixed-size 64-bit slot, no matter what type of variable it’s used for.

This information can be useful for those who need to construct the respective data structures manually: they’ll know exactly how they should be organized.
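Here is what constructing such a structure manually might look like, as a sketch. It assumes a hypothetical struct with a UInt8 field followed by a UInt32 field, little-endian, and compares a tightly packed layout with the one-slot-per-field layout described above. The format strings use Python’s struct module, where "x" means a padding byte.

```python
import struct

# Hypothetical struct: a UInt8 field followed by a UInt32 field.
TIGHT = "<BI"          # tightly packed: 1 + 4 bytes, no gaps
SLOTTED = "<B7xI4x"    # every field padded out to its own 64-bit (8-byte) slot

tight = struct.pack(TIGHT, 0x11, 0x22334455)
slotted = struct.pack(SLOTTED, 0x11, 0x22334455)

print(len(tight), tight.hex())
print(len(slotted), slotted.hex())
```

The same two values take 5 bytes in one layout and 16 in the other, and the second field starts at byte 1 or at byte 8. Without a sentence in the manual saying which layout is used, the caller is left guessing.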