Exploring the Ambiguous Nature of the sizeof Operator
In a past article I wrote about C declarations and how to apply the precedence rules to interpret them. For example, in a C declaration, brackets have a higher precedence than an asterisk. So the following declaration may be mistakenly interpreted as a pointer to an array:
But if you try to use it as such, C will complain, because it is actually an array of pointers.
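To make the two readings concrete, here is a minimal sketch. The names `ptrs` and `arr_ptr` and the length 4 are hypothetical, chosen only for illustration, and the size checks are just one way to see which reading the compiler applies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical declarations, not taken from the original article. */
int *ptrs[4];        /* brackets bind first: an array of 4 pointers to int */
int (*arr_ptr)[4];   /* parentheses override: a pointer to an array of 4 int */

/* Size of the array of pointers: four pointer-sized elements. */
size_t size_of_ptrs(void) { return sizeof ptrs; }

/* Size of what arr_ptr points to: a whole array of 4 int. */
size_t size_of_pointee(void) { return sizeof *arr_ptr; }

/* Aborts if either declaration does not behave as described. */
void check_declarations(void)
{
    assert(size_of_ptrs() == 4 * sizeof(int *));
    assert(size_of_pointee() == 4 * sizeof(int));
}
```

Calling `check_declarations()` from a test program should pass silently if the precedence rule works as described.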
This example of ambiguity in the C syntax is just one of many. Let’s take the sizeof operator as another example.
p = n * sizeof * q;
Does that statement have one multiplication or two?
You are correct, there is only one multiplication. Why? Because had the answer been two, it would have failed as an example of ambiguous C syntax and I wouldn’t have used it in this article. But the other, more important reason, is that sizeof is a unary operator and it takes as its operand the thing to its right. And that thing is the expression *q, in which the asterisk dereferences a pointer named q rather than multiplying anything.
The statement becomes clearer when it is written with parentheses to remove ambiguity.
p = n * sizeof(*q);
If q points to an integer, I could get the same result by writing the following:
p = n * sizeof(int);
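As a rough check that the three spellings above agree, here is a small sketch. The function names are hypothetical, and `q` is assumed to be a pointer to int as in the text. Because sizeof does not evaluate an ordinary operand like `*q`, passing a null pointer is safe here.

```c
#include <assert.h>
#include <stddef.h>

/* One multiplication: the second asterisk dereferences q. */
size_t bytes_no_parens(size_t n, const int *q)
{
    return n * sizeof * q;
}

/* Same statement with the operand parenthesized. */
size_t bytes_with_parens(size_t n, const int *q)
{
    return n * sizeof(*q);
}

/* Same result spelled with the type name, which requires parentheses. */
size_t bytes_with_type(size_t n)
{
    return n * sizeof(int);
}

/* Aborts if the three spellings ever disagree for the given n. */
void check_spellings_agree(size_t n)
{
    assert(bytes_no_parens(n, NULL) == bytes_with_parens(n, NULL));
    assert(bytes_with_parens(n, NULL) == bytes_with_type(n));
}
```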
But what if I removed the parentheses in this case? They’re optional, right?
p = n * sizeof int;
No bueno. This will cause an error.
When sizeof’s operand is a type, it must be enclosed in parentheses. But when it is an expression, such as a variable, parentheses be damned!
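The rule can be sketched side by side. The variable `value` and the function names are hypothetical. The rejected form is left in a comment so the block still compiles.

```c
#include <assert.h>
#include <stddef.h>

/* Operand is an expression (a variable), so parentheses are optional. */
size_t size_from_expression(void)
{
    int value = 0;   /* hypothetical variable, never read at run time */
    return sizeof value;
}

/* Operand is a type name, so parentheses are mandatory. */
size_t size_from_type(void)
{
    /* return sizeof int;   <- rejected by the compiler */
    return sizeof(int);
}

/* Aborts if the two forms report different sizes for int. */
void check_forms_agree(void)
{
    assert(size_from_expression() == size_from_type());
}
```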
So now we have both ambiguity AND inconsistency!
Why is sizeof inconsistent with its requirement for parentheses?
As best I could find out, the C standard requires parentheses around a type name so the compiler can tell where the type ends when more variables and/or operators follow it.
Wait… so the parentheses are required to remove ambiguity??
If that’s the case, what about this statement?
p = sizeof (int) * n;
I’m not so sure that isn’t ambiguous. Does this dereference n and cast the value to an int, which is then used as the operand for the sizeof operator? Or is int used as the operand for sizeof, which is then multiplied by the value of n?
The answer is the second option: sizeof takes (int) as its operand, and the result is multiplied by n. Extra parentheses would make that clearer, but they aren’t required by the C compiler.
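One way to convince yourself is to make n a plain integer, which could not be dereferenced at all, and watch the result scale with it. This is only a sketch, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Written as in the article: parsed as (sizeof (int)) * n. */
size_t parsed_by_compiler(size_t n)
{
    return sizeof (int) * n;
}

/* The same computation with the grouping spelled out. */
size_t explicit_grouping(size_t n)
{
    return (sizeof(int)) * n;
}

/* Aborts if the two spellings differ or the result ignores n. */
void check_grouping(size_t n)
{
    assert(parsed_by_compiler(n) == explicit_grouping(n));
    assert(parsed_by_compiler(n) == n * sizeof(int));
}
```

If the cast reading were the one the compiler chose, the first function would not compile with an integer n, since an integer cannot be dereferenced.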
I guess the lesson is to always use parentheses with the sizeof operator. Or really, I suppose the larger lesson is to use them in any case where ambiguity exists. It’s one less attack vector for bugs that, if you’re lucky, will cause a compile error, and if you’re unlucky, will compile without error.