C interview questions. Bit-fields
Bit fields provide convenient access to individual bits of data. They let you create objects whose size is not a multiple of a byte.
A bit field cannot exist by itself; it can only be a member of a structure (or union). Bit fields are declared in the following form:
struct <name> {
<type> <name>: <size>;
...
<type> <name>: <size>;
}
The <type> of a field can be int (signed or unsigned), _Bool, or another implementation-defined type.
C99 standard (6.7.2.1):
4. A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation-defined type.
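The <name> is an arbitrary identifier, and <size> is a nonnegative integer constant expression that must not exceed the width of <type> in bits (a width of zero is allowed only for unnamed fields, which are covered below):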
C99 standard (6.7.2.1):
3. The expression that specifies the width of a bit-field shall be an integer constant expression with a nonnegative value that does not exceed the width of an object of the type that would be specified were the colon and expression omitted. If the value is zero, the declaration shall have no declarator.
struct point
{
uint32_t x:30; // Valid
uint32_t y:33; // Invalid: the width exceeds the 32 bits of uint32_t
};
The width does not have to be a literal; any integer constant expression works:
struct point
{
uint32_t x:5;
uint32_t y:1 + 1 + 1;
};
Packing of bit fields
The compiler tries to pack as many bit fields as possible into one storage unit of the size given by <type>. If the next bit field does not fit into the remaining space, an additional storage unit is allocated for it.
C99 standard (6.7.2.1):
10. An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.
Example 1:
struct byte {
uint8_t a: 5;
uint8_t b: 3;
};
Both bit fields will be packed into a single uint8_t variable, and thus the structure's size will be 1 byte.
Example 2:
struct byte {
uint8_t a: 5;
uint8_t b: 4;
};
These bit fields will be packed into two uint8_t variables, i.e. the size of the structure will be 2 bytes, because the b field does not fit in the space remaining after the a field.
Choosing the field <type>
Let’s look at a few more examples:
#include <stdio.h>
#include <stdint.h>
struct byte {
uint16_t a: 5;
uint16_t b: 3;
};
struct byte1 {
uint16_t a: 5;
uint16_t b: 4;
};
struct byte2 {
uint16_t a: 10;
uint16_t b: 3;
};
struct byte3 {
uint16_t a: 15;
uint16_t b: 4;
};
int main(void) {
printf("sizeof(struct byte): %zu\n", sizeof(struct byte));
printf("sizeof(struct byte1): %zu\n", sizeof(struct byte1));
printf("sizeof(struct byte2): %zu\n", sizeof(struct byte2));
printf("sizeof(struct byte3): %zu\n", sizeof(struct byte3));
}
sizeof(struct byte): 2
sizeof(struct byte1): 2
sizeof(struct byte2): 2
sizeof(struct byte3): 4
These examples clearly show the importance of choosing the right <type> for each bit field. For example, changing the type from uint16_t to uint8_t in the byte structure allows the compiler to pack the entire structure into 1 byte instead of 2:
#include <stdio.h>
#include <stdint.h>
struct byte {
uint8_t a: 5;
uint8_t b: 3;
};
int main(void) {
printf("sizeof(struct byte): %zu\n", sizeof(struct byte));
}
sizeof(struct byte): 1
However, with the byte1 structure this trick no longer works: even if you replace the type uint16_t with uint8_t, the compiler still cannot pack all the bit fields into a single uint8_t, because 5 bits + 4 bits = 9 bits.
#include <stdio.h>
#include <stdint.h>
struct byte1 {
uint8_t a: 5;
uint8_t b: 4;
};
int main(void) {
printf("sizeof(struct byte1): %zu\n", sizeof(struct byte1));
}
sizeof(struct byte1): 2
We also cannot reduce the size of the structure byte2, because the type of the bit field a cannot be changed from uint16_t to uint8_t: its width is larger than 8 bits. And changing the type of the bit field b will not help, because there is enough free space left after the bit field a to pack both fields into a single uint16_t.
#include <stdio.h>
#include <stdint.h>
struct byte2 {
uint16_t a: 10;
uint8_t b: 3;
};
int main(void) {
printf("sizeof(struct byte2): %zu\n", sizeof(struct byte2));
}
sizeof(struct byte2): 2
And finally, consider the last example with the structure byte3. Here, changing the type of the bit field b from uint16_t to uint8_t will also not change the size of the structure. Because of the width of the field a, the compiler cannot fit the entire structure into 16 bits, so the field a occupies 2 bytes as expected and the field b occupies 1 byte. The compiler then adds one byte of padding at the end of the structure.
#include <stdio.h>
#include <stdint.h>
struct byte3 {
uint16_t a: 15;
uint8_t b: 4;
};
int main(void) {
printf("sizeof(struct byte3): %zu\n", sizeof(struct byte3));
}
sizeof(struct byte3): 4
Unnamed zero-size bit-fields
Unnamed zero-size bit fields have a special function: they disable packing.
C99 standard (6.7.2.1):
11. A bit-field declaration with no declarator, but only a colon and a width, indicates an unnamed bit-field. As a special case, a bit-field structure member with a width of 0 indicates that no further bit-field is to be packed into the unit in which the previous bit-field, if any, was placed.
If you do not want the compiler to pack two adjacent bit fields together, put an unnamed zero-size bit field between them. This field forces the compiler to skip ahead to the next boundary of the specified <type>.
Consider the following structure, for example:
struct byte_1 {
uint8_t a: 5;
uint8_t b: 3;
};
struct byte_1 byte1 = {.a=0b11111, .b=0b111};
printf("sizeof(struct byte_1): %zu\n", sizeof(struct byte_1));
sizeof(struct byte_1): 1
As you can see, the compiler packed all the bit fields into 1 byte as expected. Now we will add unnamed zero-size bit fields of various types to this structure and see what will change.
Unnamed field uint8_t
Let’s add an unnamed zero-size field of type uint8_t:
struct byte_2 {
uint8_t a: 5;
uint8_t : 0; //8 bit aligned
uint8_t b: 3;
};
struct byte_2 byte2 = {.a=0b11111, .b=0b111};
printf("sizeof(struct byte_2): %zu\n", sizeof(struct byte_2));
sizeof(struct byte_2): 2
As you can see, the size of the structure is now 2 bytes: the unnamed field prevents the compiler from packing the bit fields into 1 byte, so the compiler was forced to allocate a separate uint8_t memory cell for the bit field b.
Unnamed field uint16_t
Let’s add an unnamed zero-size field of type uint16_t:
struct byte_3 {
uint8_t a: 5;
uint16_t : 0; //16 bit aligned
uint8_t b: 3;
};
struct byte_3 byte3 = {.a=0b11111, .b=0b111};
printf("sizeof(struct byte_3): %zu\n", sizeof(struct byte_3));
sizeof(struct byte_3): 3
The byte_3 structure contains an unnamed zero-length field of type uint16_t, which shifts the b field to the next 16-bit boundary, so the structure now occupies 3 bytes.
Unnamed field uint32_t
Let’s add an unnamed zero-size field of type uint32_t:
struct byte_4 {
uint8_t a: 5;
uint32_t : 0; //32 bit aligned
uint8_t b: 3;
};
struct byte_4 byte4 = {.a=0b11111, .b=0b111};
printf("sizeof(struct byte_4): %zu\n", sizeof(struct byte_4));
sizeof(struct byte_4): 5
The byte_4 structure contains an unnamed zero-length field of type uint32_t, which shifts the b field to the next 32-bit boundary, so the structure now occupies 5 bytes.
When the alignment of an anonymous field does not work
However, if the next field already starts on a boundary that is a multiple of the size of <type> in bits, the unnamed zero-length field adds no shift.
It makes little sense to consider anonymous uint8_t fields here, because every byte boundary already satisfies their alignment. Therefore, we will focus only on examples with anonymous fields of types uint16_t and uint32_t.
Take the byte_3 structure, but this time, between the bit field a and the unnamed bit field, add a few more fields of the size needed to align the unnamed bit field to the boundary of <type>.
struct byte_3 {
uint8_t a : 5;
uint8_t a1: 3;
uint8_t a2: 8;
uint16_t : 0;
uint8_t b : 3;
};
struct byte_3 byte3 = {0};
byte3.a = 0b11111;
byte3.a1 = 0b111;
byte3.a2 = 0b11111111;
byte3.b = 0b111;
printf("sizeof(struct byte_3): %zu\n", sizeof(struct byte_3));
The byte_3 structure has an unnamed zero-length field of type uint16_t, which should shift the b field to the next 16-bit boundary. However, because the b field is already aligned to a 16-bit boundary, the anonymous field does nothing and the size of the structure does not change.
sizeof(struct byte_3): 3
It will be the same with the unnamed field uint32_t:
struct byte_4 {
uint8_t a : 5;
uint8_t a1: 3;
uint8_t a2: 8;
uint8_t a3: 8;
uint8_t a4: 8;
uint32_t : 0;
uint8_t b : 3;
};
printf("sizeof(struct byte_4): %zu\n", sizeof(struct byte_4));
sizeof(struct byte_4): 5
An unnamed field of non-zero size
Only unnamed fields of zero size disable the packing of bit fields. For an unnamed field of non-zero size, the ordinary packing rules apply.
struct byte_1 {
uint8_t a: 5;
uint8_t b: 2;
};
struct byte_2 {
uint8_t a: 5;
uint8_t : 1;
uint8_t b: 2;
};
struct byte_3 {
uint8_t a: 5;
uint16_t : 1;
uint8_t b: 2;
};
struct byte_4 {
uint8_t a: 5;
uint32_t : 1;
uint8_t b: 2;
};
printf("sizeof(struct byte_1): %zu\n", sizeof(struct byte_1));
printf("sizeof(struct byte_2): %zu\n", sizeof(struct byte_2));
printf("sizeof(struct byte_3): %zu\n", sizeof(struct byte_3));
printf("sizeof(struct byte_4): %zu\n", sizeof(struct byte_4));
sizeof(struct byte_1): 1
sizeof(struct byte_2): 1
sizeof(struct byte_3): 1
sizeof(struct byte_4): 1
The structures byte_2, byte_3, and byte_4 contain anonymous fields with a length of 1 bit, which are packed into the same storage unit as the field a, just like the b field. This type of field can be used to reserve a certain number of bits. For example, hardware protocols often have such reserved bits.
Alignment of structures with bit fields
If a structure mixes regular fields with bit fields, the first bit field after a regular field is shifted to the boundary of its <type>.
Let's look at the following structure:
#include <stdint.h>
struct example {
uint8_t a;
uint32_t b: 32;
};
int main(void) {
struct example ex = {0};
ex.a = 0x55;
ex.b = 0xaaaaaaaa;
}
According to the rule described above, in the example structure the b field will be aligned to a 4-byte boundary.
However, if you specify fewer than 32 bits for the b field, the compiler may (or may not) allocate a type other than uint32_t for this field. For example, if the field b is declared as uint32_t b: 8, a field of type uint8_t will be allocated, and accordingly there will be no alignment:
#include <stdint.h>
struct example {
uint8_t a;
uint32_t b: 8;
};
int main(void) {
struct example ex = {0};
ex.a = 0x55;
ex.b = 0xaa;
}
The order of the individual bits in the bit fields
Look at an example:
struct byte {
uint8_t b0: 1;
uint8_t b1: 1;
uint8_t b2: 1;
uint8_t b3: 1;
uint8_t b4: 1;
uint8_t b5: 1;
uint8_t b6: 1;
uint8_t b7: 1;
};
The question is in what order the fields b0 through b7 will be located in the memory cell.
Like this:
b7 b6 b5 b4 b3 b2 b1 b0
or like this
b0 b1 b2 b3 b4 b5 b6 b7
And the problem is that the bit packing order is not defined (more precisely, it is implementation-defined).
Let’s take the following union and look at it on CPUs with different byte orders.
#include <stdio.h>
#include <stdint.h>
union b {
uint8_t raw;
struct {
uint8_t b0: 1;
uint8_t b1: 1;
uint8_t b2: 1;
uint8_t b3: 1;
uint8_t b4: 1;
uint8_t b5: 1;
uint8_t b6: 1;
uint8_t b7: 1;
};
};
union b byte = {.b0=1};
int main(void) {
printf("b1 = %#x\n", byte.raw);
}
For ARM (LE):
b1 = 0x1
For PowerPC(BE):
b1 = 0x1
This is a little unexpected, because the assembly generated for these two architectures places the bit differently. For ARM (LE):
For PowerPC(BE):
As you can see, the bit sequence is different for different architectures.
For ARM (LE), bit 0 of a byte is in the rightmost position, i.e. offset 0, and for PowerPC (BE), bit 0 is in the leftmost position, i.e. offset 7.
But at the same time, printing byte.raw produces the same output on both CPUs:
b1 = 0x1
Why?
Because here the compiler comes to our aid. Let’s add code that sets the bit b0 to one and look at the assembly again:
ARM (LE):
As you can see here, the compiler generates quite logical code and sets the 0th bit.
PowerPC(BE):
Here the compiler, knowing that the code is generated for the BE architecture, sets the 7th bit, because on this architecture that bit is treated as bit zero. As a result, thanks to the compiler, we do not notice any difference when working with bit fields on LE and BE.
However, there are two caveats. First, as mentioned above, the order in which bits are packed is not defined by the standard, so you cannot rely on it. Second, if you work with data received from a processor with a different byte order, or with a hardware bit protocol that you communicate with over the network, the compiler will not help you, and you need to account for the bit order yourself when accessing that protocol through bit fields.
Signed and unsigned bit fields
Bit fields can be either signed or unsigned. Here lies one nuance that I could not understand for a very long time: why do bit fields need a signed data type? The reason was that I perceived bit fields as just a group of bits, which is fundamentally wrong, because bit fields in the C language are a way of working with data types whose size is not a multiple of a byte.
That is, a declaration of the form:
struct half_byte {
uint8_t a:4;
};
It means not just a set of 4 logically connected bits; it means a data type 4 bits in size, and that data type can be either signed or unsigned.
C99 standard (6.7.2.1):
9. A bit-field is interpreted as a signed or unsigned integer type consisting of the specified number of bits.
That is, just as the range of values for the type uint8_t is [0, 255] and for the type int8_t is [-128, 127], the range for the bit field uint8_t a:4 is [0, 15] and for int8_t a:4 it is [-8, 7]. Let's show this by example:
#include <stdio.h>
#include <stdint.h>
struct half_byte_unsign {
uint8_t a:4;
};
struct half_byte_sign {
int8_t a:4;
};
struct half_byte_sign s_hb = {0};
struct half_byte_unsign u_hb = {0};
int main(void)
{
printf("i\ts_hb\tu_hb\n");
for (uint8_t i = 0; i < 20; ++i) {
s_hb.a = i;
u_hb.a = i;
printf("%d\t%d\t%d\n", i, s_hb.a, u_hb.a);
}
}
i s_hb u_hb
0 0 0
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
6 6 6
7 7 7
8 -8 8
9 -7 9
10 -6 10
11 -5 11
12 -4 12
13 -3 13
14 -2 14
15 -1 15
16 0 0
17 1 1
18 2 2
19 3 3