Method member pointers in C++

Wazzup, dudes?

I decided to write an article about pointers to member methods. Recently, while working on some compiler-oriented things, I ran into the need to understand how member method pointers work. They don’t behave like ordinary pointers: they can’t be cast to a void pointer, and they are often bigger than 8 bytes. I couldn’t find much information about this topic on the internet.

I also found some scary writeups that don’t explain what really happens and why, but just try to make the programmer follow the rules blindly.

Let’s check out what happens here and why.

Let’s look at some code.
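A minimal sketch of such a size check. The class A, its method foo and the two helper functions are placeholder names chosen for illustration, and the exact numbers depend on the compiler and target (16 bytes is what gcc and clang give on a 64-bit target).

```cpp
#include <cstddef>

struct A {
    void foo() {}
};

// Size in bytes of a pointer to a member method of A.
constexpr std::size_t member_pointer_size() {
    return sizeof(void (A::*)());
}

// Size in bytes of an ordinary function pointer, for comparison.
constexpr std::size_t function_pointer_size() {
    return sizeof(void (*)());
}
```

Printing member_pointer_size() with std::cout is enough to reproduce the check.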

Output: 16
The pointer is bigger than 8 bytes. This is not the case for every compiler: the Microsoft compiler, for example, shrinks member method pointers to 8 bytes in some cases. In recent versions of clang and gcc a member method pointer takes 16 bytes.

I don’t think compiler developers would replace an ordinary pointer with something else without a good reason. Let’s find out why they did it.

Let’s look at the following C++ code. It is a basic example of a call through a member method pointer.
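A minimal sketch of such a call. The names A, increment, Method and call are assumptions made for illustration, not necessarily the ones used when the IR was generated.

```cpp
struct A {
    int value = 0;
    void increment() { ++value; }
};

// Pointer to a method of A that takes no arguments and returns nothing.
using Method = void (A::*)();

// Calls whatever method the pointer refers to on the given object.
// The compiler cannot know here whether the method is virtual,
// so it has to generate code that handles both cases.
void call(A* object, Method method) {
    (object->*method)();
}
```

For example, call(&a, &A::increment) increments a.value by one.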

Compiling it with the following command:
clang++ code.cpp -c -emit-llvm -S -O3 -fno-discard-value-names
gives the following LLVM IR:

LLVM IR is the intermediate representation that the Clang compiler uses between C++ and machine code. It lets the compiler perform optimisations that do not depend on a specific CPU architecture, and it lets us see what is happening at a specific compilation stage. It is also more human-readable than assembly language.
More information about LLVM IR is available on Wikipedia, the official LLVM site and the Clang site.

What we have:
- On the first line we can see that a member method pointer is not an ordinary pointer but a structure { i64, i64 }. It has two i64 elements, so it can hold two ordinary pointers. This is why member method pointers can’t be cast to ordinary pointers: in the general case 16 bytes can’t be squeezed into 8 bytes without losing data.
- In the entry block, which starts on the 5th line, the this pointer is adjusted: the compiler adds the value of the second element of the structure to this, and later, in the memptr.end block, passes the result to the method call.
- On the 14th line, in the entry block, something strange happens to the first element of the structure. The compiler evaluates an expression similar to bool isvirtual = val & 1. It treats the member method pointer as virtual if the number is odd and as non-virtual otherwise.
- If the member method pointer points to a non-virtual method, the first element is treated as a pointer to the function, which is called later. This happens in the memptr.nonvirtual block.
- If the member method pointer points to a virtual method, things are a bit more complicated. The compiler first subtracts 1 from the first element of the structure and treats the result as an offset into the virtual table, whose address is read from the object that this points to. This happens in the memptr.virtual block.

At this point we understand that a member method pointer holds the following data:
- Whether the method is virtual or not
- A pointer to the method (if not virtual)
- The vtable offset (if virtual)
- The this adjustment

How method calls work in C++.
A class method has an invisible first parameter, the this pointer, which the compiler passes when the call happens. The other arguments follow it in their original order.
If we wrote this code in C++, it would look something like this:
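A minimal sketch of that logic, assuming the two-word layout described above (Itanium C++ ABI on x86-64, as used by gcc and clang on Linux). The names Animal, Bird, RawFunction and manual_call are made up for illustration. The code relies on implementation details and on undefined behaviour, so it is for understanding only.

```cpp
#include <cstddef>
#include <cstring>

struct Animal {
    virtual ~Animal() = default;
    virtual int legs() { return 4; }  // reached through the vtable
    int eyes() { return 2; }          // reached directly
};

struct Bird : Animal {
    int legs() override { return 2; }
};

using Method = int (Animal::*)();
// The same method seen as a plain function with an explicit this.
using RawFunction = int (*)(Animal* self);

int manual_call(Animal* object, Method method) {
    static_assert(sizeof(Method) == 2 * sizeof(std::ptrdiff_t),
                  "expects the two-word member pointer layout");
    std::ptrdiff_t words[2];
    std::memcpy(words, &method, sizeof words);

    // 1. Adjust this by the second element.
    char* self = reinterpret_cast<char*>(object) + words[1];

    RawFunction function;
    if (words[0] & 1) {
        // 2a. Virtual: the first element minus 1 is a byte offset
        //     into the vtable, whose address is stored at the start
        //     of the adjusted object.
        char* vtable;
        std::memcpy(&vtable, self, sizeof vtable);
        std::memcpy(&function, vtable + (words[0] - 1), sizeof function);
    } else {
        // 2b. Non-virtual: the first element is the function address.
        function = reinterpret_cast<RawFunction>(words[0]);
    }

    // 3. Pass this explicitly as the first argument.
    return function(reinterpret_cast<Animal*>(self));
}
```

With these definitions, manual_call(&bird, &Animal::legs) goes through the virtual branch and reaches Bird::legs, while manual_call(&bird, &Animal::eyes) takes the non-virtual branch.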

To understand the adjust value, let’s look at this example:
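A sketch along those lines, again assuming the two-word layout used by gcc and clang on x86-64. The classes A, B and C and all helper names are placeholders. The helpers read the second word of the pointer so that the adjust value can be checked without reading the IR.

```cpp
#include <cstddef>
#include <cstring>

struct A { int a = 1; };
struct B {
    int b = 2;
    int get_b() const { return b; }
};
struct C : A, B {};  // the B part sits after the A part inside C

// Reads the second word of a member method pointer (the this adjustment).
template <typename MethodPtr>
std::ptrdiff_t this_adjustment(MethodPtr method) {
    static_assert(sizeof(MethodPtr) == 2 * sizeof(std::ptrdiff_t),
                  "expects the two-word member pointer layout");
    std::ptrdiff_t words[2];
    std::memcpy(words, &method, sizeof words);
    return words[1];
}

// Real offset of the B part inside a C object, measured at run time.
std::ptrdiff_t offset_of_b_in_c() {
    C c;
    return reinterpret_cast<char*>(static_cast<B*>(&c)) -
           reinterpret_cast<char*>(&c);
}

std::ptrdiff_t adjustment_in_b() {
    int (B::*method)() const = &B::get_b;
    return this_adjustment(method);
}

std::ptrdiff_t adjustment_in_c() {
    // Same function, but now seen as a method of C.
    int (C::*method)() const = &B::get_b;
    return this_adjustment(method);
}
```

Under these assumptions adjustment_in_b() is zero and adjustment_in_c() equals offset_of_b_in_c().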

Compile it with the same command:
clang++ code.cpp -c -emit-llvm -S -O3 -fno-discard-value-names
Output:

As we can see, the member method pointers point to the same function, but the adjust value is different because class B is located inside class C. The C++ compiler needs to know the offset of the base class inside the derived object to pass the correct this to the class method.

What is bad about this implementation:
- The pointer is quite big in gcc and clang, even when no adjustment is needed
- The compiler checks whether the method is virtual on every call, even if we know that it is not

What we can do:
- Use a static method that accepts an instance of the class
- Forget that member method pointers exist and solve the problem some other way
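The first option can be sketched like this. Button, on_click, Handler and fire are hypothetical names. A static method has no hidden this, so a pointer to it is an ordinary function pointer with no virtual check and no adjustment.

```cpp
struct Button {
    int clicks = 0;

    // A static method receives the instance as an explicit argument.
    static void on_click(Button* self) { ++self->clicks; }
};

// An ordinary function pointer, not a member method pointer.
using Handler = void (*)(Button*);

void fire(Handler handler, Button* target) {
    handler(target);
}
```

For example, fire(&Button::on_click, &button) increments button.clicks by one.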

Other:
- There is advice on the internet to use std::bind, std::function and similar library functions. I checked them and didn’t find any optimisations for member method pointers.
- I also can’t make full use of Microsoft compilers, so I haven’t said much about them. But I had a quick look with online compilers and saw that MSVC can analyze the class structure and drop the adjust value field when it isn’t needed.
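For reference, a small usage sketch of the library wrappers mentioned in the first point. Doubler and the three helper functions are illustrative names. Each wrapper is built from the same member method pointer and makes the call more convenient, but does not change what the pointer contains.

```cpp
#include <functional>

struct Doubler {
    int calls = 0;
    int twice(int x) { ++calls; return 2 * x; }
};

// std::function built from a member method pointer;
// the object is passed as the first argument.
int call_via_function(Doubler& doubler, int x) {
    std::function<int(Doubler&, int)> callable = &Doubler::twice;
    return callable(doubler, x);
}

// std::mem_fn wraps the pointer in a callable object.
int call_via_mem_fn(Doubler& doubler, int x) {
    auto callable = std::mem_fn(&Doubler::twice);
    return callable(doubler, x);
}

// std::bind fixes the object and leaves the argument open.
int call_via_bind(Doubler& doubler, int x) {
    auto callable = std::bind(&Doubler::twice, &doubler, std::placeholders::_1);
    return callable(x);
}
```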

I also implemented a trick that removes the is-virtual check for clang and gcc.
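One way such a trick could look, as a sketch and not necessarily the exact code behind this article. It assumes the two-word layout of gcc and clang on x86-64 and a pointer that is known to refer to a non-virtual method. Widget, RawFunction and call_without_virtual_check are hypothetical names. The code relies on undefined behaviour.

```cpp
#include <cstdint>
#include <cstring>

struct Widget {
    int base = 10;
    int add(int x) { return base + x; }  // non-virtual
};

using Method = int (Widget::*)(int);
// The same method seen as a plain function with an explicit this.
using RawFunction = int (*)(Widget* self, int x);

int call_without_virtual_check(Widget* object, Method method, int arg) {
    static_assert(sizeof(Method) == 2 * sizeof(std::intptr_t),
                  "expects the two-word member pointer layout");
    std::intptr_t words[2];
    std::memcpy(words, &method, sizeof words);

    // Assumption: the caller guarantees the method is non-virtual,
    // so the first word is used as a function address with no check.
    auto function = reinterpret_cast<RawFunction>(words[0]);

    // The this adjustment from the second word still has to be applied.
    auto* self = reinterpret_cast<Widget*>(
        reinterpret_cast<char*>(object) + words[1]);
    return function(self, arg);
}
```

Passing a pointer to a virtual method here would jump to a bogus address, because the first word would hold a vtable offset plus 1 and not a function address.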

In this case the compiler won’t check whether the method is virtual and will call it directly as a non-virtual one. It is just an example of an unusual optimisation, and it shouldn’t be used in real code.

While writing this article I also made a small program that prints the data stored in member method pointers and helps to understand their internals. It works with gcc and clang. Code is here.

After this investigation I understand how member method pointers work and how to deal with them. I hope it will be useful for someone.
