C++ — Tips for applying inline in Visual Studio

HyunsuYu
More Deeper C++ Programming Language
5 min read · Jul 2, 2024

Select an inline function

With MSVC, the functions to be expanded inline are selected through the following process:

  1. Check Build Options
  2. Find the functions for which inline expansion is requested
  3. Perform cost-benefit analysis

A closer look at each step is as follows:

1. Check Build Options

1.1. Inline function expansion options:

  • Request inline expansion : /Ob{1/2/3}
  • Prohibit inline expansion : /Ob0

1.2. Optimization options:

  • Request inline expansion : /O{1/2/x} (each implies /Ob2)
  • Prohibit inline expansion : /Od

2. Find the functions for which inline expansion is requested

2.1. Specifier

  • Request inline expansion : inline, _inline, __inline, _forceinline, __forceinline
  • Prohibit inline expansion : __declspec(noinline)

2.2. Attribute

  • Request inline expansion : [[msvc::forceinline]], [[msvc::forceinline_calls]]
  • Prohibit inline expansion : [[msvc::noinline]], [[msvc::noinline_calls]]

2.3. Pragma

  • Request inline expansion : #pragma auto_inline(on), #pragma inline_recursion(on)
  • Prohibit inline expansion : #pragma auto_inline(off), #pragma inline_recursion(off)
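
As a rough illustration of the specifier and pragma requests listed above, the sketch below marks a few hypothetical functions (Square, Twice, Checksum and Decrement are names invented here). The DEMO_ macros and their non-MSVC fallbacks are assumptions added so the file also compiles with other compilers; only the MSVC branch uses the keywords from the lists.

```cpp
// Wrapper macros are an assumption of this sketch, not something MSVC provides.
#if defined(_MSC_VER)
#define DEMO_FORCEINLINE __forceinline
#define DEMO_NOINLINE __declspec(noinline)
#else
#define DEMO_FORCEINLINE inline
#define DEMO_NOINLINE
#endif

// Specifier: an ordinary request, still subject to the compiler's judgment.
inline int Square(int x) { return x * x; }

// Specifier: a stronger request that asks MSVC to expand the function.
DEMO_FORCEINLINE int Twice(int x) { return x + x; }

// Specifier: asks MSVC not to expand this function inline.
DEMO_NOINLINE int Checksum(int x) { return Square(x) + Twice(x); }

// Pragma: functions defined while auto_inline is off are excluded from
// automatic inline expansion.
#if defined(_MSC_VER)
#pragma auto_inline(off)
#endif
int Decrement(int x) { return x - 1; }
#if defined(_MSC_VER)
#pragma auto_inline(on)
#endif
```

Whether any of these requests takes effect still depends on the build options from step 1; under /Ob0 or /Od nothing is expanded.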

The compiler cannot apply inline expansion to a function in the following cases:

  • The function or its caller is compiled with /Ob0
  • The function and its caller use different types of exception handling (C++ exception handling in one, structured exception handling in the other)
  • The function has a variable argument list
  • The function uses inline assembly, unless it is compiled with /Ox, /O1 or /O2
  • The function is recursive and ‘#pragma inline_recursion(on)’ is not set
  • The function is virtual and is called virtually (direct calls to virtual functions can still be inlined)
  • The call is made through a pointer to the function
  • The function is also declared with the ‘naked’ or ‘__declspec(noinline)’ specifier
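
To make three of the cases above concrete, here is a minimal sketch with hypothetical names (Shape, Triangle, SidesViaBase, ApplyViaPointer, SumAll). It only shows what such call sites look like in source; an optimizer may still resolve some of them in simple situations, so treat the comments as the general rule rather than a guarantee about generated code.

```cpp
#include <cstdarg>

struct Shape {
    virtual ~Shape() = default;
    virtual int Sides() const { return 0; }
};

struct Triangle : Shape {
    int Sides() const override { return 3; }
};

// Virtual call through a base reference: the target is chosen at run time.
int SidesViaBase(const Shape& shape) { return shape.Sides(); }

// Call through a function pointer: the callee is not named at the call site.
inline int Increment(int x) { return x + 1; }
int ApplyViaPointer(int (*fn)(int), int value) { return fn(value); }

// Variable argument list: listed above as a case that is not expanded inline.
int SumAll(int count, ...) {
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}
```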

If the compiler is unable to expand a function declared ‘__forceinline’ inline, it generates a level 1 warning, except in the following cases:

  • The function is compiled with /Od or /Ob0
  • The function is defined externally, in a library or another translation unit

A recursive function can be expanded inline only to the depth specified by ‘#pragma inline_depth’ (up to 255); beyond that depth, the recursive call is treated as an ordinary function call
Whether inline expansion is applied to a recursive function at all is controlled by ‘#pragma inline_recursion’
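
A minimal sketch of how the two pragmas might be combined is shown below. SumTo is a hypothetical function and the depth of 4 is an arbitrary value chosen for illustration; the pragmas are guarded so that other compilers simply skip them.

```cpp
#if defined(_MSC_VER)
#pragma inline_recursion(on)  // allow recursive calls to be expanded inline
#pragma inline_depth(4)       // arbitrary depth limit picked for this sketch
#endif

// Recursive sum 1 + 2 + ... + n; calls past the depth limit stay real calls.
inline int SumTo(int n) {
    return (n <= 0) ? 0 : n + SumTo(n - 1);
}
```

The result of the function is the same either way; the pragmas only influence how much of the recursion is unrolled into the caller.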

As a side note, a constructor defined inside the class body is implicitly a candidate for inline expansion

3. Perform cost-benefit analysis

In general, inline expansion is expected to improve performance by removing the following steps of a function call:

  • Calling the function (including placing the arguments and the object’s address on the stack)
  • Preserving the caller’s stack frame
  • Setting up a new stack frame
  • Passing back the return value
  • Restoring the previous stack frame
  • Returning

When the inlined body is smaller than the call sequence it replaces, it can also:

  • Reduce object file size
  • Reduce the amount of virtual memory paging
  • Reduce the number of instructions
  • Increase the instruction cache hit rate

However, inline expansion is expected to cause the following performance degradation if overused:

  • Increased object file size
  • More virtual memory paging
  • Lower instruction cache hit rate

Which way these effects tip is decided at the assembly level: the benefits tend to win when the code generated for the inlined function body is shorter than the code generated for an ordinary function call

The cost-benefit analysis weighs the above factors together through static analysis and applies inline expansion to a function only when the expansion is expected to improve performance
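
The contrast can be pictured with two hypothetical functions. This is only a sketch of the kind of trade-off described above; the actual decision is made by the compiler, and the comments describe expectations rather than measured results.

```cpp
struct Counter {
    int value = 0;

    // One-line accessor: the body is plausibly shorter than the call
    // sequence it would replace, so expanding it is likely to pay off.
    int Get() const { return value; }
};

// Longer body with a loop: copying it into many call sites grows the code,
// which is the kind of cost the analysis has to weigh.
inline int SumOfSquares(const int* data, int count) {
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += data[i] * data[i];
    }
    return total;
}
```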

Summary of selecting an inline function

The ‘inline’ and ‘__inline’ specifiers ask the compiler to insert a copy of the function body at each location where the function is called. The key point is that this is only a request, so the compiler is free to ignore it. Inlining also does not dissolve the function itself: each call site receives a copy of the body, and the original function remains as it was
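
One way to see that the original function remains is to take its address. In the sketch below (Add, AddDirect and AddViaAddress are hypothetical names), the direct call is a candidate for inline expansion, while the pointer requires Add to exist as an ordinary function.

```cpp
inline int Add(int a, int b) { return a + b; }

// Direct call: the compiler may replace it with a copy of Add's body.
int AddDirect(int a, int b) { return Add(a, b); }

// Taking the address means Add must still exist as a callable function.
int AddViaAddress(int a, int b) {
    int (*fn)(int, int) = &Add;
    return fn(a, b);
}
```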

MSVC honors an inline request only when its cost-benefit analysis judges the expansion to be worthwhile

Of course, if you disagree with MSVC’s cost-benefit analysis, you can bypass it with the ‘__forceinline’ specifier
However, even when the cost-benefit analysis is skipped, inline expansion is still not guaranteed

In addition, when compiling with /clr on MSVC, the compiler does not inline a function that has security attributes applied to it
For compatibility with earlier versions, ‘_inline’ and ‘_forceinline’ are synonyms for ‘__inline’ and ‘__forceinline’, respectively, unless the compiler option /Za (disable language extensions) is specified

Apply inline expansion

Once inline expansion is confirmed, it is applied slightly differently depending on whether the function is recursive

If the function is not recursive, a copy of the function body is inserted at each call site

However, if the function is recursive, inline expansion is applied only to a certain depth, governed by ‘#pragma inline_depth([n])’, and beyond that depth an ordinary function call is made
With ‘#pragma inline_recursion(on)’, recursive functions are expanded to a default depth of 16 calls, and n in ‘#pragma inline_depth([n])’ can be set from 0 to 255
That is, depending on whether the recursion goes beyond inline_depth, inline expansion may be applied as follows:

inline int GetFac(const int n)
{
    return (n > 2) ? n * GetFac(n - 1) : n;
}

Given “GetFac(int)” as in the code above, a call to “GetFac(6)” may be handled in one of the following ways:

If the recursion depth within the function is less than or equal to inline_depth

In this case, instead of a call to “GetFac(6)”, the precomputed value “720” may be inserted at compile time

If the recursion depth within the function is greater than inline_depth

In this case, instead of a call to “GetFac(6)”, a partially evaluated result such as “6 * GetFac(5)” may be inserted at compile time, with the rest of the recursion left as an ordinary call
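
The second case can be pictured by expanding the recursion by hand. The sketch below is a hand-written illustration, not actual compiler output: GetFac6ExpandedTwoLevels is a hypothetical name, and the depth of two levels is assumed purely for the example.

```cpp
// Same function as in the text.
inline int GetFac(const int n)
{
    return (n > 2) ? n * GetFac(n - 1) : n;
}

// What GetFac(6) might look like if only two levels were expanded inline.
inline int GetFac6ExpandedTwoLevels()
{
    // Level 1: 6 > 2, so the body becomes 6 * GetFac(5).
    // Level 2: 5 > 2, so GetFac(5) becomes 5 * GetFac(4).
    // GetFac(4) is past the assumed depth and stays an ordinary call.
    return 6 * (5 * GetFac(4));
}
```

Both forms produce the same value; the difference is only in how much of the recursion is resolved at the call site and how much is left to run-time calls.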
