Showing posts with label class hierarchy diagram. Show all posts
Showing posts with label class hierarchy diagram. Show all posts

Thursday, September 18, 2014

Devirtualization in C++, part 7 (Enforcing One Definition Rule)

This time I am going to write about a feature that I implemented as a side effect of the devirtualization machinery: the verification of One Definition Rule on types.

One Definition Rule (ODR) is one of more interesting cases where C and C++ differs in very subtle way. Wikipedia describes it as follows:
In short, the ODR states that:
  1. In any translation unit, a template, type, function, or object can have no more than one definition. Some of these can have any number of declarations. A definition provides an instance.
  2. In the entire program, an object or non-inline function cannot have more than one definition; if an object or function is used, it must have exactly one definition. You can declare an object or function that is never used, in which case you don't have to provide a definition. In no event can there be more than one definition.
  3. Some things, like types, templates, and extern inline functions, can be defined in more than one translation unit. For a given entity, each definition must be the same. Non-extern objects and functions in different translation units are different entities, even if their names and types are the same.
Some violations of the ODR must be diagnosed by the compiler. Other violations, particularly those that span translation units, are not required to be diagnosed.[1]
This means that C++ type names are global and should not be re-used in different ways in different source files. (In C doing so is perfectly valid and common, types across units are considered equivalent if their structure is.)

One of motivations for ODR is to make name mangling possible. Type names are used, for example, as program wide identifiers to distinguish functions and methods of same name but taking different types of arguments (function overloading):
struct A {int a;};
int getval (struct A a) {return a.a;}
struct B {char a;};
int getval (struct B a) { return a.a;}
This compilation unit will result in exporting two function symbols:  _Z6getval1A and _Z6getval1B. A and B comes from name of the argument's type.

Now, obviously, if another unit defines completely different structure A and completely different getval taking it as argument, the function will also be called _Z6getval1A and things will fail miserably. General violations of ODR can not be diagnosed by compiler working on single translation unit in isolation. If one is lucky, ODR violation turns into linker error about duplicated symbol name. With presence of inline functions however this will not happen and one of the two functions will be eliminated from program. This may result in invocation of a function on wrong data crashing program in random ways.

Wednesday, August 20, 2014

Devirtualization in C++, part 6 - asking user for help

In previous posts of the series I described most of the new GCC devirtualization machinery. To cut the story short, GCC implements two kinds of devirtualization: the full and speculative one.

Full devirtualization replaces given polymorphic by a direct call (that can get inlined later).  Speculative devirtualization is a weaker form turning polymorphic call to a conditional testing whether the target is the predicted one and going by direct call path if it happens to be so and doing the indirect call otherwise.

Whole performing speculative devirtualization the compiler can play somehwat unsafe and make assumptions that are not valid 100% of cases. The main assumption made is that there are no new derived types introduced by code not visible to compiler. With link time optimization this is usually true, but not necessarily so - new derived types can be introduced by libraries or plugins. While compiler can generally attempt to track what instances may not be introduced by external code, this analysis is difficult and it is a question whether it can be implemented with reasonable precision within statically optimizing compiler.

The following example shows speculative devirtualization:
struct A {virtual void foo() {}};
void test (struct A *a)
{
  a->foo();
}
 Here GCC 4.9 produces:
test(A*):
        movq    (%rdi), %rax
        movq    (%rax), %rax
        cmpq    $a::foo(), %rax
        jne     .L5
        ret
.L5:
        jmp     *%rax
Instead of:
test(A*):
        .cfi_startproc
        movq    (%rdi), %rax
        movq    (%rax), %rax
        jmp     *%rax
If the target of polymorphic call happens to be a::foo(), the speculatively devirtualized code is significantly faster saving the call overhead (and enabling further optimizations).

Monday, February 17, 2014

Devirtualization in C++, part 4 (analyzing the type inheritance graph for fun and profit)

Last time I described the construction of the type inheritance graph, a main datastructure used by special purpose devirtualization algorithms. Now it is time for some real optimization fun! In this post I will review the most elementary transformations and will show some real world data.

Sunday, February 9, 2014

Devirtualization in C++, part 3 (building the type inheritance graph)

Having covered the basic optimizations done by the C++ frontend and the way how generic middle-end optimizations may devirtualize. It is a time to dive into an wonderful world if special-purpose middle-end devirtualization techniques.

Topic of optimization of polymorphic calls is seemingly well covered by the literature with active research since the advent of object oriented programming (see, for example, DF Bacon, PF Sweeney: Fast static analysis of C++ virtual function calls and the numerous papers citing thisone). Pretty much all of the techniques are based on constructing and analyzing the type inheritance graph.

Unfortunately most of the research papers make assumptions that are not met in modern C++ programs. Here the situation is complicated by many C++ features including interesting implementation details of multiple and virtual inheritance partly described in this post and ability to make dynamic type changes. Important is also the fact that the compiler hardly ever see the whole program at once. Even with the link-time optimization one has to take into account the presence of shared libraries. Because shared libraries are supposed to be upgradeable without recompiling programs that use them, the compiler can not make any analysis beyond what it is given in the header files and as explained last time sometimes not even that. Moreover shared libraries are also often dynamically loaded as plugins. Contemporary C++ programs are really a lot more dynamic than most of the research assume.

Today I thus cover the seemingly simple topic of building and walking the type inheritance graph and show details one has to think about. It may seem exhausting, but so was my way to gain all this wisdom. :) I promise that next time I finally get to some real optimization fun.