ABI
Abstract
This page is a work in progress.
Table of
Definitions
ABI
In general (not considering C++), an Application Binary Interface is an interface between compiled code modules. “Compiled” implies the modules contain machine code, rather than human-readable source code.
When we consider C++, the ABI for a platform defines how a compiler generates machine code for several language features. The Itanium ABI provides a convenient definition; it is both general enough to be correct and specific enough to be interesting:
[The C++ ABI is] the object code interfaces between different user-provided C++ program fragments and between those fragments and the implementation-provided runtime and libraries. This includes the memory layout for C++ data objects, including both predefined and user-defined data types, as well as internal compiler generated objects such as virtual tables. It also includes function calling interfaces, exception handling interfaces, global naming, and various object code conventions.
We look a bit closer at some of these language features below.
REVIEW(sean): Is Itanium’s list of language features sufficiently exhaustive for our needs?
Note: above we say, “the ABI for a platform”, rather than “the C++ Standard ABI”. Indeed, an ABI is only relevant when considering specific compilers and hardware, and the C++ Standard does not attempt to standardize an ABI for all conforming platforms. It does, however, enforce certain restrictions that have impacts on any ABI, such as alignment requirements for standard types. CITATION NEEDED.
Memory Layout for Objects & Virtual Tables
Do base class members come before or after the derived members? Multiple inheritance: left to right? right to left? pack anything cleverly? Note this is where the Empty Base Optimization takes place.
Calling Conventions
For Itanium, For trivial types, same as C For non trivial types: 1. Allocate space for the class on the stack 2. The caller invokes the copy constructor 3. The address is passed as a normal argument 4. The caller invokes the destructor.
This convention is arbitrary. The only way compiled modules know to behave this way is by adhering to the convention. A precompiled binary cannot (easily? at all?) change its convention without recompiling.
Triviality of class can determines whether a struct is passed in registers or on the stack; defaulting destructors can break this ~Foo() = default
vs ~Foo() {}
.
Exception Handling
Name Mangling
Other
Vtable layout, RTTI Layout Exception Handling
One Definition Rule (ODR)
ODR in the Standard ODR in cppreference
“Why can’t the linker catch these?”
ODR-used
I’ll make the argument that you already know what ODR-used means, just with a different intuition. To ODR-use a name is to use it in some way that requires the definition to be present at link time. If you’ve ever gotten a linker error, that means you’ve ODR-used an entity, and the linker could not find the definition.
Naturally, then, forward declarations
1
2
3
4
void foo(); // like this,
class T bar(class U u); // or this,
class V; // or this,
using X = Baz; // or this (since Baz may be incomplete at this line)
would not constitute ODR-use. By contrast, each of these lines:
1
2
foo(); // calling a function
T my_t; // declaring a variable of a specific type
does constitute ODR-use. And of course, this is not exhaustive. How many ways can you think of to cause a linker error? They’re all examples of ODR-using a name.
In summary, you can think of “ODR-used” as “possibly causing a linker error”.
TODO: Cover the cases where this intuition is misguided. TODO: Add some examples with static data members, that’s where stuff gets a bit wonky.