Clang recently got a new calling convention that makes these tail calls much cheaper (avoids the need for the caller to preserve some registers). I can never remember the name - it’s either preserve_all or preserve_none (whose perspective is the preservation from?).
This will make [[musttail]] + preserve_none a winning combination, particularly when making non-tail calls to fallback functions that use a regular calling convention, because the arguments to the [[musttail]] functions can stay pinned in callee-saved registers, which the fallback's regular calling convention is obliged to preserve.
preserve_all also exists, and has existed for a while. You could use it on fallback functions to help the tail-calling functions avoid spilling. But this always seemed like an unsatisfactory solution to me, because it's too intrusive (and requires too much diligence) to tag a bunch of regular functions as preserve_all. It's much more practical to tag all the core interpreter functions as preserve_none.
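To make the shape of this concrete, here's a minimal sketch (my own toy example, not any real interpreter; the opcode names are made up and it needs a recent Clang for preserve_none and musttail). Every hot handler uses preserve_none and chains to the next handler with a guaranteed tail call, while the cold fallback keeps the default convention:

    #include <stdint.h>
    #include <stdio.h>

    #define PRESERVE_NONE __attribute__((preserve_none))
    #define MUSTTAIL      __attribute__((musttail))

    /* Every handler shares this signature; the function pointers in the
       dispatch table carry the preserve_none convention so the musttail
       calls line up. */
    typedef int64_t (PRESERVE_NONE *handler_fn)(const uint8_t *ip, int64_t acc);

    PRESERVE_NONE static int64_t op_halt(const uint8_t *ip, int64_t acc);
    PRESERVE_NONE static int64_t op_inc(const uint8_t *ip, int64_t acc);
    PRESERVE_NONE static int64_t op_print(const uint8_t *ip, int64_t acc);

    /* Dispatch table indexed by opcode byte: 0 = halt, 1 = inc, 2 = print. */
    static handler_fn const dispatch[] = { op_halt, op_inc, op_print };

    /* Cold fallback using the regular calling convention.  preserve_none
       deliberately passes leading arguments in registers that the regular
       convention treats as callee-saved, so ip and acc can stay put across
       this call instead of being spilled. */
    static void report(int64_t acc) { printf("acc = %lld\n", (long long)acc); }

    PRESERVE_NONE static int64_t op_inc(const uint8_t *ip, int64_t acc) {
        acc += 1;
        MUSTTAIL return dispatch[ip[1]](ip + 1, acc);  /* guaranteed tail call */
    }

    PRESERVE_NONE static int64_t op_print(const uint8_t *ip, int64_t acc) {
        report(acc);                                   /* non-tail fallback call */
        MUSTTAIL return dispatch[ip[1]](ip + 1, acc);
    }

    PRESERVE_NONE static int64_t op_halt(const uint8_t *ip, int64_t acc) {
        (void)ip;
        return acc;
    }

    int main(void) {
        /* inc, inc, inc, print, halt */
        static const uint8_t program[] = { 1, 1, 1, 2, 0 };
        printf("result = %lld\n", (long long)dispatch[program[0]](program, 0));
        return 0;
    }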
I see signs that Google itself is using preserve_none internally, since the public protobuf repo has a PROTOBUF_CC (calling convention) macro, but it is defined to nothing:
#define PROTOBUF_CC
Is there any chance of this getting out into the wild or is it too dangerous for us mortals?
Since preserve_none is Clang-only and only available in recent versions, it would introduce an ABI hazard between the core library and the generated code. We don't want to cause crashes if you compile protobuf with one compiler and your generated code with another.
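A purely hypothetical sketch of what an opt-in could look like (not protobuf's actual policy; the opt-in macro name is made up): the faster convention would have to be gated behind an explicit "everything is built with the same toolchain" switch, because a feature check in one translation unit can't see what compiler the other side of the ABI boundary was built with.

    #if defined(__clang__) && defined(MYPROJECT_SINGLE_TOOLCHAIN)  /* hypothetical opt-in */
    #  if __has_attribute(preserve_none)
    #    define PROTOBUF_CC __attribute__((preserve_none))
    #  endif
    #endif
    #ifndef PROTOBUF_CC
    #  define PROTOBUF_CC   /* default: regular calling convention everywhere */
    #endif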
On 1: this small amount of overhead matters because the amount of work you do on each opcode can be tiny. The extra jump could be 20% of your runtime!
On 2: yes, this helps the indirect branch target predictor. In a real program you'll often get repeated sequences of opcodes (increment, then compare) that can be predicted. These branch predictors can be pattern-based.
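To illustrate what the per-opcode prediction buys you: in a single switch-based loop every opcode transition funnels through one indirect branch, whereas with replicated dispatch (computed goto here, tail calls elsewhere in the thread) each handler ends in its own indirect jump that only has to predict that handler's successors. A toy sketch with made-up opcodes (labels-as-values is a GCC/Clang extension):

    #include <stdint.h>
    #include <stdio.h>

    enum { OP_HALT, OP_INC, OP_CMP };

    int main(void) {
        /* inc, cmp, inc, cmp, halt -- the kind of repeated pattern that a
           per-handler indirect branch learns easily. */
        static const uint8_t code[] = { OP_INC, OP_CMP, OP_INC, OP_CMP, OP_HALT };
        static void *const handlers[] = { &&do_halt, &&do_inc, &&do_cmp };
        const uint8_t *ip = code;
        int64_t acc = 0, flag = 0;

        goto *handlers[*ip];

    do_inc:
        acc += 1;
        goto *handlers[*++ip];   /* this branch only sees what follows an INC */

    do_cmp:
        flag = (acc > 1);
        goto *handlers[*++ip];   /* this branch only sees what follows a CMP */

    do_halt:
        printf("acc = %lld, flag = %lld\n", (long long)acc, (long long)flag);
        return 0;
    }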
Why do you think that? How complicated could it possibly be? Also as a person becomes more of an expert in a specific thing, they make progress much faster (if they remain motivated).
Also it’s just rewriting the logic. One could theoretically write unit tests for every function to make sure the inputs/outputs match.
> A linked list element won't move once allocated.
This only pertains to languages like C. In a virtual-machine language like Java, this property doesn't even exist (there's no such thing as an address, at least as far as the language is concerned).
A reference in Java can essentially be thought of as a pointer into a virtual address space. The JVM can move an object around in the system's actual memory, but as far as anything holding a reference to it is concerned, nothing has changed.
And a reference to an object that's stored in an array won't be invalidated if that object is moved around within the array or even removed from it entirely. That kind of invalidation is still a C concern.