I understand that it does so when the exact type is observed, i.e. for a direct call on a concrete type. But I was wondering whether it performs whole-program optimization for interface calls. E.g., given a simple AOT-compiled C# program:

    using System.Runtime.CompilerServices;

    var bar = new Bar();
    var number = CallFoo(bar);

    Console.WriteLine(number);

    // Do not inline to prevent observing exact type
    [MethodImpl(MethodImplOptions.NoInlining)]
    static int CallFoo(Foo foo) {
        return foo.Number();
    }

    interface Foo {
        int Number();
    }

    class Bar: Foo {
        public int Number() => 42;
    }
On x86_64, 'CallFoo' compiles to:

    CMP byte ptr [RDI],DIL ;; null-check foo[0]
    MOV EAX,0x2a ;; set 42 to return value register
    RET
There is no interface call. In the above case, the linker reasons that throughout the whole program only `Bar` implements `Foo`, so all calls on `Foo` can be replaced with direct calls on `Bar`, which are then subject to other optimizations like inlining.

In fact, if we add and reference a second implementation of `Foo` - `Baz` which returns 8, `CallFoo` becomes

    ;; calculate the addr. of Bar's methodtable pointer
    LEA    RAX,[DevirtExample_Bar::vtable]
    MOV    ECX,0x8 ;; set ECX to 8
    MOV    EDX,0x2a ;; set EDX to 42
    ;; compare methodtable pointer of foo instance with Bar's
    CMP    qword ptr [RDI],RAX
    ;; set return register EAX to value of EDX, containing 42
    MOV    EAX,EDX
    ;; if comparison is false, set EAX to value of ECX containing 8 instead
    CMOVNZ EAX,ECX
    RET
Which is effectively 'return foo is Bar ? 42 : 8;'.

Despite my criticism of Go's capabilities, I am interested in how its implementation is evolving. I know it lets you manually gather a static PGO profile and apply it at compile time, which inserts guarded devirtualization fast paths on interface calls, like what OpenJDK's HotSpot and .NET's JIT do automatically. But I was wondering whether it does any whole-program or inter-procedural optimizations, which can be very effective in the "frozen world, single static module" setting that both Go and .NET AOT compilation share.
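For reference, a guarded devirtualization fast path is roughly equivalent to the following hand-written Go. This is a source-level sketch, not compiler output: `callFooGuarded` is a hypothetical name, and `Baz` mirrors the second implementation from the C# example above.

```go
package main

import "fmt"

type Foo interface {
	Number() int
}

type Bar struct{}

func (b *Bar) Number() int { return 42 }

// Baz is an assumed second implementation, mirroring the C# example.
type Baz struct{}

func (b *Baz) Number() int { return 8 }

// callFooGuarded checks for the expected concrete type first (the "guard").
// On a hit, the call is a static call on *Bar that the compiler can inline;
// on a miss, it falls back to ordinary dynamic dispatch.
//
//go:noinline
func callFooGuarded(foo Foo) int {
	if bar, ok := foo.(*Bar); ok {
		return bar.Number() // fast path: direct call on a concrete type
	}
	return foo.Number() // slow path: interface call
}

func main() {
	fmt.Println(callFooGuarded(&Bar{})) // prints 42
	fmt.Println(callFooGuarded(&Baz{})) // prints 8
}
```

The difference from the whole-program case is that the guard never goes away: without proof that `Bar` is the only implementation, the fallback interface call has to stay.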

EDIT: To answer my own question, I checked the same thing for Go. Given a simple Go program:

    package main

    import (
        "fmt"
    )

    func main() {
        bar := &Bar{}
        num1 := callFoo(bar)

        fmt.Println(num1)
    }

    //go:noinline
    func callFoo(foo Foo) int {
        return foo.Number()
    }

    type Foo interface {
        Number() int
    }

    type Bar struct{}

    func (b *Bar) Number() int {
        return 42
    }
'callFoo' compiles to

    CMP        RSP,qword ptr [R14 + 0x10]
    JBE        LAB_0108ca68
    PUSH       RBP
    MOV        RBP,RSP
    SUB        RSP,0x8
    MOV        qword ptr [RSP + foo_spill.tab],RAX
    MOV        qword ptr [RSP + foo_spill.data],RBX
    MOV        RCX,qword ptr [RAX + 0x18] ;; load vtable slot?
    MOV        RAX,RBX
    NOP
    CALL       RCX ;; call the address loaded from the vtable?
    ADD        RSP,0x8
    POP        RBP
    RET
    LAB_0108ca68:
    MOV        qword ptr [RSP + foo_spill.tab],RAX
    MOV        qword ptr [RSP + foo_spill.data],RBX
    CALL       runtime.morestack_noctxt                 
    MOV        RAX,qword ptr [RSP + foo_spill.tab]
    MOV        RBX,qword ptr [RSP + foo_spill.data]
    JMP        main.callFoo
It appears that no devirtualization of this kind takes place. Writing about this, it makes for an interesting thought experiment: what would it take to introduce a CIL back-end for Go (including proper export of types, and what about structurally matched interfaces?) and AOT compile it with .NET?
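To illustrate the structural-matching concern: a Go type satisfies an interface purely by its method set, without ever naming the interface, whereas .NET interfaces are nominal. A minimal sketch, where `Qux` and `describe` are hypothetical names made up for this example:

```go
package main

import "fmt"

type Foo interface {
	Number() int
}

// Qux never mentions Foo anywhere; it satisfies Foo only because
// its method set contains Number() int.
type Qux struct{}

func (Qux) Number() int { return 7 }

// describe checks at run time whether an arbitrary value satisfies Foo.
func describe(v any) string {
	if f, ok := v.(Foo); ok {
		return fmt.Sprintf("Foo with Number() = %d", f.Number())
	}
	return "not a Foo"
}

func main() {
	fmt.Println(describe(Qux{}))   // prints "Foo with Number() = 7"
	fmt.Println(describe("hello")) // prints "not a Foo"
}
```

A hypothetical CIL back-end would need some way to express this kind of match, since the set of interfaces a type satisfies is not written down on the type itself.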

[0]: VMs like OpenJDK and .NET perform null checks via hardware exceptions. That is, a SIGSEGV handler is registered, and accesses that must throw an NRE or NPE do so either via an induced load from memory, as above, or simply by virtue of dereferencing a field of an object reference. If the pointer is null, the access raises SIGSEGV, and the handler checks whether the faulting address lies within the first, say, 64KiB of the address space. If it does, VM logic kicks in that recovers the execution state and performs managed exception handling, such as running `finally` blocks and resuming execution from the corresponding `catch` handler.
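The observable behavior has a close analogue in Go, where a nil dereference surfaces as a recoverable run-time panic rather than killing the process. As far as I know the Go runtime also relies on the hardware fault for this on Unix-like systems, but treat that as an assumption; the sketch below only demonstrates the language-level behavior. `node` and `readValue` are hypothetical names.

```go
package main

import "fmt"

type node struct{ value int }

// readValue dereferences n. If n is nil, the dereference panics with a
// runtime error, which the deferred function recovers and converts into
// an ordinary error (loosely comparable to a catch handler).
func readValue(n *node) (v int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return n.value, nil
}

func main() {
	fmt.Println(readValue(&node{value: 3}))
	fmt.Println(readValue(nil))
}
```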


