I don’t have extensive experience with this yet, but I believe custom fonts are also a solved problem. Serve the font files from the same origin as the website, preload only the primary font style (say “normal”), and subset the font to just the Latin characters. That should be fast enough that almost no one will notice, except for pedantic developers like us (personally, I can forgive that).
From there, let the others (styles, variable, etc.) kick in as needed.
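For what it's worth, here is a rough sketch of that setup, written in Python only because it is easy to run. The family name "Example Sans" and the file paths are made up, and `font-display: swap` is my own assumption rather than something stated above. It emits one preload tag for the primary style and an `@font-face` rule per style; browsers generally fetch the non-preloaded files only once some text actually uses them.

```python
from html import escape


def preload_tag(href: str) -> str:
    """Build a preload hint for one same-origin WOFF2 file."""
    # The crossorigin attribute is needed even for same-origin fonts,
    # because browsers fetch fonts in anonymous CORS mode.
    return (
        f'<link rel="preload" href="{escape(href, quote=True)}" '
        'as="font" type="font/woff2" crossorigin>'
    )


def font_face(family: str, href: str, style: str = "normal") -> str:
    """Build an @font-face rule for one style of the family."""
    lines = [
        "@font-face {",
        f'  font-family: "{family}";',
        f"  font-style: {style};",
        f'  src: url("{href}") format("woff2");',
        "  font-display: swap;",  # assumption: show fallback text while loading
        "}",
    ]
    return "\n".join(lines)


# Hypothetical family name and paths, for illustration only.
FAMILY = "Example Sans"
FILES = {
    "normal": "/fonts/example-sans-latin.woff2",
    "italic": "/fonts/example-sans-latin-italic.woff2",
}

# Preload only the primary style; declare every style in CSS.
head = [preload_tag(FILES["normal"])]
css = [font_face(FAMILY, href, style) for style, href in FILES.items()]

if __name__ == "__main__":
    print("\n".join(head))
    print("\n\n".join(css))
```

The italic file is declared but not preloaded, so it costs nothing on pages that never use italics.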
You can also subset your fonts: if your content is in a language that uses the Latin alphabet, you only need to include those characters in the font. Between that, variable fonts, and WOFF2, I've managed to get Inter down to 50 kB (plus another 50 kB if you need real italics).
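Before subsetting, it can be worth checking that your content really fits the subset. A minimal sketch, assuming "Latin" means Basic Latin plus the Latin-1 Supplement (that choice of ranges is mine, not something from the comment above); the actual glyph removal would still be done by a real subsetter such as fontTools' pyftsubset.

```python
# Assumed coverage: printable Basic Latin plus Latin-1 Supplement.
LATIN_RANGES = [(0x0020, 0x007E), (0x00A0, 0x00FF)]


def in_latin_subset(ch: str) -> bool:
    """True if the character falls inside one of the assumed ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in LATIN_RANGES)


def uncovered_characters(text: str) -> set[str]:
    """Non-whitespace characters that a Latin-only subset would miss."""
    return {ch for ch in text if not ch.isspace() and not in_latin_subset(ch)}


def unicode_range_descriptor(ranges=LATIN_RANGES) -> str:
    """Format the ranges as a CSS unicode-range value."""
    return ", ".join(f"U+{lo:04X}-{hi:04X}" for lo, hi in ranges)


if __name__ == "__main__":
    sample = "Café au lait costs 3 € → cheap"
    print(sorted(uncovered_characters(sample)))
    print("unicode-range:", unicode_range_descriptor())
```

Anything the check reports (curly quotes, arrows, and currency symbols are common culprits) either goes into the subset or falls back to the next font in the stack.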
Partly, the answer is “tough”. As a designer, you don’t and aren’t meant to have pixel-level control over the screen contents. Web is not print. Don’t ask for the PostScript standard fourteen. (Somehow this lesson comes through much better for reflowable ebooks.)
Partly, I am willing to admit that web fonts are still nice when you can get them. But they’re too unwieldy to block on (slow connections exist; font foundries are assholes[1]; etc.), and we don’t really have a solution (the problem with FOUC is not the unstyled content, it’s the layout shift).
While I'm absolutely not a design-should-rule-all person, I think there's quite a range between "pixel-level control" and "you can't choose which font to use".
If we reliably had the top 50 Google Fonts on every OS, far fewer web fonts would be used.
system-ui
Glyphs are taken from the default user interface font
on a given platform. Because typographic traditions vary
widely across the world, this generic is provided for
typefaces that don't map cleanly into the other generics.
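In practice that generic usually sits in a fallback stack after the web font. A small sketch of building such a stack (the helper name `font_stack` is hypothetical, and the set of generics is deliberately incomplete); the one detail that matters is that generic keywords such as system-ui must stay unquoted, since a quoted name is treated as an ordinary family name.

```python
# A partial list of CSS generic family keywords; these must not be quoted.
GENERIC_FAMILIES = {
    "serif", "sans-serif", "monospace", "cursive", "fantasy", "system-ui",
}


def font_stack(*families: str) -> str:
    """Build a font-family declaration, quoting only non-generic names."""
    parts = []
    for name in families:
        if name in GENERIC_FAMILIES:
            parts.append(name)          # keyword: leave bare
        else:
            parts.append(f'"{name}"')   # real family name: quote it
    return "font-family: " + ", ".join(parts) + ";"


if __name__ == "__main__":
    print(font_stack("Inter", "system-ui", "sans-serif"))
```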