I think it was Microsoft that proposed an approach. They'd modify games to continually speculatively execute and render the result of every possible user input. So, for example, they might render you beginning to run left/right/forward/back as well as jumping and shooting. When you actually change the input, the local device switches to the matching stream and the server starts speculatively executing all over again from there.
It probably works best if the game engine cooperates. But that's not necessary. You can just split processes on the OS and run each different bit of user input in a different process with no cooperation from the process. (Though I admit this might be tricky on current hardware and heavy games.) Given enough compute and bandwidth, you could do this continually.
In theory, with unlimited compute/bandwidth this means you can have local latency (just the cost of input/stream switching) because you could speculatively execute every possible input to the game, all the time, out to the latency duration. In practice, a real system would probably prune things based on the likely inputs and only speculate a bit out. This is probably enough to provide a smooth experience for most users that aren't playing competitively.
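To make the branching concrete, here's a rough sketch of the idea as I understand it, not how any real system does it. The input set, the toy `step` function, and the names `speculate` and `pick` are all made up for illustration; a real game would be rendering frames, not moving a point around.

```python
from itertools import product

# Hypothetical discrete input set; real games have far more (and continuous) inputs.
INPUTS = ("left", "right", "jump", "none")

def step(state, inp):
    """Toy deterministic game step: state is an (x, y) position."""
    x, y = state
    if inp == "left":
        return (x - 1, y)
    if inp == "right":
        return (x + 1, y)
    if inp == "jump":
        return (x, y + 1)
    return (x, y)

def speculate(state, depth, likely=INPUTS):
    """Precompute the outcome of every input sequence of length `depth`.

    `depth` stands in for the number of frames covered by the network latency.
    Passing a smaller `likely` set is the pruning step.
    """
    branches = {}
    for seq in product(likely, repeat=depth):
        s = state
        for inp in seq:
            s = step(s, inp)
        branches[seq] = s
    return branches

def pick(branches, actual_inputs):
    """Switch to the branch matching the real inputs; None means a misprediction."""
    return branches.get(tuple(actual_inputs))

full = speculate((0, 0), depth=3)
pruned = speculate((0, 0), depth=3, likely=("right", "none"))
print(len(full), len(pruned))                      # 64 8
print(pick(full, ["right", "right", "jump"]))      # (2, 1)
print(pick(pruned, ["left", "none", "none"]))      # None
```

The branch count is (number of inputs) to the power of (frames of latency), which is why pruning matters: cutting 4 inputs down to 2 takes you from 64 branches to 8 here, at the cost of a miss when the player does something you didn't speculate on.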
If you think about a game as a mapping from a limited set of user inputs to a 2D image, some optimizations start coming out, I suppose.
But that sounds almost impossibly computationally expensive for 3D games and the like. Furthermore, most game inputs aren't discrete but continuous, making the problem even harder.
They tested with Doom 3 and Fable 3. I don't recall the specifics but I'm gonna guess that the actions people take are really quite limited, so with a bit of work you can probably guess what they're going to do enough to make things playable.
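As a toy illustration of "guess what they're going to do" (my own guess at the flavor of it, not what they actually used), you could count which input tends to follow which and only speculate on the top few. The `train` and `likely_next` names and the sample history are invented for the example.

```python
from collections import Counter, defaultdict

def train(history):
    """Count which input tends to follow each input (a first-order model)."""
    table = defaultdict(Counter)
    for prev, nxt in zip(history, history[1:]):
        table[prev][nxt] += 1
    return table

def likely_next(table, last_input, k=2):
    """Return up to k of the most frequent follow-ups to last_input."""
    return [inp for inp, _ in table[last_input].most_common(k)]

# Made-up input log for one player.
history = ["fwd", "fwd", "fwd", "shoot", "fwd", "fwd", "left", "fwd", "shoot"]
table = train(history)
print(likely_next(table, "fwd"))  # ['fwd', 'shoot']
```

Only the inputs returned by `likely_next` would get a speculative branch; anything else falls back to the normal round trip.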