On a reasonable machine (e.g., x86, x86_64, ARM, ARM64), in a reasonable context (i.e., not the very tippy top stack frame), the one-past-the-end read is harmless. The value is unused, and the read itself is not going to segfault because it's going to read the frame pointer or return address out of the stack frame.
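For reference, the loop under discussion looks something like this -- my reconstruction of the SATD() example mentioned later in the thread, so details may differ from the original. The loop-update expression loads d[16], one past the end, on the final iteration, but that value is never used because the k < 16 test fails immediately afterwards:

    int d[16];

    int SATD(void)
    {
        int satd = 0, dd, k;
        /* on the last iteration, dd = d[++k] reads d[16] -- one past
           the end -- but k < 16 then fails and dd is never used */
        for (dd = d[k = 0]; k < 16; dd = d[++k])
            satd += (dd < 0 ? -dd : dd);
        return satd;
    }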
... unless your implementation does bounds checking at runtime. This actually happens with the new crop of sanitizers, or even with the hardware-accelerated bounds-checking features of modern Intel CPUs (e.g., MPX).
It would be a shame if people had to disable critical safety features to work around crashes caused by such "harmless" out-of-bounds reads.
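For example, assuming the sketch above is saved as satd.c (filename hypothetical) together with a main() that calls SATD(), AddressSanitizer in GCC or Clang will catch the read at run time -- at least at the default optimisation level, where the dead load survives:

    $ cc -g -fsanitize=address satd.c -o satd
    $ ./satd    # aborts with a global-buffer-overflow report on d[16]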
That's also taking the view that the code is executing pointless busy work, as the result is meaningless -- under those circumstances it seems fair enough to me to optimise the whole thing away, although I suppose the compiler ought to put in some "random" return value. Four would probably do nicely, as it's already been proven to be the best random number.
When you say "optimize the whole thing away", do you mean that it's reasonable to skip that particular load, or reasonable to skip the entire loop? Anton and I (and I presume 'swolchok') are all for reordering the code to avoid the unnecessary load, and if it had some benefit, we would probably be fine with replacing it with some random or non-random number.
The part that makes us cringe is for the compiler to reason backward from the undefined load to removing the entire loop, even for the values that are within range. While accepting that the compiler would be spec-compliant in doing so (replacing main() with "return 0;" would also be spec-compliant), we question whether that really makes for a better/faster/safer user experience. Essentially, we think that the clear intent of the code should have greater influence on which optimizations are chosen.
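To make the distinction concrete, here is a sketch (my illustration, not actual compiler output) of the two transformations being contrasted:

    extern int d[16];

    /* Option A: keep the loop, drop only the dead out-of-bounds load.
       Every in-range iteration behaves exactly as before. */
    int SATD_optionA(void)
    {
        int satd = 0, k;
        for (k = 0; k < 16; k++) {
            int dd = d[k];
            satd += (dd < 0 ? -dd : dd);
        }
        return satd;
    }

    /* Option B: reason backward from the unconditional undefined load
       and discard the computation entirely -- spec-compliant, since any
       behaviour is permitted once UB is invoked, but it ignores the
       evident intent of the code. */
    int SATD_optionB(void)
    {
        return 0;
    }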
I do agree that this isn't quite so straightforward, but I also can't see that it's necessary for the compiler to guess what is meant here. Even supposing the array is initialised elsewhere, there is a read past the end of the array -- I hope we'd both agree that a segfault in response is perfectly reasonable.
If that's the case then maybe forcing a segfault is better than optimising the loop away, but I do err on the compiler's side here. As others have pointed out, the compiler warns that the code invokes undefined behaviour and probably doesn't do what the author thinks it does. That diagnostic also isn't required by the standard.
The clear intent of the code certainly isn't clear to me -- a calculation involving a value read from past the end of the array tells me only that the return value can't be used in any meaningful way, and if that's the case then why not eliminate the code?
[EDIT] OK, just re-read the code -- I guess the value is actually unused, so the function's return value is presumably fine, but I think a segfault is still a reasonable option for the compiler.
A segmentation violation when doing the out-of-bounds access would be a possible behaviour of C* code.
Concerning the intent: Note that the result of the out-of-bounds access is not being used. Looks to me like SATD() just produces the sum of absolute values of d.
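A minimal sketch of that reading (SATD_intent is my name for it), with no out-of-bounds access at all:

    #include <stdlib.h>   /* abs() */

    extern int d[16];

    /* The apparent intent of SATD(): the sum of absolute values of d. */
    int SATD_intent(void)
    {
        int satd = 0;
        for (int k = 0; k < 16; k++)
            satd += abs(d[k]);
        return satd;
    }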