Is making me twitchy, I mean I'm sure we're on two's complement machines and there is no C compiler in the world where those two could be unequal but still subtract to zero* , but still, how about one of
x |= (MAC received[i] != MAC computed[i]);
x |= (MAC received[i] ^ MAC computed[i]);
Otherwise, Good stuff!
* Possible break: If we have a separate sign bit, then we might have 0 and -0, which would subtract to give 0. Don't know of any machine/compiler that does that though.
Your != proposed change likely results in a branch, which is what we're trying to avoid. The ^ version is fine, but the original subtraction is too as long as they're unsigned.
Are you sure? All architectures I am aware of have mechanisms to do this in a branchless way. On my machine:
int main(int argc, char argv) {
int x = 0;
x |= (argc != 1);
return x;
}
Relevant portion of assembly (no optimizations):
cmpl $1, -20(%rbp)
setne %al
movzbl %al, %eax
orl %eax, -4(%rbp)
Most other architectures have one of:
1) condition codes that you can use to predicate an "or $r1, $r1, 1" (e.g., ARM)
2) compare instructions that put a 1 or a 0 in a result register, which can be directly ORed in. (e.g., MIPS)
Of course, that code only works if you are either not writing in C89 or if your compiler and ISA conspire to always return exactly one from true boolean expressions. It is my understanding that the C89 standard only requires "nonzero," so you might need "a |= (b != c) ? 1 : 0;*" instead.
YMMV depending on compiler and architecture, but the 2 or 3 platforms I tried without passing any opti flags were branchless.
And why is the original way of not completing the for cycle for some keys wrong? Is the difference between checking the whole array and checking only lets say first byte measurable? Especially when considering that this would probably be done over a network?
The current state of the art in remote measurement (which is in its infancy; the first papers to apply basic signal processing on measurement over IP networks are just coming out) suggest that the timeable thresholds are in the tens of microseconds over a WAN.
But that doesn't matter, because over switched "local" networks (read: from one end of a data center to another), the thresholds are in nanoseconds; everything is timeable. Attackers will spend the $30 to get a machine in the same hosting center as you as soon as the other attacks get less trivial.
Yes, it is measurable. Remote timing attacks are scarily practical -- even a single cycle difference can be a problem, because an attacker can many many attempts and compute the average time in order to kill the noise.
x |= MAC_computed[i] − MAC_computed[i];
Shouldn't one side or the other (of the minus operation) be MAC_received[i]?