How is it that random data would only slow it down? If I add a field of completely random characters to my response, with a random length of, say, 0 to 50, it should completely throw off this attack, since the length of the output will change from request to request.
I suppose, given infinite time, you could send the same request over & over and map the variance of content lengths, and get an idea of what the actual content length was before random padding? But the compression seems to throw that off even more AFAIK - because the data we pad with is random, it could very well accidentally compress well because of the rest of the data in the response, further throwing off any guesses.
Edit: From the pdf on breachattack.com:
While this measure does make the attack take longer, it does so only slightly.
The countermeasure requires the attacker to issue more requests, and measure the
sizes of more responses, but not enough to make the attack infeasible. By repeating
requests and averaging the sizes of the corresponding responses, the attacker can
quickly learn the true length of the cipher text. This essentially boils down to the
fact that the standard error of the mean in this case is inversely proportional to
√N, where N is the number of repeat requests the attacker makes for each guess.
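To put rough numbers on the quoted argument, here is a minimal sketch. It assumes the padding length is drawn uniformly from the integers 0 to 50 (the scheme proposed above, not anything the paper specifies), and the function names are made up for illustration.

```python
import math

def padding_std(max_pad: int) -> float:
    """Standard deviation of a padding length drawn uniformly from 0..max_pad."""
    n = max_pad + 1  # number of equally likely padding lengths
    return math.sqrt((n * n - 1) / 12)  # discrete uniform variance is (n^2 - 1) / 12

def standard_error(max_pad: int, n_requests: int) -> float:
    """Standard error of the mean response length after n_requests repeats."""
    return padding_std(max_pad) / math.sqrt(n_requests)

def requests_for_target(max_pad: int, target_se: float) -> int:
    """Repeat requests needed to push the standard error down to target_se bytes."""
    return math.ceil((padding_std(max_pad) / target_se) ** 2)

if __name__ == "__main__":
    for n in (1, 100, 400, 10_000):
        print(n, round(standard_error(50, n), 3))
```

Quadrupling the number of repeat requests halves the noise, so under these assumptions the padding multiplies the attacker's work by a constant factor per guess rather than making the length unrecoverable.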
I think it is still discoverable because adding random data leads to the following two distributions:
Attacker gets secret wrong, page is size: original page + (zero to fifty) + length of incorrect secret
Attacker gets secret right, page is size: original page + (zero to fifty)
With a sufficiently high number of observations, the mean page size for the right guess comes out measurably lower than the mean page size for a wrong guess.
At least that's what it appears to me. I could be wrong.
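A quick simulation of those two distributions seems to bear this out. This is only a sketch: the 1000-byte base page, the 1-byte penalty for a wrong guess, and the function names are all assumptions for illustration, and real compressed lengths will be messier.

```python
import random

def observed_length(rng, base_len, max_pad, guess_is_right, miss_penalty):
    """One padded response length; a wrong guess costs miss_penalty extra bytes."""
    pad = rng.randint(0, max_pad)  # random padding, 0..max_pad inclusive
    return base_len + pad + (0 if guess_is_right else miss_penalty)

def mean_length(rng, n_requests, guess_is_right,
                base_len=1000, max_pad=50, miss_penalty=1):
    """Average observed length over n_requests repeats of the same guess."""
    total = sum(
        observed_length(rng, base_len, max_pad, guess_is_right, miss_penalty)
        for _ in range(n_requests)
    )
    return total / n_requests

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for n in (10, 1_000, 50_000):
        right = mean_length(rng, n, True)
        wrong = mean_length(rng, n, False)
        print(n, round(right, 2), round(wrong, 2))
```

With only a handful of requests the two means can land in either order, but with tens of thousands the 1-byte gap should stand well clear of the noise.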