%n, also known as a terrible misfeature. It’s useful in Xscanf because it lets you get the number of characters read, but it’s useless and outright dangerous on the Xprintf functions - Xprintf already returns the number of characters written, and allowing %n in printf has enabled a whole class of terrible printf format string bugs that had no reason to exist.
Of course, most of the time the only reason why you have a %n in your format string is because you passed arbitrary, unsanitized user input to it. If you're doing something like that, scanf isn't going to work, either, because any attacker can just scribble all over your stack. And it's not totally useless: it can be used for aligning things because it writes the number of characters printed into a variable.
macOS, for one, has changed printf to raise SIGILL when it is passed a dynamic format string that contains %n, which mitigates most of the issues present with format string attacks.
> And it's not totally useless: it can be used for aligning things because it writes the number of characters printed into a variable.
If you're going to align anything, you will (probably) need to write some code to compute the right alignment. At that point, you will need to split your printf into pieces, so why not just use printf's return code, which tells you the number of characters written?
glibc on Linux started hardening printf by replacing calls to printf with calls to __printf_chk, which will abort if the format string contains %n and sits in a writable segment of memory. The implementation is somewhat horrific - printf opens /proc/self/maps and range-checks the format string's address, but it works and does provide the necessary security.
It's much less common to pass unsanitized input to scanf than to printf, in my experience, because the most common error case of printf(buf) (instead of printf("%s", buf)) is just not something that you can do with scanf.
----
By the way, which version of macOS did they introduce the SIGILL hardening, and is it documented? I'm on 10.12.6, and the following program:
#include <stdio.h>
#include <string.h>
int main() {
    char buf[] = "Hi %d Bye\n";
    strcpy(buf, "Hi %n Bye\n");
    int foo = 424242;
    printf(buf, &foo);
    printf("foo was set to %d\n", foo);
}
runs without complaint (Apple LLVM version 9.0.0 (clang-900.0.39.2)).
> If you're going to align anything, you will (probably) need to write some code to compute the right alignment. At that point, you will need to split your printf into pieces, so why not just use printf's return code, which tells you the number of characters written?
The child comment provides a use case for %n, so I won't go into this here.
> glibc on Linux started hardening printf by replacing calls to printf with calls to __printf_chk
Only with -D_FORTIFY_SOURCE=2 set when compiling. Swapping printf calls with __printf_chk is off by default.
> The implementation is somewhat horrific - printf opens /proc/self/maps and range-checks the format string's address, but it works and does provide the necessary security
> By the way, which version of macOS did they introduce the SIGILL hardening, and is it documented? I'm on 10.12.6
You're off by a version; this was introduced in 10.13. Your program crashes nicely with Apple LLVM version 9.1.0 (clang-902.0.39.2) on macOS High Sierra 10.13.5 (17F70a). Documented? Ha, I wish. It just ended up breaking a couple of programs that relied on this behavior, since it would just call os_crash with "%n used in a non-immutable format string".
> At that point, you will need to split your printf into pieces, so why not just use printf's return code, which tells you the number of characters written?
Suppose you are aligning subsequent output lines to the current line, and want to know the number of characters at certain points in the string. In fact this is probably the exact use-case for %.* and %n used together.
Reminds me of the time I realized that the indie multiplayer game I was playing would print gibberish whenever you typed a "%" in the chat, and therefore it was probably just doing an unsanitized sprintf somewhere. Then I realized that putting a %n in there would mess things up and, yup, crash the server (not just the client). It would have taken a lot more skill than I had to actually exploit that to do more than just randomly crash servers.
Yeah, pulling off a successful format string attack generally requires knowledge of where the stack is or having a variable pre-initialized to the address of something on it; then you'll need the address of something useful to jump to. If you have a copy of the non-PIE/ASLR executable, this is easy; otherwise other methods are needed.
Actually, if you (the attacker) receive the printf output, then you can pretty trivially leak almost any memory you want with %N$x, allowing you to first leak the entire stack, then pivot to leak any referenced memory regions (which will almost certainly include the executable and many libraries due to return addresses on the stack). Worse, you can then use %N$n to write to any pointer in memory - if you can find your input string in memory (dereference stuff until you find the heap, for example), you can stuff pointers into it and get a write-what-where primitive (which is game over).
This is part of what makes format string vulnerabilities so nasty - they hand you a free leak in addition to %n being capable of very flexible overwrites.
> you can then use %N$n to write to any pointer in memory - if you can find your input string in memory (dereference stuff until you find the heap, for example), you can stuff pointers into it
You first have to get a valid pointer to something, which can be pretty hard when the addresses are randomized. Yes: if you can satisfy all the requirements for a printf attack, there's a whole lot you can do: overflow the stack and write to the return address, find the address of system, perform arbitrary reads or writes, etc. But as you've mentioned yourself, a successful format string exploit requires more than just "someone passed a user-controlled string to printf", at least as far as I'm aware.
> %% - complete form
Wait, what? %n writes the current number of characters outputted to the int * you specify, while %% just prints a percent sign.