Ok, so this article is at least contributing to the good side of the discourse about undefined behavior, because it isn't actually delusional about reality. (I mean this in the sense of people being deluded/having the wrong notions about actual facts, by the way, rather than actual craziness. The people who are deluded about undefined behavior tend to say things like "C is definitely a portable assembler" or "compilers are trying to trick me" which are definitely false. A non-deluded person can be upset about the current state but understanding how C currently works is important if you want to discuss it.)
Anyways, to the actual content of the article, I agree with it but I think the frustration it accepts as reasonable is actually misguided. Here is my understanding of how things ended up for C (disclaimer: I was not born when most of this stuff happened.)
In the beginning, you had a C compiler for your computer, and it was basically just an assembler. This is what the "make C great again" people think the language really is under the hood, by the way. However, very quickly people realized that they wanted their C code to run elsewhere, and every computer does things differently, so they needed some sort of standard for approximately the lowest common denominator of most machines, and that became the C standard's definition of what is legal to do. The guarantee created at that point was that if you conform to the standard, every implementation of C has to run your program as the standard specifies. This was palatable to people because they had a bunch of machines with weird byte orderings or whatever, and it was obvious what would happen when the dumb compilers of the time translated their platform-specific code to a new architecture.
Later, the weird architectures started becoming rare. At the same time, though, a new architecture started growing: a virtual architecture, one where the compiler would actually "port" your code to the exact same processor you were compiling for before, but the code would run faster. It would do this by taking latitude through intermediate transformations, which it assumed it could do because your program should have been portable to the abstract machine.
Now, this completely weirded people out, because "I'm compiling to a new virtual architecture called 'x86-64 -O3' that is the same as 'x86-64 -O0' but faster and more restrictive" sounds really stupid. It's the same architecture, and they're not even real processors! But if you really think about it, compilers are really just taking advantage of the fact that your code is portable, because it works in the space of the C abstract machine, to do a "port" called "run a bunch of optimization passes". People understand when a port ends up causing a trap on another processor, because of course it does that on the new platform. But getting people to understand that their unaligned accesses on the "-O0" machine are no longer valid on the "-O3" machine is hard, because, again, the instructions that come out look awfully similar and straightforward most of the time, except for the weird times when a change "surprises" you because the transition between the two crossed through an invalid space. Kind of like a path that seems to have a "weird jump" because it normally crosses through 3D space and at some point someone found a shortcut through the fourth dimension.
Anyways, the performance virtual architecture is all well and good, but what I think will be interesting moving forward is the security virtual architecture, where overflows and out-of-bounds accesses and type confusions are focused on more. Right now, as a side effect of performance optimization, they end up causing headaches for people, but Valgrind/sanitizers are an interesting look into what compiling to an "x86-64 for security testing" architecture looks like. The logical next step is even more exciting, because we're actually starting to deploy real architectures with security-focused features that will require ports that are every bit as concrete as any other physical architecture difference, which I think will "legitimize" this mindset to the people who I called deluded at the start of this now very rambly comment. Page protections mean that "const" is not something you can ignore. Pointer signing can mean that your "but they're the same bits underneath!" type confusions are no longer valid. ARM's Morello now means you can't play fast and loose with your pointers anymore; they're 128+ bits and you can't just forge one out of an integer anymore without caring. Ports to these architectures absolutely rely on the existence of a C abstract machine, which has served pretty well considering that its existence is really just what a piece of paper says is legal or not, rather than something really planned beforehand.
Thank you for taking the time to explain this, I've seen numerous threads where people complain about C but lack enough understanding of it to know where to properly attribute the blame.
C shares one common trait with assembly: the ability to access, interpret, and modify memory freely. It is also an intensely manual language and expects the programmer to be both discriminating and thorough when dealing with unexpected values. Beyond that, they should not be compared.
The language is not forgiving and as such has earned resentment from programmers who have had the benefit of using other languages that lessen the burden on the developer. Truthfully, all programming languages have an area where they excel (even ones we don't enjoy), and oftentimes the approach or requirements of a project determine the language that should be used. Many complain about C generally as being inadequate but fail to provide the context in which the language is employed, in which case it would be obvious that they should use another language.
I would also like to add that scale (as in LoC or project complexity) is an important factor to consider when selecting an appropriate language. C can be much easier to manage for smaller executables/libraries or projects that don't have many layers of abstraction. At the same time, modern software and application complexity has grown significantly since C's inception, and C is often not the ideal solution.
Discussions involving languages are enjoyable here when there is deliberation but I loathe when they devolve into tribalistic posturing and whinging.
> C shares one common trait with assembly: the ability to access, interpret, and modify memory freely
In reality that is a trap; C makes you think that you might be able to poke memory in all sorts of ways, but in reality there are a lot of subtle restrictions around memory access. The whole discussion around pointer provenance is the tip of the iceberg here.
And that is pretty much at the core of the whole UB hullabaloo: the difference between what C seems to be (or to have been) and what the standard says.
To see what cross-platform C looked like in those early days, there's nothing like reaching for books like "A Book on C" from 1984 (Robert Edward Berry and B. A. E. Meekings), which has an implementation of RatC [0].
I fully agree with this. For example, that assigning a freed pointer in C is UB is not because of optimization, but because there were real-world architectures with memory segmentation where loading such a pointer caused a run-time trap (e.g. 286 protected mode). Or that reading an uninitialized automatic variable is UB because it could cause a trap on architectures that could detect this (e.g. IA-64). There were also C versions with bounds checking etc. That compilers which are popular today focus on exploiting UB for optimization instead of security is an implementation choice, not a fundamental problem of the language itself.
I like this description. It's a useful mental model.
But in summary, doesn't that just move the target from "The compiler is stupid, it shouldn't be doing this, it clearly should know what I mean and I didn't mean that!" to "This is a bad lowest common denominator; if any architecture really needs this guarantee or for things to behave like this, then it should pay a performance penalty. We shouldn't all have to pay the portability price for this one thing that isn't an issue anywhere else."
And to be honest, most of the UB hate I see is about the latter, not the former, no?
Most of the UB hate is that bugs that always existed only recently became exposed. "This code worked fine for years why is the compiler breaking it!" is always the rant, but it's misplaced. It should instead be "why didn't I get a sanitizer/linters/debug-whatever error first?"
The proliferation of optimization passes outpaced decent debuggability, and that's really the problem. Rants against UB are nearly always irrelevant or even just outright wrong. And worse still, those crusaders are harmful. You can see this in Rust as a perfect example. Signed integer overflow is defined as two's complement wrapping, much rejoicing from the "UB always bad!" crowd. Except wait a minute, in a debug build of Rust it's defined to be a panic. Why? Because signed integer overflow is 99.999% of the time a bug, and defining how it overflows doesn't actually help anyone. So instead you're left with the worst of both worlds - you both can't rely on how signed ints behave in Rust as a programmer because they have 2 extremely incompatible defined behaviors, and the optimizer/runtime then can't take advantage of them being undefined behavior in practice in release builds to optimize better.
It boils down to a culture problem, while communities in safer systems programming languages embrace having a panic on signed integer overflow, in the C languages world suggesting the use of -ftrapv (or similar) will make them reach out for the pitchforks.
The linters and compiler security flags are there, the problem is getting them adopted.
Of course, a panic is also a failure. It may be a less serious one, or it may just make your rocket explode on take-off and kill anything down-range, while the result of the incorrect computation would otherwise have been irrelevant.
One of the difficulties I've had with the 'safer systems programming languages' advocacy is that since something going wrong is inherent and unavoidable -- since the flaw is ultimately in the user's code -- there is a tendency to pretend that the panic isn't something going wrong. In my experience this has resulted in measurably lower quality code from these communities, code which panics in slightly unexpected conditions -- while something written in C would not (yet may fail in a worse way when it does fail).
I don't think I've yet managed to download and run anything written in rust where it doesn't panic within the first 15 minutes of usage-- except the rust compiler itself and firefox (though I do now frequently get firefox crashes that are rust panics).
It may well be that the increased runtime sensitivity to programmer errors in these languages inherently mean we should expect more runtime failures as previously benign mistakes are exposed, and ought to accept that software written in these languages may be less reliable on aggregate because when it does fail its less likely to create security problems and that this is a worthwhile tradeoff. (Python users sure seem to survive a near constant rate of surprising runtime failures...)
But to the extent and so long as language advocates pretend that panics aren't failures they can't really advocate for the trade-off, advance better static analysis to reduce the gap, and will continue to seem fundamentally dishonest to people who try to use the languages and software written in them and experience the frequent panics first hand.
The difference between Rust and Java is that Rust developers decided that panics shouldn't be recoverable except as an afterthought for C compatibility.
In principle most panics are amenable to retrying the operation, which would be the equivalent of catching exceptions in Java. So yes, you get an "error has occurred" warning, but your program doesn't terminate immediately. I don't think C has an edge over Java here.
How common is it for java code to handle exceptions in useful ways rather than just fail in even more inexplicable ways due to no one ever having conceived of much less tested those code paths being executed?
A panic makes an error situation visible; the C way can let an error situation go unnoticed for longer than expected, corrupting data in more unrecoverable ways than just crashing right there on the spot.
A bit like having warnings as errors, or deciding to ignore warnings at the peril of what might come later, without the feedback of what those warnings were all about.
Yes, but visible at runtime. Depending on the situation you may well prefer* the silent failure. Many such silent failures are completely benign, e.g. the result of the wrong code (or whatever it corrupted) wasn't subsequently used.
*would prefer if you actually got to pick. But you don't get to pick because once you know of the bug you fix it either way.
Warnings as errors isn't a great example, because if you do it in code distributed to third parties it's an absolute disaster, as the warnings are not stable and there are constantly shifting false positives. It's perhaps not a good example even without distributing it, because it can lead to hasty "make it compile" 'fixes' that can introduce serious (and inherently warning-undetectable) bugs. It's arguably better to have warnings warn until you have the time to look at them and handle them seriously, so long as they don't get missed.
The parallel doesn't carry through to undefined behavior because the undefined behavior isn't logging a warning that you could check out later (e.g. before cutting a release).
However, culture results in artefacts. You mostly won't find American football stadiums in England's cities, because the game isn't part of their culture. If the English suddenly took to it, such stadiums would likely still take several decades to become widespread.
C libraries like OpenSSL reflect what's culturally appropriate in that language, so even if you came to C from a language with a different culture, too bad it has the culturally appropriate API design and behaviour.
I think that OpenSSL has historically reflected a rather antiquated C culture that most software moved on from long ago, FWIW.
A clear example of this is OpenSSL intentionally mixing uninitialized memory into its randomness pool (because on some obscure and long forgotten platforms it was the only way they had to get any 'randomness'), resulting in any programs written using it absolutely spewing valgrind errors all over the place. (Unless your openssl has been compiled with -DPURIFY to skip that behavior, or had the debian "fix" of bypassing the rng almost completely :P ).
I think the OpenSSL situation you're talking about arises because of a mistake by a maintainer.
MD_Update(&m,buf,j);
Kurt Roeckx found this line twice in OpenSSL. Valgrind moaned about this code and Kurt proposed removing it. Nobody objected, so in Debian Kurt removed the two lines.
One of these occasions is, as you described, mixing uninitialized (in practice likely zero) bytes into a pool of other data, and removing it does indeed silence the Valgrind error and fixes the problem. The other, however, is actually how real random numbers get fed into OpenSSL's "entropy pool"; by removing it there is no entropy, and the result was the "Debian keys" - predictable keys "randomly" generated by affected OpenSSL builds.
I haven't seen OpenSSL people claim that the first, erroneous, call was somehow supposed to make OpenSSL produce random bits on some hypothetical platform where the contents of uninitialised memory doesn't start as zero, it looks more like ordinary C programmer laziness to me.
The odd thing with that incident is that the "PURIFY" define long predated it-- the correct fix in debian should have been "Just compile with DPURIFY"-- I believe redhat was already doing so at the time.
> I haven't seen OpenSSL people claim that the first, erroneous, call was somehow supposed to make OpenSSL produce random bits on some hypothetical platform where the contents of uninitialised memory doesn't start as zero
I had an openssl dev explain (in person) to me, when I complained about the default behavior, that there had been platforms that depended on that behavior, that they weren't sure which ones did, and so it didn't seem safe to eliminate it. (I'd complained because I couldn't have users with non -DPURIFY openssl code run valgrind as part of troubleshooting). IIRC the use of uninitialized memory was intentional and remarked on in comments in the code.
- If the "uninitialized" data is actually somehow some kind of interference.
- In LLVM, using an "undef" value will not always do the same thing each time; however, the "freeze" instruction can be used to avoid that problem. (I don't know if this feature of LLVM can be accessed from C code, or how similar things work in GCC.)
- If the code seems unusual, then you should write comments to explain why it is written in the way that it is. (You can then also know what considerations to make if you want to remove it.)
- Whether or not there is uninitialized data, you will need to mix in proper entropy too, from other sources of real entropy.
>So instead you're left with the worst of both worlds - you both can't rely on how signed ints behave in Rust as a programmer because they have 2 extremely incompatible defined behaviors, and the optimizer/runtime then can't take advantage of them being undefined behavior in practice in release builds to optimize better.
I don't understand how this is the worst of both worlds.
You can explicitly define overflow behavior in Rust. There are wrapper types and explicit checked or saturating and wrapping operations if those are necessary for the correctness of your program. If your program doesn't rely on them then checked overflow being the default in debug builds is the way to go and given enough confidence in the final product it makes sense to drop them in release builds and given enough processor advancements we can also do checked overflow in release builds.
I can do that in C/C++, where regular signed integer overflow is otherwise undefined behavior. The point is just that Rust defining the behavior (no UB!) didn't do a damn thing to help anyone, since if you actually want and expect overflow you need to use specific functions/wrappers to do that anyway. You also probably want the carry flag anyway, so having regular addition be "defined behavior" is still useless.
Yep, exactly this. There's a handful of undefined behavior that might actually be worth reconsidering, but almost all UB that people want turned into defined behavior are "yeah we had a bug let's make it do something about as bad but call it defined".