
Yeah, I agree. I used to write C full time, but haven't in around 6 years, and I only flubbed #11 & #12 (I knew there was undefined behavior but couldn't remember why; after reading the answers I was like "duh", esp for #12 after having read #11).

I've never actually run into #2 in practice, though: even at -O3 the dereference in line 1 has always crashed for me, though I guess probably because I've never written code for an OS where an address of 0 is valid and doesn't cause a SIGSEGV or similar.

What's the best way to "fix" strict aliasing without disabling the undefined behavior around it? Using a union?



I think the catch with #2 is that we probably run into it more in the opposite way - an unnecessary NULL check pulled in by, say, inlining a function or expanding a macro gets removed. On that note though, even in OS code address 0 is usually set up to cause some type of fault to catch errors. I think the issue happens when the compiler removes the dereference or moves it somewhere else - though obviously the cases where this is legal are limited. But in the posted code, for example, the `y` variable is unused and thus could be removed entirely, which would remove the NULL dereference as well, while the NULL check still gets removed.
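To make the pattern concrete, here's a minimal sketch. The quiz code isn't reproduced in this thread, so both function names are made up for illustration; the only point is the ordering of the dereference and the check.

```c
#include <stddef.h>

/* Hypothetical illustration, not the quiz's code.
 * The dereference comes first, so the compiler may assume p != NULL
 * and treat the check below as dead code. */
int deref_then_check(int *p) {
    int y = *p;        /* undefined behavior if p is NULL */
    if (p == NULL)     /* may be optimized away */
        return -1;
    return y;
}

/* Checking before dereferencing gives the compiler nothing to assume. */
int check_then_deref(int *p) {
    if (p == NULL)
        return -1;
    return *p;
}
```

Only the second form is safe to call with a NULL pointer; what the first one does in that case depends on the compiler, the optimization level, and whether address 0 is mapped.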

> What's the best way to "fix" strict aliasing without disabling the undefined behavior around it? Using a union?

I was just talking about `-fno-strict-aliasing`, which is a flag for `gcc` (and `clang`, I assume), but it does remove the aliasing UB like you're saying, by simply allowing all pointers to alias.

The other options are unions like you're thinking (though that's also technically UB, since writing to one union member and reading from another is undefined, though most compilers allow it without incident), or extensions like `gcc`'s `may_alias` attribute. `may_alias` is really the cleanest way to do it, but for something like OS code the aliasing happens in such strange places that just disabling strict aliasing completely tends to be the way to go.
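For reference, a rough sketch of what the `may_alias` route can look like. This is a GCC/Clang extension, not standard C. The typedef name `u32_alias` and the helper `float_bits` are made up for the example, and it assumes a 32-bit `float` with the same alignment as `uint32_t`.

```c
#include <stdint.h>

/* Assumption of this sketch: float is 32 bits wide. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

/* GCC/Clang extension: accesses through this type are allowed to alias
 * objects of any other type, much like accesses through a character type. */
typedef uint32_t __attribute__((__may_alias__)) u32_alias;

/* Hypothetical helper: read a float's object representation as 32 bits. */
uint32_t float_bits(const float *f) {
    return *(const u32_alias *)f;
}
```

The attribute only addresses the aliasing side. What the bits mean still depends on the platform's floating-point format.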


> What's the best way to "fix" strict aliasing without disabling the undefined behavior around it? Using a union?

I had this discussion with another C++ programmer and we came to the conclusion that, if you care to avoid that particular UB, any time you cast pointers between unrelated or basic types and you're going to write to one pointer and read from the other, you need to go through a union, as annoying as it is.


Not just a union, but the union definition needs to be in scope _and_ used in such a way that the compiler can see the possibility of the relationship between the two objects.
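A minimal sketch of that "in scope and used" shape, where both the store and the load go through the union object itself. The helper name `double_bits` is hypothetical, and the sketch assumes a 64-bit `double`. As far as I know, C99 TC3 and C11 describe this read as reinterpreting the stored bytes (possibly yielding a trap representation) and GCC documents it as supported even with `-fstrict-aliasing`, while C++ does not allow it.

```c
#include <stdint.h>

/* Assumption of this sketch: double is 64 bits wide. */
_Static_assert(sizeof(double) == sizeof(uint64_t), "sketch assumes a 64-bit double");

/* Hypothetical helper: the union type is visible at the point of access,
 * and both accesses name the union's members directly. */
uint64_t double_bits(double d) {
    union {
        double   d;
        uint64_t u;
    } pun;
    pun.d = d;       /* write one member ... */
    return pun.u;    /* ... read the other: the bytes are reinterpreted */
}
```

Taking pointers to the two members and handing them to code that can't see the union is the case that tends to go wrong.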

But a union doesn't magically make type-punning correct. This code is not correct:

  union {
    int d;
    long long lld;
  } u;

  u.d = 1;
  printf("%lld\n", u.lld);
  u.lld = 0;
  printf("%lld\n", u.lld);

The union ensures that the compiler doesn't move "u.lld = 0" above the first print statement, but usually writing as one type and reading as another is undefined behavior no matter how you accomplish it. That's because the representations can be different, and one or the other might have invalid representations. The biggest exception is reading through a char pointer; reading representation bits through a char pointer is guaranteed to always be okay.
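A small sketch of the char-pointer exception: inspecting an object's bytes through `unsigned char`, whatever the object's declared type. The helper name `count_nonzero_bytes` is made up for the example.

```c
#include <stddef.h>

/* Hypothetical helper: count the nonzero bytes in an object's
 * representation by reading it through unsigned char. */
size_t count_nonzero_bytes(const void *obj, size_t size) {
    const unsigned char *bytes = obj;
    size_t n = 0;
    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != 0)
            n++;
    }
    return n;
}
```

Calling it as `count_nonzero_bytes(&x, sizeof x)` works for any object `x`. Which bytes turn out to be nonzero still depends on the platform's representation, of course.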

Aliasing and type punning are two different issues that are only tangentially related in terms of language semantics. But the issues do often coincide, especially in poorly written code.

You can also put the compiler on notice not to apply the strict aliasing rule by using simple type coercion (implicit or explicit) in the relevant statements. What matters is that we put the compiler on notice that two objects of [seemingly] different types are related and thus have an ordering relationship, and the standard provides a few ways to do that.

For example, this code is wrong:

  struct foo {
    int i;
  };

  struct bar {
    int i;
  };

  void baz(struct foo *foo, struct bar *bar) {
    foo->i = 0;
    bar->i++;
  }

  struct foo foo;
  baz(&foo, (struct bar *)&foo);

whereas all of

  void baz(struct foo *foo, struct bar *bar) {
    foo->i = 0;
    (((struct foo *)bar)->i)++;
  }

and

  void baz(struct foo *foo, struct bar *bar) {
    union {
      struct foo foo;
      struct bar bar;
    } *foo_u = (void *)foo, *bar_u = (void *)bar;
    foo_u->foo.i = 0;
    bar_u->bar.i++;
  }

and

  void baz(struct foo *foo, struct bar *bar) {
    *(int *)foo = 0;
    (*(int *)bar)++;
  }

are correct. This should be correct, too, I think:

  void baz(struct foo *foo, struct bar *bar) {
    *(int *)&foo->i = 0;
    (*(int *)&bar->i)++;
  }

and is also a weird case where the superfluous cast is necessary.

The purpose in all 4 cases is to make it evident vis-à-vis C's typing system that two objects might alias each other, and they do that by using constructs that put those objects into the same universe of alias-able types.
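Here's a self-contained version of the `int *` variant that can be compiled and poked at. The driver `run_baz` and the starting value 41 are made up for the example; the rest follows the snippets above.

```c
struct foo { int i; };
struct bar { int i; };

/* Both accesses use an int lvalue, so the compiler has to assume they
 * may refer to the same int object and keep them in order. */
void baz(struct foo *foo, struct bar *bar) {
    *(int *)foo = 0;
    (*(int *)bar)++;
}

/* Hypothetical driver: passes the same object through both parameters. */
int run_baz(void) {
    struct foo foo = { 41 };
    baz(&foo, (struct bar *)&foo);
    return foo.i;
}
```

Since the store of 0 has to stay ahead of the increment, `run_baz()` should return 1 at any optimization level.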

The conspicuous description of the union method in the C standard is more directed, I think, at compiler writers. It's not the only way to alias correctly (explicit casting to the basic type is enough), but often it's the most natural when dealing with polymorphic compound objects.

Compiler writers historically didn't always implement enough smarts in their compiler to be able to detect possible aliasing through unions, and that needed to be addressed by a more thorough specification of union behavior. That is, the standard needed to make it clear that a compiler was required to grok the relationship of two sub-objects (of the same basic type) that were derived from the same root union type.

Explicitly type-casting through a union just for aliasing is a little stilted, though, when you can achieve the same thing using a cast through a basic type. The union method is preferable, but only insofar as it's used to _avoid_ or to _minimize_ type coercion. And it'll never solve type punning issues.


> The union ensures that the compiler doesn't move "u.lld = 0" above the first print statement, but usually writing from one type and reading from another is undefined behavior no matter how you accomplish it.

I know, but the only reason aliasing becomes an issue is because someone is trying to cast between unrelated pointer types to perform cheap type conversions. Yes, even with the union the behavior is undefined, but if you know the platform you're targeting the program may be well-behaved.

As for your snippets, yes, casting pointers across function boundaries will work. The problem is when you don't want to introduce a call, which is where unions come in.



