
No. From https://en.cppreference.com/w/cpp/language/union:

The union is only as big as necessary to hold its largest data member. The other data members are allocated in the same bytes as part of that largest member. The details of that allocation are implementation-defined but all non-static data members will have the same address (since C++14). It's undefined behavior to read from the member of the union that wasn't most recently written. Many compilers implement, as a non-standard language extension, the ability to read inactive members of a union.

What 6.5.2.3 simplifies is the use of unions of the type:

struct A { int type; DataA a; };

struct B { int type; DataB b; };

union U { A a; B b; };

U u;

switch (u.a.type) ...

It's not what is being used here.
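For readers who want the tagged-union pattern above in compilable C, here is a minimal sketch. The payload types DataA and DataB are left undefined in the comment, so the definitions below are hypothetical, as are the TYPE_A/TYPE_B constants and the helper names tag_of and make_b.

```c
#include <assert.h>

/* Hypothetical payloads: the comment above does not define DataA or DataB. */
typedef struct { int x; } DataA;
typedef struct { double y; } DataB;

struct A { int type; DataA a; };
struct B { int type; DataB b; };

/* Both members begin with `int type`, so they share a common initial sequence. */
union U { struct A a; struct B b; };

/* Hypothetical tag values used to tell the two layouts apart. */
enum { TYPE_A = 1, TYPE_B = 2 };

/* Reads the shared leading field through member `a`, regardless of which
   member was stored last. This is the kind of access that C11 6.5.2.3
   describes for structures with a common initial sequence. */
static int tag_of(const union U *u) {
    return u->a.type;
}

/* Builds a union whose active member is `b`. */
static union U make_b(double y) {
    union U u;
    u.b.type = TYPE_B;
    u.b.b.y = y;
    return u;
}
```

Calling tag_of on the result of make_b(2.5) inspects the tag through u.a even though u.b was written, which is what a switch on the type field would rely on.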

std::variant is designed to deprecate all legitimate uses of union.



The post is about C, not C++. My comment stands, as the original post has two structs in a union, and they start the same way, so it’s exactly the case covered in the C11 Standard.


It's actually weirder than that. The C standard allows type punning through unions, but not because of the clause you mentioned. It allows it because of footnote 95:

> If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’)

This is broader than the common initial subsequence clause, and allows punning between completely different types, e.g. int, char[4], and float.
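A small sketch of that kind of punning in C, where every access goes directly through the union. The type name Pun and the helpers float_bits and float_from_bits are made up for illustration, and the code assumes float is 32 bits wide (checked at compile time).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: float and uint32_t have the same size on this platform. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Three unrelated types overlapping the same storage. */
union Pun {
    float f;
    uint32_t u;
    unsigned char bytes[4];
};

/* Stores a float, then reads the same bytes back as an unsigned integer. */
static uint32_t float_bits(float f) {
    union Pun p;
    p.f = f;
    return p.u;
}

/* The reverse direction: stores the integer, reads it back as a float. */
static float float_from_bits(uint32_t bits) {
    union Pun p;
    p.u = bits;
    return p.f;
}
```

On an implementation that uses IEEE 754 single precision for float, float_bits(1.0f) yields 0x3F800000. This is C only; the C++ rules quoted earlier in the thread are different.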

You might ask, what is the point of the "common initial subsequence" rule then? It's to allow certain accesses that don't go directly through the union, so the compiler doesn't know for sure whether there's a union involved. Only problem is that all major compilers completely ignore this rule. [1] (But they do implement the first clause I mentioned, where the accesses do go through the union.)

[1] https://stackoverflow.com/questions/34616086/union-punning-s...


Your response to GP is based on the C++ reference, and his is explicitly based on the C standard. Your assertion that ‘[t]he details of that allocation are implementation-defined but all non-static data members will have the same address (since C++14)’ seems to directly conflict with the C11 standard. Also, your closing comment about std::variant is clearly only applicable to C++. I am just curious why you are using C++ when the article and GP are specifically addressing C?


You've mentioned this several times on this page, but this is still incorrect.

The C standard references "struct or union" all over the place because the two are so similar. The distinction is of course made clear in multiple places, but one that seems relevant here is:

> As discussed in 6.2.5, a structure is a type consisting of a sequence of members, whose storage is allocated in an ordered sequence, and a union is a type consisting of a sequence of members whose storage overlap. (ISO/IEC 9899:201x, §6.7.2.1, #6)

That's it. There's nothing about undefined behavior if you access one member and then another later. In fact there's even a paragraph which mentions doing just that:

> The size of a union is sufficient to contain the largest of its members. The value of at most one of the members can be stored in a union object at any time. A pointer to a union object, suitably converted, points to each of its members (or if a member is a bitfield, then to the unit in which it resides), and vice versa. (ISO/IEC 9899:201x, §6.7.2.1, #16)

A pointer to the union points to each of its members, and can be dereferenced to access it.
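As a rough illustration of the two quoted paragraphs, the sketch below checks that a converted pointer to a union compares equal to a pointer to each member, and reads the active member through such a pointer. The union V and the helper names are hypothetical.

```c
#include <assert.h>

/* Hypothetical union with members of different sizes. */
union V { char c; int i; double d; };

/* Returns 1 if the union and every one of its members share one address. */
static int members_share_address(union V *v) {
    return (void *)v == (void *)&v->c
        && (void *)v == (void *)&v->i
        && (void *)v == (void *)&v->d;
}

/* Reads member `i` through a converted pointer to the union itself.
   The caller is expected to have stored `i` last. */
static int read_int_via_union_pointer(union V *v) {
    return *(int *)v;
}

/* Returns 1 if the union is at least as large as its largest member. */
static int size_covers_largest(void) {
    return sizeof(union V) >= sizeof(double);
}
```

After v.i = 7, read_int_via_union_pointer(&v) returns 7, since the pointer to the union, suitably converted, points at i.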

std::variant is not used in C; C and C++ are two different languages.


> The size of a union is sufficient to contain the largest of its members.

Correct me if I'm wrong, but there is no part of the C spec that says this:

When initializing a union member that is smaller than the largest member, the remaining bytes will always automatically be initialized to zero.

If I'm right then the following caveat must be added to your statement:

> A pointer to the union points to each of its members, and can be dereferenced to access it.

... if and only if the member which was originally initialized is at least as large as the other member being accessed.

In other words, if you write your program in a way that ensures it will only compile when all union members are exactly the same size, and you have mandatory tooling to make sure that any changes to said union follow the same rule by force of compilation errors, then and only then can you claim what you claimed without the threat of undefined behavior.
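One way to get that compile-time guarantee in C11 is _Static_assert. The sketch below uses a hypothetical union Word and helper byte_sum; it refuses to compile unless both members have the same size. For context, C11 6.2.6.1 paragraph 7 says that bytes not covered by the stored member take unspecified values, which is the situation the equal-size check sidesteps.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical union whose members are meant to be the same size. */
union Word {
    uint32_t u;
    unsigned char bytes[4];
};

/* Compilation fails if someone changes one member without the other. */
_Static_assert(sizeof(uint32_t) == sizeof(unsigned char[4]),
               "union Word members must be the same size");
_Static_assert(sizeof(union Word) == sizeof(uint32_t),
               "union Word must not be larger than its members");

/* Writes the integer member, then reads every byte of it back.
   All four bytes were written by the store to `u`, so none is left over.
   The sum does not depend on byte order. */
static unsigned byte_sum(uint32_t v) {
    union Word w;
    w.u = v;
    return (unsigned)w.bytes[0] + w.bytes[1] + w.bytes[2] + w.bytes[3];
}
```

For example, byte_sum(0x01020304u) is 10 on both little-endian and big-endian machines, because each byte is read exactly once.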



