Aliasing and character types
Haakon Riise #1 / 10
A question about C's aliasing rules: I don't have a copy of the ANSI C standard, but Google tells me that it says something like this:

  "An object shall have its stored value accessed only by an
  lvalue that has one of the following types:
  [irrelevant types deleted]
  o A character type."

This means that the following should be safe: (disregard endianness)

  int i = 0;
  *(char *) &i = 1;

Right? But how about the other way around: (assuming sizeof(int) == 4)

  char p[] = { 0, 0, 0, 0 };
  *(int *) p = 1;

If "*(int *) p = 1" counts as an int object with the value 1, accessing it using p[0 .. 3] should work as expected, but I won't feel safe until someone confirms it. :-)

Thanks in advance.

--
Haakon
Mon, 27 Jun 2005 07:04:49 GMT

Ben Pfaf #2 / 10
Quote:
> A question about C's aliasing rules: I don't have a copy of
> the ANSI C standard, but Google tells me that it says something
> like this:
> "An object shall have its stored value accessed only by an
> lvalue that has one of the following types:
> [irrelevant types deleted]
> o A character type."
> This means that the following should be safe: (disregard endianness)
> int i = 0;
> *(char *) &i = 1;

The code above is itself safe, in that it doesn't invoke undefined behavior, but subsequent accesses to int `i' as an int may not be safe if the assignment sets `i' to a trap representation.

Quote:
> Right? But how about the other way around: (assuming sizeof(int) == 4)
> char p[] = { 0, 0, 0, 0 };
> *(int *) p = 1;

That falls afoul of the rules you cited above, invoking undefined behavior. A character array cannot necessarily be treated as an int.

--
"Large amounts of money tend to quench any scruples I might be having."
-- Stephan Wilms
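
A minimal sketch of a portable alternative to `*(int *) p = 1`, assuming the goal is only to get the bytes of an int into a character array. Copying with memcpy sidesteps both the aliasing rule and any alignment requirement. The names store_int and nonzero_bytes are hypothetical and not from the thread, and unsigned char is used in place of plain char so that every byte can be read back safely.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy the object representation of `value`
   into a character buffer of at least sizeof(int) bytes. */
static void store_int(unsigned char *dst, int value)
{
    assert(dst != NULL);
    memcpy(dst, &value, sizeof value);
}

/* Count how many of the first n bytes of the buffer are non-zero. */
static size_t nonzero_bytes(const unsigned char *buf, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (buf[i] != 0)
            count++;
    return count;
}
```

After `store_int(p, 1)` on a buffer declared as `unsigned char p[sizeof(int)]`, the bytes can be inspected as p[0] through p[sizeof(int) - 1]; which of them is non-zero depends on endianness.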
Mon, 27 Jun 2005 07:06:58 GMT

Christian Ba #3 / 10
Quote:
> A question about C's aliasing rules: I don't have a copy of
> the ANSI C standard, but Google tells me that it says something
> like this:
> "An object shall have its stored value accessed only by an
> lvalue that has one of the following types:
> [irrelevant types deleted]
> o A character type."
> This means that the following should be safe: (disregard endianness)
> int i = 0;
> *(char *) &i = 1;
> Right? But how about the other way around: (assuming sizeof(int) == 4)

Ok so far. Of course you have to be careful what you do next. Accessing i is now undefined behavior (I think).

Quote:
> char p[] = { 0, 0, 0, 0 };
> *(int *) p = 1;
> If "*(int *) p = 1" counts as an int object with the value 1,
> accessing it using p[0 .. 3] should work as expected, but I won't
> feel safe until someone confirms it. :-)

Undefined behavior, because the address of array p need not be correctly aligned to be cast to an int*. Also, you are not accessing the char values in p either through their own data type (which is char) or through a character type (which is also char), but through int. Undefined behavior again.
Mon, 27 Jun 2005 07:32:16 GMT

E. Gibbo #4 / 10
Quote:
>> A question about C's aliasing rules: I don't have a copy of
>> the ANSI C standard, but Google tells me that it says something
>> like this:
>> "An object shall have its stored value accessed only by an
>> lvalue that has one of the following types:
>> [irrelevant types deleted]
>> o A character type."
>> This means that the following should be safe: (disregard endianness)
>> int i = 0;
>> *(char *) &i = 1;
>The code above is itself safe, in that it doesn't invoke
>undefined behavior, but subsequent accesses to int `i' as an int
>may not be safe if the assignment sets `i' to a trap
>representation.

In fact, the assignment itself is not safe. If it results in a trap representation being stored in the int, you have problems right there; you don't have to wait for any later accesses.

Quote:
>> Right? But how about the other way around: (assuming sizeof(int) == 4)
>> char p[] = { 0, 0, 0, 0 };
>> *(int *) p = 1;
>That falls afoul of the rules you cited above, invoking undefined
>behavior. A character array cannot necessarily be treated as an
>int.

However, you can store the value of an int into a char array, and later transfer it back to an int variable. If you modify the bits while in the char array, you might produce a trap representation for ints (unless you modify them to all-0 :)), but you have no problem until you try to copy the bits back into an int variable.

--Ben
--
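
The round trip described above can be sketched with memcpy. This is only an illustration under the assumption that the bytes are left unmodified while they sit in the array; bytes_to_int and round_trip are hypothetical names, not from the thread.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: rebuild an int from sizeof(int) bytes. */
static int bytes_to_int(const unsigned char *src)
{
    int value;
    assert(src != NULL);
    memcpy(&value, src, sizeof value);
    return value;
}

/* Copy an int into a byte array and back without touching the bytes. */
static int round_trip(int value)
{
    unsigned char bytes[sizeof(int)];
    memcpy(bytes, &value, sizeof value);   /* int -> char array */
    return bytes_to_int(bytes);            /* char array -> int */
}
```

Because the bytes are copied back unchanged, the result is the original value; the trap-representation concern only arises if the bytes are altered in between.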
Mon, 27 Jun 2005 08:35:44 GMT

Jack Klei #5 / 10
in comp.lang.c:

Quote:
> > A question about C's aliasing rules: I don't have a copy of
> > the ANSI C standard, but Google tells me that it says something
> > like this:
> > "An object shall have its stored value accessed only by an
> > lvalue that has one of the following types:
> > [irrelevant types deleted]
> > o A character type."
> > This means that the following should be safe: (disregard endianness)
> > int i = 0;
> > *(char *) &i = 1;
> The code above is itself safe, in that it doesn't invoke
> undefined behavior, but subsequent accesses to int `i' as an int
> may not be safe if the assignment sets `i' to a trap
> representation.

Actually that is not true according to C99, even though the OP's quotation, virtually unchanged in C99, implies that it is so.

Plain char may have trap representations on implementations where it is equivalent to signed char, because signed char may have padding bits and/or trap representations according to the C99 standard.

I have proposed on comp.std.c a few times that these references to "character type" (there are a few others) be changed to "unsigned character type", because that is the only type for which C99 makes this guarantee.

Quote:
> > Right? But how about the other way around: (assuming sizeof(int) == 4)
> > char p[] = { 0, 0, 0, 0 };
> > *(int *) p = 1;
> That falls afoul of the rules you cited above, invoking undefined
> behavior. A character array cannot necessarily be treated as an
> int.

Agreed.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
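
The point about unsigned char is why byte-level inspection is usually written against that type. A minimal sketch, assuming CHAR_BIT == 8 so that each byte prints as two hex digits; to_hex is a hypothetical helper, not something from the thread.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write the bytes of any object as lowercase hex.
   `out` must have room for at least 2 * size + 1 characters.
   Assumes CHAR_BIT == 8, so every byte fits in two hex digits. */
static void to_hex(const void *obj, size_t size, char *out)
{
    /* unsigned char has no padding bits and no trap representations */
    const unsigned char *bytes = obj;

    assert(obj != NULL && out != NULL);
    for (size_t i = 0; i < size; i++)
        sprintf(out + 2 * i, "%02x", (unsigned)bytes[i]);
    out[2 * size] = '\0';
}
```

Passing `&i` and `sizeof i` for an int `i` dumps its object representation; the byte order shown depends on the platform.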
Mon, 27 Jun 2005 10:45:15 GMT

Ben Pfaf #6 / 10
Quote:
> >> This means that the following should be safe: (disregard endianness)
> >> int i = 0;
> >> *(char *) &i = 1;
> >The code above is itself safe, in that it doesn't invoke
> >undefined behavior, but subsequent accesses to int `i' as an int
> >may not be safe if the assignment sets `i' to a trap
> >representation.
> In fact, the assignment itself is not safe. If it results in a trap
> representation being stored in the int, you have problems right there;
> you don't have to wait for any later accesses.

How so? Can you cite anything to back that up? My mental model of trap representations has always been that you have to read one of them in order for it to cause a problem.

--
"I don't have C&V for that handy, but I've got Dan Pop."
--E. Gibbons
Mon, 27 Jun 2005 12:31:00 GMT

Ben Pfaf #7 / 10
Quote:
> in comp.lang.c:
> > > "An object shall have its stored value accessed only by an
> > > lvalue that has one of the following types:
> > > [irrelevant types deleted]
> > > o A character type."
> > > This means that the following should be safe: (disregard endianness)
> > > int i = 0;
> > > *(char *) &i = 1;
> > The code above is itself safe, in that it doesn't invoke
> > undefined behavior, but subsequent accesses to int `i' as an int
> > may not be safe if the assignment sets `i' to a trap
> > representation.
> Actually that is not true according to C99, even though the OP's
> quotation, virtually unchanged, in C99, implies that it is so.
> Plain char may have trap representations on implementations where it
> is equivalent to signed char. Because signed char may have padding
> bits and/or trap representations according to the C99 standard.

That's not a problem in this case as far as I can tell, because we are storing a char, not reading out a char. Storing a value of 1 into a char cannot produce a trap representation, and as far as I know, although it yields undefined behavior to read out a trap representation, it is okay to store into an object that contains a trap representation.

--
"Some people *are* arrogant, and others read the FAQ."
--Chris Dollin
Mon, 27 Jun 2005 12:30:01 GMT

pete #8 / 10
Quote:
> (assuming sizeof(int) == 4)
> char p[] = { 0, 0, 0, 0 };
> *(int *) p = 1;
> If "*(int *) p = 1" counts as an int object with the value 1,
> accessing it using p[0 .. 3] should work as expected, but I won't
> feel safe until someone confirms it. :-)

It's not safe. Aside from traps, there's also an alignment issue. p may or may not be properly aligned for int.

--
pete
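
A rough way to see the alignment concern is to test an address against the alignment of int. This is only a sketch: aligned_for_int is a hypothetical name, `_Alignof` requires C11 (newer than this thread), and converting a pointer to uintptr_t is implementation-defined, so the check is a common idiom rather than something the standard guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: does `p` appear suitably aligned for an int?
   Relies on the implementation-defined pointer-to-integer conversion
   behaving like a flat address, which is an assumption. */
static int aligned_for_int(const void *p)
{
    assert(p != NULL);
    return (uintptr_t)p % _Alignof(int) == 0;
}
```

An int object always passes this check, while a plain char array only passes if the implementation happens to place it on a suitable boundary. Note that passing the check does not make `*(int *) p = 1` valid; the aliasing objection raised earlier in the thread still applies.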
Mon, 27 Jun 2005 18:57:28 GMT

Peter Nilsso #9 / 10
Quote:
> > in comp.lang.c:
> > > > "An object shall have its stored value accessed only by an
> > > > lvalue that has one of the following types:
> > > > [irrelevant types deleted]
> > > > o A character type."
> > > > This means that the following should be safe: (disregard endianness)
> > > > int i = 0;
> > > > *(char *) &i = 1;
> > > The code above is itself safe, in that it doesn't invoke
> > > undefined behavior, but subsequent accesses to int `i' as an int
> > > may not be safe if the assignment sets `i' to a trap
> > > representation.
> > Actually that is not true according to C99, even though the OP's
> > quotation, virtually unchanged, in C99, implies that it is so.
> > Plain char may have trap representations on implementations where it
> > is equivalent to signed char. Because signed char may have padding
> > bits and/or trap representations according to the C99 standard.
> That's not a problem in this case as far as I can tell, because
> we are storing a char, not reading out a char. Storing a value
> of 1 into a char cannot produce a trap representation, and as far
> as I know although it yields undefined behavior to read out a
> trap representation, it is okay to store into an object that
> contains a trap representation.

6.2.6.1p5 excludes character lvalue reads from the undefined behaviour. So at worst, reading a character trap will yield an unspecified value. How else can a 'non-trapping' character trap representation behave?!

Curiously, /assignment/ of a value out of range to a signed char, or signed plain char, can produce an implementation-defined signal under C99!

--
Peter
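
One way to stay clear of the out-of-range conversion mentioned above is to check the range before assigning. A minimal sketch; store_checked is a hypothetical helper, not something proposed in the thread.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: store `v` into *dst only if it fits in
   signed char. Returns 1 on success, 0 if `v` is out of range. */
static int store_checked(signed char *dst, int v)
{
    assert(dst != NULL);
    if (v < SCHAR_MIN || v > SCHAR_MAX)
        return 0;              /* out of range: leave *dst untouched */
    *dst = (signed char)v;     /* in range: the conversion is value-preserving */
    return 1;
}
```

With the check in place, the implementation-defined result or signal for an out-of-range conversion never comes into play.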
Mon, 27 Jun 2005 19:47:41 GMT

those who know me have no need of my nam #10 / 10
in comp.lang.c i read:

Quote:
>> (assuming sizeof(int) == 4)
>> char p[] = { 0, 0, 0, 0 };

we shouldn't assume when there's a perfectly usable way to write it with the results you desire which works on all platforms, and while slightly more cluttered directly shows your intent:

  char p[sizeof(int)] = {0};

Quote:
>> *(int *) p = 1;
>> If "*(int *) p = 1" counts as an int object with the value 1,
>> accessing it using p[0 .. 3] should work as expected,

yes, but that's not what you are doing. if you had allowed for proper alignment (using a union) then this would be safe according to pre-c99 standards; change the `char' to `unsigned char' and it'd also be safe under c99. though what you'd find in p should remain a secret between you, your ghod(s) and your other hand.

Quote:
>Aside from traps, there's also an alignment issue.

traps are not the issue, and cannot be in this case. why? because `1' *is* an int, the implementation cannot produce a trap representation for it otherwise ``int i = 1;'' would also trap.

--
bringing you boring signatures for 17 years
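
The union suggestion in this post might look something like the sketch below. The names int_bytes and clear_bytes are hypothetical; the int member gives the storage suitable alignment, and the unsigned char member gives byte-wise access to the same storage.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical union: the int member forces suitable alignment, and the
   unsigned char member exposes the same storage byte by byte. */
union int_bytes {
    int i;
    unsigned char bytes[sizeof(int)];
};

/* Set every byte of the union to zero through the character member. */
static void clear_bytes(union int_bytes *u)
{
    assert(u != NULL);
    memset(u->bytes, 0, sizeof u->bytes);
}
```

After `u.i = 1;` the bytes of the int can be read as u.bytes[0] through u.bytes[sizeof(int) - 1], and what they contain is, as the post says, platform-specific.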
Tue, 28 Jun 2005 00:15:57 GMT