Aliasing and character types 

A question about C's aliasing rules:  I don't have a copy of
the ANSI C standard, but Google tells me that it says something
like this:

      "An object shall have its stored value accessed only by an
      lvalue that has one of the following types:
        [irrelevant types deleted]
        o  A character type."

This means that the following should be safe: (disregard endianness)

        int i = 0;
        *(char *) &i = 1;

Right?  But how about the other way around: (assuming sizeof(int) == 4)

        char p[] = { 0, 0, 0, 0 };
        *(int *) p = 1;

If "*(int *) p = 1" counts as an int object with the value 1,
accessing it using p[0 .. 3] should work as expected, but I won't
feel safe until someone confirms it. :-)

Thanks in advance.

--
 Haakon



Mon, 27 Jun 2005 07:04:49 GMT  

Quote:

> A question about C's aliasing rules:  I don't have a copy of
> the ANSI C standard, but Google tells me that it says something
> like this:

>       "An object shall have its stored value accessed only by an
>       lvalue that has one of the following types:
>    [irrelevant types deleted]
>         o  A character type."

> This means that the following should be safe: (disregard endianness)

>         int i = 0;
>         *(char *) &i = 1;

The code above is itself safe, in that it doesn't invoke
undefined behavior, but subsequent accesses to int `i' as an int
may not be safe if the assignment sets `i' to a trap
representation.

Quote:
> Right?  But how about the other way around: (assuming sizeof(int) == 4)

>         char p[] = { 0, 0, 0, 0 };
>         *(int *) p = 1;

That falls afoul of the rules you cited above, invoking undefined
behavior.  A character array cannot necessarily be treated as an
int.
--
"Large amounts of money tend to quench any scruples I might be having."
  -- Stephan Wilms


Mon, 27 Jun 2005 07:06:58 GMT  


Quote:
> A question about C's aliasing rules:  I don't have a copy of
> the ANSI C standard, but Google tells me that it says something
> like this:

>       "An object shall have its stored value accessed only by an
>       lvalue that has one of the following types:
>    [irrelevant types deleted]
>         o  A character type."

> This means that the following should be safe: (disregard endianness)

>         int i = 0;
>         *(char *) &i = 1;

> Right?  But how about the other way around: (assuming sizeof(int) == 4)

Ok so far. Of course you have to be careful what you do next. Accessing
i is now undefined behavior (I think).

Quote:
>         char p[] = { 0, 0, 0, 0 };
>         *(int *) p = 1;

> If "*(int *) p = 1" counts as an int object with the value 1,
> accessing it using p[0 .. 3] should work as expected, but I won't
> feel safe until someone confirms it. :-)

Undefined behavior, because the address of array p need not be aligned
correctly to be cast to an int*.

Also, you are not accessing the char values in p either through their
own data type (which is char) or through type char (which is also char),
but through int. Undefined behavior again.



Mon, 27 Jun 2005 07:32:16 GMT  


Quote:

>> A question about C's aliasing rules:  I don't have a copy of
>> the ANSI C standard, but Google tells me that it says something
>> like this:

>>       "An object shall have its stored value accessed only by an
>>       lvalue that has one of the following types:
>>        [irrelevant types deleted]
>>         o  A character type."

>> This means that the following should be safe: (disregard endianness)

>>         int i = 0;
>>         *(char *) &i = 1;

>The code above is itself safe, in that it doesn't invoke
>undefined behavior, but subsequent accesses to int `i' as an int
>may not be safe if the assignment sets `i' to a trap
>representation.

In fact, the assignment itself is not safe.  If it results in a trap
representation being stored in the int, you have problems right there;
you don't have to wait for any later accesses.

Quote:
>> Right?  But how about the other way around: (assuming sizeof(int) == 4)

>>         char p[] = { 0, 0, 0, 0 };
>>         *(int *) p = 1;

>That falls afoul of the rules you cited above, invoking undefined
>behavior.  A character array cannot necessarily be treated as an
>int.

However, you can store the value of an int into a char array, and later
transfer it back to an int variable.  If you modify the bits while in the
char array, you might produce a trap representation for ints (unless you
modify them to all-0 :)), but you have no problem until you try to copy
the bits back into an int variable.
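
For illustration, here is a minimal sketch of that round trip using
memcpy. The function names round_trip and check_round_trip are made up
for this example and do not come from the thread. Since the buffer is
only ever touched as unsigned char, neither the aliasing rule nor
alignment comes into play:

```c
#include <assert.h>
#include <string.h>

/* Copy an int's bytes into a character buffer and back again.
   The buffer is only accessed as unsigned char (via memcpy), so no
   int lvalue is ever applied to it. */
int round_trip(int value)
{
    unsigned char bytes[sizeof(int)];
    int result;

    memcpy(bytes, &value, sizeof value);    /* int -> bytes */
    memcpy(&result, bytes, sizeof result);  /* bytes -> int */
    return result;
}

/* Hypothetical self-check: unmodified bytes must give the value back. */
void check_round_trip(void)
{
    assert(round_trip(1) == 1);
    assert(round_trip(-12345) == -12345);
}
```

If the bytes were modified between the two memcpy calls, the second copy
is where a trap representation could first matter, and only on an
implementation whose int actually has such representations.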

--Ben

--



Mon, 27 Jun 2005 08:35:44 GMT  

in comp.lang.c:

Quote:

> > A question about C's aliasing rules:  I don't have a copy of
> > the ANSI C standard, but Google tells me that it says something
> > like this:

> >       "An object shall have its stored value accessed only by an
> >       lvalue that has one of the following types:
> >       [irrelevant types deleted]
> >         o  A character type."

> > This means that the following should be safe: (disregard endianness)

> >         int i = 0;
> >         *(char *) &i = 1;

> The code above is itself safe, in that it doesn't invoke
> undefined behavior, but subsequent accesses to int `i' as an int
> may not be safe if the assignment sets `i' to a trap
> representation.

Actually, that is not true according to C99, even though the OP's
quotation, which appears virtually unchanged in C99, implies that it is.

Plain char may have trap representations on implementations where it
is equivalent to signed char, because signed char may have padding
bits and/or trap representations according to the C99 standard.

I have proposed on comp.std.c a few times that these references to
"character type" (there are a few others) be changed to "unsigned
character type", because that is the only type for which C99 makes
this guarantee.
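
As a rough sketch of the point about unsigned char, the following reads
every byte of an int through an unsigned char lvalue. The names byte_sum
and check_byte_sum are invented for the example. The expected values in
the self-check assume an int without padding bits, which holds on common
implementations but is not promised by the standard:

```c
#include <assert.h>
#include <stddef.h>

/* Sum the bytes of an int object, reading each one as unsigned char.
   unsigned char has no padding bits and no trap representations, so
   every read here yields a plain value. */
unsigned byte_sum(int value)
{
    const unsigned char *p = (const unsigned char *)&value;
    unsigned sum = 0;
    size_t k;

    for (k = 0; k < sizeof value; k++)
        sum += p[k];
    return sum;
}

/* Hypothetical self-check. ASSUMPTION: int has no padding bits, so the
   value 1 has exactly one byte equal to 1 and the rest zero,
   whatever the byte order. */
void check_byte_sum(void)
{
    assert(byte_sum(0) == 0);
    assert(byte_sum(1) == 1);
}
```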

Quote:
> > Right?  But how about the other way around: (assuming sizeof(int) == 4)

> >         char p[] = { 0, 0, 0, 0 };
> >         *(int *) p = 1;

> That falls afoul of the rules you cited above, invoking undefined
> behavior.  A character array cannot necessarily be treated as an
> int.

Agreed.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq



Mon, 27 Jun 2005 10:45:15 GMT  

Quote:




> >> This means that the following should be safe: (disregard endianness)

> >>         int i = 0;
> >>         *(char *) &i = 1;

> >The code above is itself safe, in that it doesn't invoke
> >undefined behavior, but subsequent accesses to int `i' as an int
> >may not be safe if the assignment sets `i' to a trap
> >representation.

> In fact, the assignment itself is not safe.  If it results in a trap
> representation being stored in the int, you have problems right there;
> you don't have to wait for any later accesses.

How so?  Can you cite anything to back that up?  My mental model
of trap representations has always been that you have to read one
of them in order for it to cause a problem.
--
"I don't have C&V for that handy, but I've got Dan Pop."
--E. Gibbons


Mon, 27 Jun 2005 12:31:00 GMT  

Quote:


> in comp.lang.c:


> > >       "An object shall have its stored value accessed only by an
> > >       lvalue that has one of the following types:
> > >  [irrelevant types deleted]
> > >         o  A character type."

> > > This means that the following should be safe: (disregard endianness)

> > >         int i = 0;
> > >         *(char *) &i = 1;

> > The code above is itself safe, in that it doesn't invoke
> > undefined behavior, but subsequent accesses to int `i' as an int
> > may not be safe if the assignment sets `i' to a trap
> > representation.

> Actually that is not true according to C99, even though the OP's
> quotation, virtually unchanged, in C99, implies that it is so.

> Plain char may have trap representations on implementations where it
> is equivalent to signed char.  Because signed char may have padding
> bits and/or trap representations according to the C99 standard.

That's not a problem in this case as far as I can tell, because
we are storing a char, not reading out a char.  Storing a value
of 1 into a char cannot produce a trap representation, and as far
as I know although it yields undefined behavior to read out a
trap representation, it is okay to store into an object that
contains a trap representation.
--
"Some people *are* arrogant, and others read the FAQ."
--Chris Dollin


Mon, 27 Jun 2005 12:30:01 GMT  

Quote:

>  (assuming sizeof(int) == 4)

>         char p[] = { 0, 0, 0, 0 };
>         *(int *) p = 1;

> If "*(int *) p = 1" counts as an int object with the value 1,
> accessing it using p[0 .. 3] should work as expected, but I won't
> feel safe until someone confirms it. :-)

It's not safe.
Aside from traps, there's also an alignment issue.
p may or may not be properly aligned for int.
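
A small sketch of how the alignment concern could be checked on a given
implementation. It uses _Alignof from C11, which postdates this thread,
and uintptr_t, which is optional and whose pointer-to-integer mapping is
implementation-defined, so treat the result as a hint only. The names
is_aligned_for_int and check_alignment are invented for the example:

```c
#include <assert.h>
#include <stdint.h>

/* Report whether ptr looks suitably aligned for an int.
   ASSUMPTION: converting a pointer to uintptr_t gives a flat address,
   which is typical but not guaranteed by the standard. */
int is_aligned_for_int(const void *ptr)
{
    return (uintptr_t)ptr % _Alignof(int) == 0;
}

/* Hypothetical self-check: a real int object is always aligned for int. */
void check_alignment(void)
{
    int x = 0;
    assert(is_aligned_for_int(&x));
}
```

Note that even when a char array does turn out to be aligned,
"*(int *) p = 1" still accesses char objects through an int lvalue, so
the aliasing problem remains.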

--
pete



Mon, 27 Jun 2005 18:57:28 GMT  

Quote:


> > in comp.lang.c:


> > > >       "An object shall have its stored value accessed only by an
> > > >       lvalue that has one of the following types:
> > > > [irrelevant types deleted]
> > > >         o  A character type."

> > > > This means that the following should be safe: (disregard endianness)

> > > >         int i = 0;
> > > >         *(char *) &i = 1;

> > > The code above is itself safe, in that it doesn't invoke
> > > undefined behavior, but subsequent accesses to int `i' as an int
> > > may not be safe if the assignment sets `i' to a trap
> > > representation.

> > Actually that is not true according to C99, even though the OP's
> > quotation, virtually unchanged, in C99, implies that it is so.

> > Plain char may have trap representations on implementations where it
> > is equivalent to signed char.  Because signed char may have padding
> > bits and/or trap representations according to the C99 standard.

> That's not a problem in this case as far as I can tell, because
> we are storing a char, not reading out a char.  Storing a value
> of 1 into a char cannot produce a trap representation, and as far
> as I know although it yields undefined behavior to read out a
> trap representation, it is okay to store into an object that
> contains a trap representation.

6.2.6.1p5 excludes character lvalue reads from the undefined behaviour. So
at worst, reading a character trap will yield an unspecified value. How else
can a 'non-trapping' character trap representation behave?! Curiously,
/assignment/ of a value out of range to a signed char, or signed plain char,
can produce an implementation-defined signal under C99!

--
Peter



Mon, 27 Jun 2005 19:47:41 GMT  
in comp.lang.c i read:

Quote:

>>  (assuming sizeof(int) == 4)

>>         char p[] = { 0, 0, 0, 0 };

we shouldn't assume when there's a perfectly usable way to write it with
the results you desire which works on all platforms, and while slightly
more cluttered directly shows your intent:

  char p[sizeof(int)] = {0};

Quote:
>>         *(int *) p = 1;

>> If "*(int *) p = 1" counts as an int object with the value 1,
>> accessing it using p[0 .. 3] should work as expected,

yes, but that's not what you are doing.  if you had allowed for proper
alignment (using a union) then this would be safe according to pre-c99
standards; change the `char' to `unsigned char' and it'd also be safe
under c99.  though what you'd find in p should remain a secret between
you, your ghod(s) and your other hand.
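
a minimal sketch of the union approach described above, with unsigned
char as suggested.  the names int_bytes, nonzero_bytes and
check_nonzero_bytes are made up for illustration, and the expected
counts in the self-check assume an int with no padding bits:

```c
#include <assert.h>
#include <stddef.h>

/* The union gives the byte array the alignment of int. */
union int_bytes {
    int           as_int;
    unsigned char as_bytes[sizeof(int)];
};

/* Store value through the int member, then count how many bytes of
   its representation are nonzero, reading them as unsigned char. */
unsigned nonzero_bytes(int value)
{
    union int_bytes u;
    unsigned count = 0;
    size_t k;

    u.as_int = value;
    for (k = 0; k < sizeof u.as_bytes; k++)
        if (u.as_bytes[k] != 0)
            count++;
    return count;
}

/* Hypothetical self-check. ASSUMPTION: int has no padding bits. */
void check_nonzero_bytes(void)
{
    assert(nonzero_bytes(0) == 0);
    assert(nonzero_bytes(1) == 1);
}
```

which byte ends up nonzero for the value 1 depends on the byte order, so
the sketch only counts bytes and never looks at their positions.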

Quote:
>Aside from traps, there's also an alignment issue.

traps are not the issue, and cannot be in this case.  why?  because `1'
*is* an int, the implementation cannot produce a trap representation for
it otherwise ``int i = 1;'' would also trap.

--
bringing you boring signatures for 17 years



Tue, 28 Jun 2005 00:15:57 GMT  
 