using <string.h> functions on non-character objects? 

I posted code similar to this on sci.crypt:

    void foo (int x, int y)
    {
      volatile int bar [SIZE];

      do_some_stuff ();
      bar [x] = y;
      do_some_other_stuff ();

      memset (bar, 0, sizeof (bar));
    }


Quote:
> It's not *semantically* valid.

The two relevant sections of ISO/IEC 9899 seem to be:

        Section 6.2.6.1 paragraph 5:

        Certain object representations need not represent a value of
        the object type.  If the stored value of an object has such a
        representation and is read by an lvalue expression that does
        not have character type, the behavior is undefined.  If such a
        representation is produced by a side effect that modifies all
        or any part of the object by an lvalue expression that does
        not have character type, the behavior is undefined.  Such a
        representation is called a trap representation.

        Section 7.21.1 paragraph 1:

        The header <string.h> declares one type and several functions, and
        defines one macro useful for manipulating arrays of character type
        and other objects treated as arrays of character type.  [...]

Can there be a trap representation for a character?  I can't find anything
that says that there can't.  If there can, then no operations defined
in <string.h> are useful on "other objects treated as arrays of character
type."

Can an all-binary-zeros representation be a trap representation for an
integer?

Does calling memset() with a pointer to an integer array as the first
argument constitute a "side effect that modifies all or any part
of the object by an lvalue expression that does not have character type"?

And it sounds like passing any object of a non-character type to a string
function may have undefined behavior because of the second sentence
of 6.2.6.1 paragraph 5.

Is it really the intent that writing a trap representation to an object
using <string.h> functions should have undefined behavior?  (I can
certainly see why reading those objects afterward would have undefined
behavior.)

If this is really the case, doesn't that mean that calloc() can only
be used on characters and character arrays?

        Section 7.20.3.1 paragraph 2:

        The calloc function allocates space for an array of nmemb
        objects, each of whose size is size.  The space is initialized
        to all bits zero.  252)

        252) Note that this need not be the same as the representation
        of floating-point zero or a null pointer constant.

Therefore, it sounds like doing a calloc() and storing the result
as a pointer to an array of integers (or most other types) can
result in undefined behavior, as soon as the resulting object is
read.

If this is all true, it sounds like the only way to "zero" a non-character
array is to iterate over all the elements, storing zero into them.  I suppose
that's not too bad for an array of integers, because the compiler might
be smart enough to optimize it, but trying to zero an array of structs
will be abysmal.



Sun, 27 Jun 2004 10:42:07 GMT  

[snip - using memset to initialize types other than array of char]

Quote:
> If this is all true, it sounds like the only way to "zero" a non-integer
> array is to iterate over all the elements, storing zero into them.  I suppose
> that's not too bad for an array of integers, because the compiler might
> be smart enough to optimize it, but trying to zero an array of structs
> will be abysmal.

Unless you use an initializer or make the variable static. However, it certainly
seems like the original intent of memset was to initialize all types...unless
the only type at the time was char.

        david

--
If 91 were prime, it would be a counterexample to your conjecture.
    -- Bruce Wheeler



Sun, 27 Jun 2004 08:30:16 GMT  
Quote:

> I posted code similar to this on sci.crypt:

>     void foo (int x, int y)
>     {
>       volatile int bar [SIZE];

>       do_some_stuff ();
>       bar [x] = y;
>       do_some_other_stuff ();

>       memset (bar, 0, sizeof (bar));
>     }


> > It's not *semantically* valid.

> The two relevant sections of ISO/IEC 9899 seem to be:

<snip>

Yes, those were both relevant.

Quote:
> Can there be a trap representation for a character?  I can't find anything
> that says that there can't.

I'm reasonably sure there can, but for /characters/ all-bits-zero cannot
be a trap representation. There was a thread on this recently in
comp.lang.c entitled "Extracing a substring (fast)" - complete with
typo! - in which the following conclusion was reached (delimited by +++
signs):

+++++++++++++
On Thursday, in article



Quote:

><snip>

>> C99 6.2.6.2p5 says

>> "The values of any padding bits are unspecified. A valid (non-trap)
>>  object representation of a signed integer type where the sign bit
>>  is zero is a valid object representation of the corresponding
>>  unsigned type, and shall represent the same value."

>> unsigned char cannot have padding bits but signed char can if its range
>> of values is sufficiently small. For example UCHAR_MAX==65535 and
>> SCHAR_MAX==127 is valid and allows for 8 padding bits in a signed char.
>> However for nonnegative values that they have in common signed char
>> and unsigned char must use the same representation. So in the example
>> here, for any value 0-127 written to the signed char all 8 padding bits
>> must be zeroed. When reading the value of an object using a signed char
>> lvalue the padding bits can be ignored.

>If I am reading this aright, it implies that (even though signed char
>may have padding bits) memset(signedchararray, 0, sizeof
>signedchararray) gives you the expected array-full of '\0' characters,
>and thus memset(a, 0, sizeof a) works for char, signed char, and
>unsigned char arrays.

Correct.
+++++++++++++
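The conclusion quoted above can be sketched in a few lines. This is only an illustration; the function names `all_nul` and `zero_chars_ok` are made up for the example and do not come from either thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if every element of a char array is '\0'. */
int all_nul(const char *p, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (p[i] != '\0')
            return 0;
    }
    return 1;
}

/* Fill a character buffer with junk, zero it with memset, and check it. */
int zero_chars_ok(void)
{
    char buf[16];

    memset(buf, 'x', sizeof buf);  /* something non-zero first */
    memset(buf, 0, sizeof buf);    /* all-bits-zero is a valid zero for character types */
    return all_nul(buf, sizeof buf);
}
```

The same pattern should hold for arrays of signed char and unsigned char, which is what the quoted exchange concludes.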

Quote:
> If there can, then no operations defined
> in <string.h> are useful on "other objects treated as arrays of character
> type."

Not so. The memcpy, memmove, memcmp, and memchr functions can all be
used safely in such a way. It's just memset that has problems, and
that's because it doesn't just look at or copy bits - it actually /sets/
them, without any knowledge of their underlying object type. (See below,
paragraph starting "Except with memset".)

Quote:

> Can an all-binary-zeros representation be a trap representation for an
> integer?

Yes (except for the three types char, signed char, and unsigned char).
The Standard allows integers to have padding bits, and the
implementation is allowed to use those bits (for example, for parity
checking). Consider an implementation which mandates odd parity for its
integers, and traps if it discovers even parity in any integer.
memset(myintarray, 0, sizeof myintarray) would trap on such an
implementation.
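Whether a particular implementation has padding bits in an integer type can be estimated by comparing the storage width with the number of value bits. The sketch below does this for unsigned int; the function names are hypothetical. A result of 0 only says that unsigned int has no padding bits on that implementation. It says nothing about other types, and it cannot show which bit patterns (if any) are trap representations.

```c
#include <limits.h>

/* Number of value bits in unsigned int, counted by shifting UINT_MAX down. */
int uint_value_bits(void)
{
    unsigned int max = UINT_MAX;
    int bits = 0;

    while (max != 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}

/* Storage bits minus value bits; a non-zero result means padding bits exist. */
int uint_padding_bits(void)
{
    return (int)(sizeof(unsigned int) * CHAR_BIT) - uint_value_bits();
}
```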

Quote:
> Does calling memset() with a pointer to an integer array as the first
> argument constitute a "side effect that modifies all or any part
> of the object by an lvalue expression that does not have character type"?

The honest answer here is "I don't know and I don't have time right now
to find out", so I'll pass on this one. :-)

Quote:

> And it sounds like passing any object of a non-character type to a string
> function may have undefined behavior because of the second sentence
> of 6.2.6.1 paragraph 5.

Except with memset(), you're actually all right here because the other
mem* functions work by copying whole bytes, including any padding bits.
So, provided those bytes were set correctly to start with, copying them
is well-defined.
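A small sketch of that point, assuming a made-up `struct point`: the source object holds valid values, so copying its bytes with memcpy gives a destination that also holds valid values.

```c
#include <string.h>

/* Hypothetical structure used only for this illustration. */
struct point {
    int x;
    double y;
};

/* Copy a validly initialised object byte by byte and read the copy back. */
int copy_preserves_value(void)
{
    struct point src = { 42, 2.5 };
    struct point dst;

    memcpy(&dst, &src, sizeof dst);  /* copies every byte, padding included */
    return dst.x == 42 && dst.y == 2.5;
}
```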

Quote:

> Is it really the intent that writing a trap representation to an object
> using <string.h> functions should have undefined behavior?

That's really a question for comp.std.c IMHO.

Quote:
> (I can
> certainly see why reading those objects afterward would have undefined
> behavior.)

Right.

Quote:

> If this is really the case, doesn't that mean that calloc() can only
> be used on characters and character arrays?

Yes, if you want well-defined behaviour.

Quote:

>         Section 7.20.3.1 paragraph 2:

>         The calloc function allocates space for an array of nmemb
>         objects, each of whose size is size.  The space is initialized
>         to all bits zero.  252)

>         252) Note that this need not be the same as the representation
>         of floating-point zero or a null pointer constant.

> Therefore, it sounds like doing a calloc() and storing the result
> as a pointer to an array of integers (or most other types) can
> result in undefined behavior, as soon as the resulting object is
> read.

Correct, except that you don't have to wait that long. :-)

Quote:

> If this is all true, it sounds like the only way to "zero" a non-integer
> array is to iterate over all the elements, storing zero into them.

Or you can initialise them at declaration:

int blankarray[100] = {0}; /* guaranteed to zero out everything */

If you need to blank them later, you can do so with memcpy at the
expense of some memory:

memcpy(workingarray, blankarray, sizeof workingarray);
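Spelled out a little more fully, the memcpy approach might look like the sketch below. The size macro `BLANK_COUNT` and the functions `reset` and `demo_reset` are assumptions for the example; the important detail is that both arrays have the same element type and the destination is at least as large as the blank array.

```c
#include <string.h>

#define BLANK_COUNT 100  /* hypothetical size shared by both arrays */

/* Every element is initialised to the value 0, whatever its representation. */
static const int blankarray[BLANK_COUNT] = {0};

/* workingarray must point to at least BLANK_COUNT ints. */
void reset(int *workingarray)
{
    memcpy(workingarray, blankarray, sizeof blankarray);
}

/* Fill an array with non-zero values, reset it, and check the result. */
int demo_reset(void)
{
    int work[BLANK_COUNT];
    int i;

    for (i = 0; i < BLANK_COUNT; i++)
        work[i] = i + 1;
    reset(work);
    for (i = 0; i < BLANK_COUNT; i++) {
        if (work[i] != 0)
            return 0;
    }
    return 1;
}
```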

Quote:
> I suppose
> that's not too bad for an array of integers, because the compiler might
> be smart enough to optimize it, but trying to zero an array of structs
> will be abysmal.

Again, the {0} trick works beautifully.
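For structures, a sketch with a made-up `struct record`: the initializer gives each member the zero value for its own type (0, 0.0, or a null pointer), regardless of whether that is all-bits-zero.

```c
#include <stddef.h>

/* Hypothetical structure mixing integer, floating and pointer members. */
struct record {
    int id;
    double weight;
    char *name;
};

/* Returns 1 if {0} gave every member the zero value of its own type. */
int zero_init_ok(void)
{
    struct record r = {0};
    struct record table[8] = {{0}};  /* remaining elements are zeroed the same way */

    return r.id == 0 && r.weight == 0.0 && r.name == NULL
        && table[7].id == 0 && table[7].weight == 0.0 && table[7].name == NULL;
}
```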

--

"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton



Sun, 27 Jun 2004 18:20:59 GMT  


Quote:
>I posted code similar to this on sci.crypt:

>    void foo (int x, int y)
>    {
>      volatile int bar [SIZE];

>      do_some_stuff ();
>      bar [x] = y;
>      do_some_other_stuff ();

>      memset (bar, 0, sizeof (bar));
>    }


>> It's not *semantically* valid.

>The two relevant sections of ISO/IEC 9899 seem to be:

>        Section 6.2.6.1 paragraph 5:

>        Certain object representations need not represent a value of
>        the object type.  If the stored value of an object has such a
>        representation and is read by an lvalue expression that does
>        not have character type, the behavior is undefined.  If such a
>        representation is produced by a side effect that modifies all
>        or any part of the object by an lvalue expression that does
>        not have character type, the behavior is undefined.  Such a
>        representation is called a trap representation.

>        Section 7.21.1 paragraph 1:

>        The header <string.h> declares one type and several functions, and
>        defines one macro useful for manipulating arrays of character type
>        and other objects treated as arrays of character type.  [...]

>Can there be a trap representation for a character?

There cannot be trap representations for unsigned char. 6.2.6.1p3
implies this. 5.2.4.2.1p2 says "UCHAR_MAX shall equal
(2 (to the power of) CHAR_BIT)-1". For that to be possible every bit
pattern in a byte must represent a value as an unsigned char. There is
no room for trap representations.
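The UCHAR_MAX requirement can be checked directly. In this sketch (the function name is hypothetical) a value with CHAR_BIT one-bits is built up and compared with UCHAR_MAX; on a conforming implementation the two should be equal.

```c
#include <limits.h>

/* Returns 1 if UCHAR_MAX equals 2 to the power of CHAR_BIT, minus 1. */
int uchar_uses_every_bit(void)
{
    unsigned long max = 0;
    int i;

    /* Set the CHAR_BIT low-order bits one at a time. */
    for (i = 0; i < CHAR_BIT; i++)
        max = (max << 1) | 1UL;
    return max == (unsigned long)UCHAR_MAX;
}
```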

Quote:
> I can't find anything
>that says that there can't.  If there can, then no operations defined
>in <string.h> are useful on "other objects treated as arrays of character
>type."

6.2.6.1p5 indicates that accessing any object with a character typed
lvalue does not produce undefined behaviour. So character types can't trap.

Quote:
>Can an all-binary-zeros representation be a trap representation for an
>integer?

Yes, for any integer type other than character types. Apart from the
arguments above we know that all-bits-zero is a valid representation
of zero for unsigned char. Because of 6.2.5p9 (also 6.2.6.2p5) it
must also be a valid representation of zero for signed char. Taking
these together the same must also be true for plain char.

Quote:
>Does calling memset() with a pointer to an integer array as the first
>argument constitute a "side effect that modifies all or any part
>of the object by an lvalue expression that does not have character type"?

No, 7.21.1p1 says objects are treated as arrays of character type.

Quote:
>And it sounds like passing any object of a non-character type to a string
>function may have undefined behavior because of the second sentence
>of 6.2.6.1 paragraph 5.

Again, the functions behave as if they access objects character by
character.

Quote:
>Is it really the intent that writing a trap representation to an object
>using <string.h> functions should have undefined behavior?  (I can
>certainly see why reading those objects afterward would have undefined
>behavior.)

6.2.6.1p4 and p5 (also see the footnotes) show that that is not the
intent.

Quote:
>If this is really the case, doesn't that mean that calloc() can only
>be used on characters and character arrays?

Yes, that is true along with memset(). This is pointed out quite
regularly in comp.lang.c. :-)

Quote:
>        Section 7.20.3.1 paragraph 2:

>        The calloc function allocates space for an array of nmemb
>        objects, each of whose size is size.  The space is initialized
>        to all bits zero.  252)

>        252) Note that this need not be the same as the representation
>        of floating-point zero or a null pointer constant.

>Therefore, it sounds like doing a calloc() and storing the result
>as a pointer to an array of integers (or most other types) can
>result in undefined behavior, as soon as the resulting object is
>read.

Correct.

Quote:
>If this is all true, it sounds like the only way to "zero" a non-integer
>array is to iterate over all the elements, storing zero into them.  I suppose
>that's not too bad for an array of integers, because the compiler might
>be smart enough to optimize it, but trying to zero an array of structs
>will be abysmal.

Initialisation can be used to correctly set structure members to 0, 0.0,
null as appropriate. One portable way of correctly zeroing the members of
an array of structures would be to initialise an instance of the structure
to zero and then copy that to each element of the array.
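One way this could look, as a sketch only: `struct item`, `clear_items` and `demo_clear` are invented for the example. Structure assignment copies the correctly initialised zero instance into each element.

```c
#include <stddef.h>

/* Hypothetical structure with integer, floating and pointer members. */
struct item {
    long count;
    double ratio;
    void *link;
};

/* Assign a zero-initialised instance to each of the n elements. */
void clear_items(struct item *items, size_t n)
{
    static const struct item zero = {0};  /* members are 0, 0.0 and null */
    size_t i;

    for (i = 0; i < n; i++)
        items[i] = zero;
}

/* Fill an array with non-zero members, clear it, and check every member. */
int demo_clear(void)
{
    struct item arr[4];
    size_t i;

    for (i = 0; i < 4; i++) {
        arr[i].count = 7;
        arr[i].ratio = 1.5;
        arr[i].link = &arr[i];
    }
    clear_items(arr, 4);
    for (i = 0; i < 4; i++) {
        if (arr[i].count != 0 || arr[i].ratio != 0.0 || arr[i].link != NULL)
            return 0;
    }
    return 1;
}
```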




Sun, 27 Jun 2004 20:46:40 GMT  

Quote:


> >Can there be a trap representation for a character?
> There cannot be trap representations for unsigned char. 6.2.6.1p3
> implies this. 5.2.4.2.1p2 says "UCHAR_MAX shall equal
> (2 (to the power of) CHAR_BIT)-1". For that to be possible every bit
> pattern in a byte must represent a value as an unsigned char. There is
> no room for trap representations.

Right.

Quote:
> 6.2.6.1p5 indicates that accessing any object with a character typed
> lvalue does not produce undefined behaviour.
> So character types can't trap.

The question is:
  can there be a trap representation for an object of type signed char?

Is 6.2.6.1p5 sufficient to imply that signed char has no trap representations?

Quote:
> >Can an all-binary-zeros representation be a trap representation for an
> >integer?

> Yes, for any integer type other than character types.

No, there is no trap representation of the form all-binary-zeros.

When the sign bit is zero, the value of the signed integer is not affected
by the rule described in 6.2.6.2p2. The integer value of
all-binary-zeros is therefore the same as the value of the unsigned integer
with all-binary-zeros, which is zero.



Sun, 27 Jun 2004 22:53:03 GMT  

Quote:

> >Can an all-binary-zeros representation be a trap representation for an
> >integer?

> Yes, for any integer type other than character types. Apart from the
> arguments above we know that all-bits-zero is a valid representation
> of zero for unsigned char. Because of 6.2.5p9 (also 6.2.6.2p5) it
> must also be a valid representation of zero for signed char. Taking
> these together the same must also be true for plain char.

After having read the other thread "Extracing a substring (fast)" which
Richard Heathfield pointed to, I stand corrected on my previous reply.

An integer type other than a character type can have a trap representation
of all-binary-zeros.
Sorry for my ignorance. :-)

I have a question about what C99 6.2.5p9 says:

"The range of nonnegative values of a signed integer type is a subrange
 of the corresponding unsigned integer type, and the representation of
 the same value in each type is the same."

If a system uses 1's complement or sign/magnitude integer representation,
0 has two representations in a signed type.
How can the representation of the same value in each type be the same?

Does it imply that, for a system with 1's complement or sign/magnitude
integer representation, "-0" needs to be a trap representation?

thank you.

paiyi



Mon, 28 Jun 2004 02:14:17 GMT  
On Wednesday, in article


...

Quote:
>I have a question for what C99 6.2.5p9 said:

>"The range of nonnegative values of a signed integer type is a subrange
> of the corresponding unsigned integer type, and the representation of
> the same value in each type is the same."

>If a system uses 1's complement or sign/magnitude integer representation,
>0 has two representations in signed type.

It can have 2 representations of 0, or the one with the sign bit set can
be a trap representation.

Quote:
>How can the representation of the same value in each type be the same?

When there are 2 representations of 0 one is referred to as the
positive representation and the other the negative representation
(i.e. with sign bit clear and set respectively). The quote above refers
to nonnegative values and must be taken to exclude the "negative zero".
The wording could perhaps be better but there's really no other way to
interpret it.

Quote:
>Does it imply that for a system with 1's complement or sign/magnitude
>integer representation,  "-0" need to be a trap?

No, two representations of zero are explicitly allowed by 6.2.6.2p2 which
also defines the term "negative zero".




Mon, 28 Jun 2004 22:03:44 GMT  
On Wednesday, in article


Quote:


>> >Can there be a trap representation for a character?

>> There cannot be trap representations for unsigned char. 6.2.6.1p3
>> implies this. 5.2.4.2.1p2 says "UCHAR_MAX shall equal
>> (2 (to the power of) CHAR_BIT)-1". For that to be possible every bit
>> pattern in a byte must represent a value as an unsigned char. There is
>> no room for trap representations.

>Right.

>> 6.2.6.1p5 indicates that accessing any object with a character typed
>> lvalue does not produce undefined behaviour.
>> So character types can't trap.

>Question is that
>  can there be a trap representation for signed char type object?

I'll answer your question with a question: what would be the significance
of a trap representation that can't trap?

Quote:
>Is 6.2.6.1p5 sufficient to imply no trap representation for signed char?

I would say so unless you can think of implications of trap representations
that don't involve undefined behaviour.




Mon, 28 Jun 2004 21:59:24 GMT  

Quote:


> >> 6.2.6.1p5 indicates that accessing any object with a character typed
> >> lvalue does not produce undefined behaviour.
> >> So character types can't trap.

> >Question is that
> >  can there be a trap representation for signed char type object?

> I'll answer your question with a question: what would be the significance
> of a trap representation that can't trap?

So there is a contradiction.
The way to avoid it is for objects of type signed char to have no
trap representations.

Quote:
> >Is 6.2.6.1p5 sufficient to imply no trap representation for signed char?
> I would say so unless you can think of implications of trap representations
> that don't involve undefined behaviour.

I cannot find any such implications.
As you have pointed out, a signed char may contain padding bits, but
no bit pattern in the padding can cause a trap.

paiyi



Tue, 29 Jun 2004 20:51:02 GMT  
 