Question about character arrays & pointers

Let's say that I want to declare two character arrays as:

char s[] = "Str1";
char t[] = "Longer string";

and a pointer to char for each, which points to the first char in the
array:

char *p = s;
char *q = t;

I was wondering why it seems I can store more chars onto the end of s.
Here is a routine that concatenates t onto the end of s using pointer
arithmetic to access the array elements:

int i = 0, j = 0;

while ( *(p+i) != '\0') i++;     /* set i to length of s */
while ( *(p+i++) = *(q+j++) );  /* copy t onto the end of s */

Why doesn't this cause a segmentation fault?  I thought that if I tried
to access p beyond the \0 in s, especially here since I'm assigning it a
value, I would get a segmentation fault.  Why is it that I can store
values into memory locations that I have not set aside storage for?  What
is going on in memory and in the pointers that allows this to happen?

If someone could give me some insight into this problem I would really
appreciate it.  Some of you might recognize this as one of the exercises
from K&R2.  It is ex. 5-3 on p. 107.

Here is a more complete version of what I was trying:

#include <stdio.h>

int main(void)
{
    char s[] = "Str1";
    char t[] = "Longer string";
    char *p = s;
    char *q = t;
    int i = 0;
    int j = 0;

    while ( *(p+i) != '\0' )
        i++;  /* set i to length of s */

    while ( *(p+i++) = *(q+j++) )
        ;     /* copy t onto the end of s, past the 5 bytes s owns */

    printf("s = %s t = %s\n", s, t);

    return 0;
}

Thanks,

Steve

Mon, 23 Aug 2004 14:05:15 GMT
Question about character arrays & pointers

Quote:
> Let's say that I want to declare two character arrays as:

>     char s[] = "Str1";
>     char t[] = "Longer string";

> and a pointer to char for each, which points to the first char in the
> array:

>     char *p = s;
>     char *q = t;

> I was wondering why it seems I can store more chars onto the end of s.

You can merrily write past the end of the string and not get a segmentation
fault... until you happen to touch memory your program isn't allowed to
modify.

Don't expect to get a segmentation fault the moment you write past the bounds
of a particular object. In fact, don't expect anything at all; it is undefined
behavior.
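
To make the size mismatch concrete, here is a minimal sketch that compares the bytes s actually owns with the bytes the concatenation would need. The helper names concat_size and concat_fits are made up for this illustration; they are not from the original post or from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes needed to hold dst followed by src, including the terminating '\0'. */
size_t concat_size(const char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    return strlen(dst) + strlen(src) + 1;
}

/* Nonzero if a buffer of `capacity` bytes can hold the concatenation. */
int concat_fits(size_t capacity, const char *dst, const char *src)
{
    return concat_size(dst, src) <= capacity;
}
```

With the declarations from the question, sizeof s is 5 (four characters plus the '\0'), while concat_size(s, t) is 18, so concat_fits(sizeof s, s, t) returns 0. The copy loop therefore writes 13 bytes that s does not own.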

-Daniel

Mon, 23 Aug 2004 14:17:37 GMT
Question about character arrays & pointers

Quote:

> Let's say that I want to declare two character arrays as:

>     char s[] = "Str1";
>     char t[] = "Longer string";

> and a pointer to char for each, which points to the first char in the
> array:

>     char *p = s;
>     char *q = t;

>    <snip>

>     int i, j = 0;

>     while ( *(p+i) != '\0') i++;     /* set i to width of s */
>     while ( *(p+i++) = *(q+j++) );  /* copy t into s */

> Why doesn't this cause a segmentation fault?

I am The Magical Pony!

If you access a random memory location, you might get a segmentation
fault. But if the random memory location you access just happens to
belong to your program, the OS sees nothing wrong and everything appears
just fine. And dandy.

An easy way to test this in the real world is to get five baseballs and a
bat. Go to the mall and drop all the baseballs in random spots. Now,
blindfold yourself, begin screaming (which is optional), and
start running around swinging the bat. You will find that chances are
you jack somebody up good, but once in a while you'll actually hit one of
the baseballs.

In this case, running off the end of the storage for s probably puts you
into the storage for t, or for some other automatic variable. Something
is most probably allocated contiguously with s and you are trouncing it
with the code, but as far as the OS is concerned there's nothing wrong.
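
If you want to see where your own compiler puts s and t, a rough sketch like the following can help. The helper names show_object and byte_gap are invented for this example, and the numbers they report are specific to one compiler, one set of options and one run; nothing in the C standard fixes the layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Print where an object lives and how many bytes it occupies. */
void show_object(const char *name, const void *addr, size_t size)
{
    assert(name != NULL);
    printf("%s at %p, %zu bytes\n", name, (void *)addr, size);
}

/* Distance in bytes from a to b, computed on integer copies of the
   addresses. The result is implementation-specific. */
intptr_t byte_gap(const void *a, const void *b)
{
    return (intptr_t)((uintptr_t)b - (uintptr_t)a);
}
```

Calling show_object("s", s, sizeof s) and show_object("t", t, sizeof t) inside main, followed by byte_gap(s, t), shows whether t happens to sit just above s, just below it, or somewhere else entirely. Any of those is allowed.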

Pony.

Mon, 23 Aug 2004 14:18:50 GMT
Question about character arrays & pointers

Quote:

> An easy way to test this in the real world is to get five baseballs and a
> bat. Go to the mall and drop all the baseballs in random spots. Now,
> blindfold yourself, begin screaming (which is optional), and
> start running around swinging the bat. You will find that chances are
> you jack somebody up good, but once in a while you'll actually hit one of

What the hell are you smoking?

-Daniel

Mon, 23 Aug 2004 14:47:18 GMT
Question about character arrays & pointers

Quote:

> > Let's say that I want to declare two character arrays as:

> >     char s[] = "Str1";
> >     char t[] = "Longer string";

> > and a pointer to char for each, which points to the first char in the
> > array:

> >     char *p = s;
> >     char *q = t;

> >    <snip>

> >     int i, j = 0;

> >     while ( *(p+i) != '\0') i++;     /* set i to width of s */
> >     while ( *(p+i++) = *(q+j++) );  /* copy t into s */

> > Why doesn't this cause a segmentation fault?

> I am The Magical Pony!

> If you access a random memory location, you might get a segmentation
> fault. But, if the random memory location you access just happens to be
> allocated by your program everything is just fine. And dandy.

> An easy way to test this in the real world is to get five baseballs and a
> bat. Go to the mall and drop all the baseballs in random spots. Now,
> blindfold yourself, begin screaming (which is optional), and
> start running around swinging the bat. You will find that chances are
> you jack somebody up good, but once in a while you'll actually hit one of

> In this case, running off the end of the storage for s probably puts you
> into the storage for t, or for some other automatic variable. Something
> is most probably allocated contiguously with s and you are trouncing it
> with the code, but as far as the OS is concerned there's nothing wrong.

> Pony.

I think I understand what's going on now.  So when my main() starts there is
extra storage allocated on the local stack.  Let's just say that in memory,
the beginning of t is at the end of s.  For simplicity, say that s starts at
0x0000 and t starts at 0x0100 using byte-addressing.  When I go through the
first while loop, p+i should point to 0x0004 because the loop doesn't count
the null character.  The first time through, the second while copies the
data at 0x0100 into 0x0004 (overwriting the null), the second time through
copies 0x0101 into 0x0005, and so on.  The reason there is no segfault is
because there is extra room on the stack that is not allocated to any
variable.

But say t was more than 0xFF (255) bytes.  Once I try to copy something
into 0x0100 then I get a segmentation fault, if I hadn't trounced over some
other variable's storage first.  Is this where the undefined behavior comes
from?  Is there a better way to write this code so that I can avoid all
possible segmentation faults or undefined behavior?  Allocating enough
storage for a character array comes to mind, but is there any other way to
do it?

Steve

Tue, 24 Aug 2004 03:55:13 GMT
Question about character arrays & pointers
Stephen Orzel rambled on saying:

Quote:

> I think I understand what's going on now.  So when my main() starts there
> is extra storage allocated on the local stack.  Let's just say that in
> memory, the beginning of t is at the end of s.  For simplicity, say that s
> starts at 0x0000 and t starts at 0x0100 using byte-addressing.  When I go
> through the first while loop, p+i should point to 0x0004 because the loop
> doesn't count the null character.  The first time through, the second while
> copies the data at 0x0100 into 0x0004 (overwriting the null), the second
> time through copies 0x0101 into 0x0005, and so on.  The reason there is no
> segfault is because there is extra room on the stack that is not allocated
> to any variable.

> But say t was more than 0xFF (255) bytes.  Once I try to copy something
> into 0x0100 then I get a segmentation fault, if I hadn't trounced over
> some other variable's storage first.

Almost. You would probably be able to write over the contents of your
variable t, so if you copied more than 0xFF bytes past the start of s you
would overwrite the first elements of the t array in your while loop. You
would typically get a segmentation fault only if you tried to write to
memory your program is not allowed to modify, such as read-only or
unmapped memory.

It is called undefined behaviour because the C standard places no
requirements on what happens when you write outside an object, and it does
not specify how objects are laid out in memory. One implementation may
differ from another, so it is not possible to say how the program or the OS
will behave. Behaviour is undefined.

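Try writing code with bounds checking to see if you are at the end of the
array. If you want to copy the first elements of the longer array over the
smaller, then test using strlen(s), so you could have

/* Not tested. Needs <string.h>; declare size_t i, n; */
n = strlen(s);                  /* characters s holds before its '\0' */
for (i = 0; i < n && t[i] != '\0'; i++)
{
    s[i] = t[i];                /* never writes past s[n-1] */
}
s[i] = '\0';                    /* i <= n, so this stays inside s */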
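
On the "allocating enough storage" idea from the question, one possible sketch is to size a fresh buffer from the two lengths before copying anything. The function name concat_alloc is made up for this example; malloc, memcpy and strlen are the standard library calls.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated string holding a followed by b, or NULL if
   malloc fails. The caller is responsible for calling free(). */
char *concat_alloc(const char *a, const char *b)
{
    size_t la, lb;
    char *out;

    assert(a != NULL && b != NULL);
    la = strlen(a);
    lb = strlen(b);

    out = malloc(la + lb + 1);      /* +1 for the terminating '\0' */
    if (out == NULL)
        return NULL;

    memcpy(out, a, la);
    memcpy(out + la, b, lb + 1);    /* copies b's '\0' as well */
    return out;
}
```

For the strings in the question, concat_alloc(s, t) would hand back an 18-byte buffer containing "Str1Longer string", leaving both s and t untouched. If the strings are known at compile time, declaring the destination with an explicit size large enough for both (for example char s[18] = "Str1";) avoids the allocation altogether.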

--
[root]# rm -rf /*
You know it makes sense!

Tue, 24 Aug 2004 06:42:10 GMT
Question about character arrays & pointers

...

Quote:
>I think I understand what's going on now.  So when my main() starts there is
>extra storage allocated on the local stack.  Let's just say that in memory,
>the beginning of t is at the end of s.  For simplicity, say that s starts at
>0x0000 and t starts at 0x0100 using byte-addressing.  When I go through the
>first while loop, p+i should point to 0x0004 because the loop doesn't count
>the null character.  The first time through, the second while copies the
>data at 0x0100 into 0x0004 (overwriting the null), the second time through
>copies 0x0101 into 0x0005, and so on.  The reason there is no segfault is
>because there is extra room on the stack that is not allocated to any
>variable.

Or perhaps you are accessing or corrupting memory that is used by other
variables or internally by C's runtime system.

Quote:
>But say t was more than 0xFF (255) bytes.  Once I try to copy something
>into 0x0100 then I get a segmentation fault, if I hadn't trounced over some
>other variable's storage first.  Is this where the undefined behavior comes
>from?

Undefined behaviour simply means that the C language no longer makes
any requirements about how your program will behave; the program is in
error and the implementation is at liberty to crash, lock up, start acting
funny in random ways or apparently continue executing the program as
if nothing untoward had happened (or anything else for that matter).
The state of the program might have been terminally corrupted or it might
not; strange effects might start happening immediately or at some
arbitrary point in the future.

Quote:
> Is there a better way to write this code so that I can avoid all
>possible
>segmentation faults or undefined behavior?  Allocating enough storage
>for a character array comes to mind, but is there any other way to do it?

Check your boundaries and ensure that you don't exceed them.
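
As a sketch of what checking the boundaries could look like for the original exercise, the version below keeps the same pointer-arithmetic style but takes the destination's capacity as an extra argument and refuses to copy if the result would not fit. The name bounded_concat and its return convention (0 on success, -1 on refusal) are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append src to dst, where dst points at a buffer of `capacity` bytes.
   Returns 0 on success, or -1 (leaving dst unchanged) if the result,
   including its '\0', would not fit. */
int bounded_concat(char *dst, size_t capacity, const char *src)
{
    size_t i = 0, j = 0;

    assert(dst != NULL && src != NULL);

    while (i < capacity && *(dst + i) != '\0')   /* find the end of dst */
        i++;
    if (i == capacity)
        return -1;                 /* dst is not terminated within capacity */

    if (strlen(src) >= capacity - i)
        return -1;                 /* need strlen(src) + 1 bytes from dst+i */

    while ((*(dst + i++) = *(src + j++)) != '\0')
        ;                          /* copy src, including its '\0' */
    return 0;
}
```

With char s[] = "Str1"; the call bounded_concat(s, sizeof s, t) returns -1 and leaves s alone, whereas with char s[18] = "Str1"; it succeeds and s becomes "Str1Longer string".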


Wed, 25 Aug 2004 23:21:17 GMT
Question about character arrays & pointers
On Thu, 07 Mar 2002 06:05:15 GMT, Stephen Orzel

Quote:
> Let's say that I want to declare two character arrays as:

>     char s[] = "Str1";
>     char t[] = "Longer string";

> and a pointer to char for each, which points to the first
> char in the array:

>     char *p = s;
>     char *q = t;

> I was wondering why it seems I can store more chars onto the
> end of s. Here is a routine that
> concatenates t onto the end of s using pointer arithmetic to
> access the array elements:

>     int i, j = 0;

>     while ( *(p+i) != '\0') i++;     /* set i to width of s */
>     while ( *(p+i++) = *(q+j++) );  /* copy t into s */

> Why doesn't this cause a segmentation fault?

<snip>

The result of what you are doing is not guaranteed to be a
segmentation fault. It is "undefined behavior". It can be anything
from not causing any obvious problems to crashing your system.

Your compiler, which sees only a pointer without any information
on how much memory is allocated behind it, generally cannot give an
error when you do it. Your runtime system may give an error, but not
necessarily.

You need to make sure this will not happen. That is one painful
thing in C.
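
A small sketch of the point about the pointer carrying no size information: sizeof applied to the array gives the array's size, but once the array has been passed to a function as a char *, sizeof only reports the size of the pointer. The function names here are hypothetical, chosen just for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Inside a function, p is only a char *; sizeof reports the pointer's
   size, not the size of the array the caller passed. */
size_t size_seen_by_callee(char *p)
{
    assert(p != NULL);
    return sizeof p;
}

/* A common workaround: the caller passes the capacity along with the
   pointer, and every index is checked against it. */
int index_in_bounds(size_t capacity, size_t index)
{
    return index < capacity;
}
```

For char s[] = "Str1"; the expression sizeof s is 5 in the function that declares s, but size_seen_by_callee(s) returns sizeof(char *), whatever that is on your platform. That is why the size has to travel separately, as in index_in_bounds(sizeof s, i).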

- Umesh

--

Umesh P Nair
Remove 'z's from my e-mail ID

Mon, 23 Aug 2004 14:49:38 GMT
