Author |
Message |
sridh #1 / 25
|
 reverse string - UB
A few days back this was posted as a solution for revsering a string #include <stdio.h> char str[] = "ABCDE"; void rev(char *); int main(void) { printf("Before: %s\n", str); rev(str); printf(" After: %s\n", str); return 0; Quote: }
void rev(char *s) { char *d = s; char c; while (*d) ++d; --d; while (d > s) { c = *s; *s++ = *d; *d-- = c; } Quote: }
it was pointed out that if *s = '\0' then it is invokes UB.am not able to get how? i was thinking that d > s would be false and the loop would terminate. is it not guaranteed? -sridhar
|
Tue, 05 Apr 2005 00:27:27 GMT |
|
 |
Joona I Palast #2 / 25
|
 reverse string - UB
Quote: > A few days back this was posted as a solution for revsering a string > #include <stdio.h> > char str[] = "ABCDE"; > void rev(char *); > int main(void) { > printf("Before: %s\n", str); > rev(str); > printf(" After: %s\n", str); > return 0; > } > void rev(char *s) { > char *d = s; > char c; > while (*d) ++d; > --d; > while (d > s) { > c = *s; > *s++ = *d; > *d-- = c; > } > } > it was pointed out that if *s = '\0' then it is invokes UB.am not able > to get how? i was thinking that d > s would be false and the loop > would terminate. is it not guaranteed? > -sridhar
If *s == '\0', the while loop will be skipped, thus keeping d pointing at the same address as s. --d will then make d point to the previous byte, which quite probably does not belong to your program's memory space. There's the undefined behaviour. --
| Kingpriest of "The Flying Lemon Tree" G++ FR FW+ M- #108 D+ ADA N+++| | http://www.helsinki.fi/~palaste W++ B OP+ | \----------------------------------------- Finland rules! ------------/ "And according to Occam's Toothbrush, we only need to optimise the most frequent instructions." - Teemu Kerola
|
Tue, 05 Apr 2005 00:30:36 GMT |
|
 |
Zoran Cutur #3 / 25
|
 reverse string - UB
Quote: > A few days back this was posted as a solution for revsering a string > #include <stdio.h> > char str[] = "ABCDE"; > void rev(char *); > int main(void) { > printf("Before: %s\n", str); > rev(str); > printf(" After: %s\n", str); > return 0; > } > void rev(char *s) { > char *d = s; > char c; > while (*d) ++d; > --d; > while (d > s) { > c = *s; > *s++ = *d; > *d-- = c; > } > } > it was pointed out that if *s = '\0' then it is invokes UB.am not able > to get how? i was thinking that d > s would be false and the loop > would terminate. is it not guaranteed? > -sridhar
I'm not quite sure about this, but I'ld expect that this invoked UB because pointer comparision is only defined for pointers pointing into the same array (or exactly one poition past the end of the array). But if s pointed to an "" (empty string), than d would at the point where d and s are compared point to some address befor the actual array and here we go UB... Probably somebody else can assure my assumption. --
"LISP is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days." -- Eric S. Raymond
|
Tue, 05 Apr 2005 00:38:32 GMT |
|
 |
Zoran Cutur #4 / 25
|
 reverse string - UB
Quote:
>> A few days back this was posted as a solution for revsering a string >> #include <stdio.h> >> char str[] = "ABCDE"; >> void rev(char *); >> int main(void) { >> printf("Before: %s\n", str); >> rev(str); >> printf(" After: %s\n", str); >> return 0; >> } >> void rev(char *s) { >> char *d = s; >> char c; >> while (*d) ++d; >> --d; >> while (d > s) { >> c = *s; >> *s++ = *d; >> *d-- = c; >> } >> } >> it was pointed out that if *s = '\0' then it is invokes UB.am not able >> to get how? i was thinking that d > s would be false and the loop >> would terminate. is it not guaranteed? >> -sridhar > If *s == '\0', the while loop will be skipped, thus keeping d pointing > at the same address as s. --d will then make d point to the previous > byte, which quite probably does not belong to your program's memory > space. There's the undefined behaviour.
As long as d isn't dereferenced, I don't see how this is UB. --
"LISP is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days." -- Eric S. Raymond
|
Tue, 05 Apr 2005 00:39:45 GMT |
|
 |
Joona I Palast #5 / 25
|
 reverse string - UB
Quote:
>>> A few days back this was posted as a solution for revsering a string >>> #include <stdio.h> >>> char str[] = "ABCDE"; >>> void rev(char *); >>> int main(void) { >>> printf("Before: %s\n", str); >>> rev(str); >>> printf(" After: %s\n", str); >>> return 0; >>> } >>> void rev(char *s) { >>> char *d = s; >>> char c; >>> while (*d) ++d; >>> --d; >>> while (d > s) { >>> c = *s; >>> *s++ = *d; >>> *d-- = c; >>> } >>> } >>> it was pointed out that if *s = '\0' then it is invokes UB.am not able >>> to get how? i was thinking that d > s would be false and the loop >>> would terminate. is it not guaranteed? >>> -sridhar >> If *s == '\0', the while loop will be skipped, thus keeping d pointing >> at the same address as s. --d will then make d point to the previous >> byte, which quite probably does not belong to your program's memory >> space. There's the undefined behaviour. > As long as d isn't dereferenced, I don't see how this is UB.
IHNRTS, but AFAIK even forming a non-null pointer into non-accessible memory invokes UB. But then, EICBW. I suppose the real gurus on this newsgroup (Heathfield, Pfaff, Pop, Torek and a bunch of others) can shed more light on this. --
| Kingpriest of "The Flying Lemon Tree" G++ FR FW+ M- #108 D+ ADA N+++| | http://www.helsinki.fi/~palaste W++ B OP+ | \----------------------------------------- Finland rules! ------------/ "'It can be easily shown that' means 'I saw a proof of this once (which I didn't understand) which I can no longer remember'." - A maths teacher
|
Tue, 05 Apr 2005 00:52:48 GMT |
|
 |
Jun Woon #6 / 25
|
 reverse string - UB
Quote:
> >> A few days back this was posted as a solution for revsering a string > >> #include <stdio.h> > >> char str[] = "ABCDE"; > >> void rev(char *); > >> int main(void) { > >> printf("Before: %s\n", str); > >> rev(str); > >> printf(" After: %s\n", str); > >> return 0; > >> } > >> void rev(char *s) { > >> char *d = s; > >> char c; > >> while (*d) ++d; > >> --d; > >> while (d > s) { > >> c = *s; > >> *s++ = *d; > >> *d-- = c; > >> } > >> } > >> it was pointed out that if *s = '\0' then it is invokes UB.am not able > >> to get how? i was thinking that d > s would be false and the loop > >> would terminate. is it not guaranteed? > >> -sridhar > > If *s == '\0', the while loop will be skipped, thus keeping d pointing > > at the same address as s. --d will then make d point to the previous > > byte, which quite probably does not belong to your program's memory > > space. There's the undefined behaviour. > As long as d isn't dereferenced, I don't see how this is UB.
It results in UB. The pointer arithmetic is defined only in an array object including one past its last element. int a[10]; int *p; p = a+10; /* okay */ *p; /* UB */ p = a-1; /* UB */ --
Dept. of Physics, Univ. of Seoul
|
Tue, 05 Apr 2005 01:11:33 GMT |
|
 |
Martin Dicko #7 / 25
|
 reverse string - UB
Quote:
>>> A few days back this was posted as a solution for revsering a string >>> #include <stdio.h> >>> char str[] = "ABCDE"; >>> void rev(char *); >>> int main(void) { >>> printf("Before: %s\n", str); >>> rev(str); >>> printf(" After: %s\n", str); >>> return 0; >>> } >>> void rev(char *s) { >>> char *d = s; >>> char c; >>> while (*d) ++d; >>> --d; >>> while (d > s) { >>> c = *s; >>> *s++ = *d; >>> *d-- = c; >>> } >>> } >>> it was pointed out that if *s = '\0' then it is invokes UB.am not able >>> to get how? i was thinking that d > s would be false and the loop >>> would terminate. is it not guaranteed? >>> -sridhar >> If *s == '\0', the while loop will be skipped, thus keeping d pointing >> at the same address as s. --d will then make d point to the previous >> byte, which quite probably does not belong to your program's memory >> space. There's the undefined behaviour. > As long as d isn't dereferenced, I don't see how this is UB.
According to my understanding of section 6.5.6 of the standard, it is undefined behavior even if d is not dereferenced. Specifically: | (8) When an expression that has integer type is added to or | subtracted from a pointer, the result has the type of the pointer | operand. [...] If both the pointer operand and the result point to ^^^^^^^^^^ | elements of the same array object, or one past the last element of | the array object, the evaluation shall not produce an overflow; | otherwise, the behavior is undefined. If the result points one past | the last element of the array object, it shall not be used as the | operand of a unary * operator that is evaluated. Since the result of the expression --d points before the object, I think it invokes UB. Martin
|
Tue, 05 Apr 2005 01:36:47 GMT |
|
 |
Chris Engebretso #8 / 25
|
 reverse string - UB
[ snip ] Quote: >>If *s == '\0', the while loop will be skipped, thus keeping d pointing >>at the same address as s. --d will then make d point to the previous >>byte, which quite probably does not belong to your program's memory >>space. There's the undefined behaviour. > As long as d isn't dereferenced, I don't see how this is UB.
At the risk of grossly oversimplifying things, pointer arithmetic in C is only permitted within array objects. (Regions of memory set aside by *alloc() are considered to be arrays for this purpose, and non-array objects are special cases of "arrays of one element.") With that in mind, you can use pointer arithmetic to walk through an array, and you are also allowed to use said arithmetic to create a pointer that points *one element past* the end of an array (though you are not allowed to dereference such a pointer, for obvious reasons.) With any pointer arithmetic beyond that, all bets are off. This includes cases where you have a pointer that points to one element "before" the start of an array, which is what we are dealing with in this case. Incidentally, this is the reason that the following (ugly) code is not legal C: int a[10]; int *ptr_a = a - 1; /* ptr_a can now be used to treat a as a one-based array */ ptr_a[1] = ... This "clever" trick was commonly-used in early editions of the book "Numerical Recipes in C." This book is a great reference for mathematical algorithms, but not a particularly good reference for the C programming language. Regards, -- Chris Engebretson - Raytheon Systems Company | Ph#: (605)594-6829 USGS EROS Data Center, Sioux Falls, SD 57198 | Fax: (605)594-6940 Software Project Lead -- Landsat 7 Level 1 Sustaining Engineering
|
Tue, 05 Apr 2005 01:26:50 GMT |
|
 |
Joona I Palast #9 / 25
|
 reverse string - UB
Quote: > Incidentally, this is the reason that the following (ugly) code is > not legal C: > int a[10]; > int *ptr_a = a - 1; > /* ptr_a can now be used to treat a as a one-based array */ > ptr_a[1] = ... > This "clever" trick was commonly-used in early editions of the > book "Numerical Recipes in C." This book is a great reference for > mathematical algorithms, but not a particularly good reference for > the C programming language.
We once had a glorious flamewar about that thing... it was called "Different array indexing" or something like that. --
| Kingpriest of "The Flying Lemon Tree" G++ FR FW+ M- #108 D+ ADA N+++| | http://www.helsinki.fi/~palaste W++ B OP+ | \----------------------------------------- Finland rules! ------------/ "And according to Occam's Toothbrush, we only need to optimise the most frequent instructions." - Teemu Kerola
|
Tue, 05 Apr 2005 01:39:39 GMT |
|
 |
Mike Wahle #10 / 25
|
 reverse string - UB
Quote:
> >>> A few days back this was posted as a solution for revsering a string > >>> #include <stdio.h> > >>> char str[] = "ABCDE"; > >>> void rev(char *); > >>> int main(void) { > >>> printf("Before: %s\n", str); > >>> rev(str); > >>> printf(" After: %s\n", str); > >>> return 0; > >>> } > >>> void rev(char *s) { > >>> char *d = s; > >>> char c; > >>> while (*d) ++d; > >>> --d; > >>> while (d > s) { > >>> c = *s; > >>> *s++ = *d; > >>> *d-- = c; > >>> } > >>> } > >>> it was pointed out that if *s = '\0' then it is invokes UB.am not able > >>> to get how? i was thinking that d > s would be false and the loop > >>> would terminate. is it not guaranteed? > >>> -sridhar > >> If *s == '\0', the while loop will be skipped, thus keeping d pointing > >> at the same address as s. --d will then make d point to the previous > >> byte, which quite probably does not belong to your program's memory > >> space. There's the undefined behaviour. > > As long as d isn't dereferenced, I don't see how this is UB. > IHNRTS, but AFAIK even forming a non-null pointer into non-accessible > memory invokes UB. But then, EICBW. > I suppose the real gurus on this newsgroup (Heathfield, Pfaff, Pop, > Torek and a bunch of others) can shed more light on this.
---------------------------------------------------------------- ISO/IEC 9899:1999 (E) 6.5.3.1 Prefix increment and decrement operators Constraints 1 The operand of the prefix increment or decrement operator shall have qualified or unqualified real or pointer type and shall be a modifiable lvalue. Semantics 2 The value of the operand of the prefix ++ operator is incremented. The result is the new value of the operand after incrementation. The expression ++E is equivalent to <<===== (E+=1). See the discussions of additive operators and compound assignment for information on constraints, types, side effects, and conversions and the effects of operations on pointers. 3 The prefix -- operator is analogous to the prefix ++ operator, except that the value of the operand is decremented. Forward references: additive operators (6.5.6), compound assignment (6.5.16.2). 6.5.6 Additive operators 8 When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i-n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the <<====== pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated. 6.5.16.2 Compound assignment Constraints 1 For the operators += and -= only, either the left operand shall be a pointer to an object type and the right shall have integer type, or the left operand shall have qualified or unqualified arithmetic type and the right shall have arithmetic type. 2 For the other operators, each operand shall have arithmetic type consistent with those allowed by the corresponding binary operator. Semantics 3 A compound assignment of the form E1 op =E2 differs from the simple assignment expression E1 = E1 op (E2) only in that the lvalue E1 is evaluated only once. ---------------------------------------------------------------- -Mike
|
Tue, 05 Apr 2005 01:54:08 GMT |
|
 |
Thomas Stege #11 / 25
|
 reverse string - UB
Quote:
> [ snip ] >>> If *s == '\0', the while loop will be skipped, thus keeping d pointing >>> at the same address as s. --d will then make d point to the previous >>> byte, which quite probably does not belong to your program's memory >>> space. There's the undefined behaviour. >> As long as d isn't dereferenced, I don't see how this is UB. > At the risk of grossly oversimplifying things, pointer arithmetic > in C is only permitted within array objects.
Also, if you have a pointer to any object, you can cast this pointer to char* (or maybe you need a char pointer...) and increment it. This does not allow you to treat the object as a char array, but you can walk the array char by char (or byte by byte as the case may be). -- Thomas.
|
Tue, 05 Apr 2005 01:05:30 GMT |
|
 |
Richard Heathfiel #12 / 25
|
 reverse string - UB
Quote:
> A few days back this was posted as a solution for revsering a string > #include <stdio.h> > char str[] = "ABCDE"; > void rev(char *); > int main(void) { > printf("Before: %s\n", str); > rev(str); > printf(" After: %s\n", str); > return 0; > } > void rev(char *s) { > char *d = s; > char c; > while (*d) ++d; > --d; > while (d > s) { > c = *s; > *s++ = *d; > *d-- = c; > } > } > it was pointed out that if *s = '\0' then it is invokes UB.am not able > to get how? i was thinking that d > s would be false and the loop > would terminate. is it not guaranteed?
No, it isn't. If *s is 0, then while(*d) ++d; won't affect d at all, and --d will therefore decrement the pointer, so that it points to the byte *before* the start of s. Okay, enough of the empty string. Let's focus on the behaviour question now, which is: "is the behaviour of a program defined by the Standard if we point a pointer to the place just *before* the start of an object?", and to answer that question, let's pick a string we can see: "hello". Now, s is a pointer to a character, but in practice we can deduce that it's a pointer to the first element of an array of characters. Let's draw that: +------+------+------+------+------+------+------+------+------+ | | | | | | | | | | | ??? | h | e | l | l | o | \0 | ??? | ??? | | | | | | | | | | | +------+------+------+------+------+------+------+------+------+ ^ | s Now, the Standard defines the behaviour if d points at the same place as s, or at any place up to the end of that array, or one past the end: +------+------+------+------+------+------+------+------+------+ | | | | | | | | | | | ??? | h | e | l | l | o | \0 | ??? | ??? | | | | | | | | | | | +------+------+------+------+------+------+------+------+------+ ^ [1] | [2] s <<<---- behaviour defined for d pointing ---->>> to any of these characters. But if d points to the character marked [1], or the character marked [2], the Standard does not define the behaviour of the program. Therefore, the behaviour is undefined. I can't give you a citation because there isn't one, except possibly this extract from the section on Conformance: If a shall or shall not requirement that appears outside of a constraint is violated, the behavior is undefined. Undefined behavior is otherwise indicated in this International Standard by the words undefined behavior or by the omission of any explicit definition of behavior. There is no difference in emphasis among these three; they all describe behavior that is undefined. HTH. --
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999. C FAQ: http://www.eskimo.com/~scs/C-faq/top.html K&R answers, C books, etc: http://users.powernet.co.uk/eton
|
Tue, 05 Apr 2005 01:43:04 GMT |
|
 |
E. Gibbo #13 / 25
|
 reverse string - UB
Quote:
>[ snip ] >>>If *s == '\0', the while loop will be skipped, thus keeping d pointing >>>at the same address as s. --d will then make d point to the previous >>>byte, which quite probably does not belong to your program's memory >>>space. There's the undefined behaviour. >> As long as d isn't dereferenced, I don't see how this is UB. >At the risk of grossly oversimplifying things, pointer arithmetic >in C is only permitted within array objects. (Regions of memory >set aside by *alloc() are considered to be arrays for this purpose, >and non-array objects are special cases of "arrays of one element.")
This brings up an interesting question: is one allowed to produce a pointer "one past" a non-array object? E.g., int x; int * p = &x; ++p; /* allowed? */ (I realize that one can't dereference the pointer, a separate issue.) --Ben --
|
Tue, 05 Apr 2005 02:22:44 GMT |
|
 |
Martin Dicko #14 / 25
|
 reverse string - UB
Quote:
> This brings up an interesting question: is one allowed to produce a > pointer "one past" a non-array object? E.g., > int x; > int * p = &x; > ++p; /* allowed? */ > (I realize that one can't dereference the pointer, a separate issue.)
Yes. Section 6.5.6: | (7) For the purposes of these operators, a pointer to an object that | is not an element of an array behaves the same as a pointer to the | first element of an array of length one with the type of the object | as its element type. Martin
|
Tue, 05 Apr 2005 02:49:58 GMT |
|
 |
Zoran Cutur #15 / 25
|
 reverse string - UB
... Quote: > As long as d isn't dereferenced, I don't see how this is UB.
Thank you all, for clearing this up. I thought that only comparison of pointers was restricted to array bounds but not all pointer arithmetic. Thanx again. --
"LISP is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days." -- Eric S. Raymond
|
Tue, 05 Apr 2005 15:35:28 GMT |
|
|
Page 1 of 2
|
[ 25 post ] |
|
Go to page:
[1]
[2] |
|