Sequence Points & Side Effects 
 Sequence Points & Side Effects

The C FAQ mentions in section 3.1 that the statement "array[x]=x++" does
not work.

Ostensibly, this is because sequence points occur only at certain
places, such as loop control expressions, semicolons, after the
first operand of the conditional operator, and so on.

(1) My question is, why not just make an assign '=' a sequence point
which binds from right to left, so the lvalue can be modified by the
operators in the rvalue?

(2) Or better yet, in this case, it would *seem* straightforward that
the array subscript operator [] has a higher precedence than the simple
assignment operator =. So shouldn't array[x] in this case resolve to a
pointer to array[x], then the next operator would be increment, ++,
(after lvalue is resolved already) and LASTLY the low precedence simple
assign operator = ?   This seems like based on simple precedence it
should be the same as: array[x]=(x+1); x+=1;

I'm sure there is an ambiguous expression that would break in these
cases, if simple assign were deemed a sequence point like I said. Can
somebody think of an example or explain why it couldn't be a sequence
point "atomic modification of rvalue before assign" as in (1). I realize
it would execute differently than (2), but either one seems better than
saying "don't even try".
--



Tue, 02 Dec 2003 00:09:36 GMT  
 Sequence Points & Side Effects


Quote:
>The C FAQ mentions in section 3.1 that the statement "array[x]=x++" does
>not work.

>Ostensibly, this is because sequence points in compilation occur at
>certain places like loop control expressions, semicolons, after the
>first operand of the conditional operator, and so on.

>(1) My question is, why not just make an assign '=' a sequence point
>which binds from right to left, so the lvalue can be modified by the
>operators in the rvalue?

The real question isn't why not, but why do this.  Would there be
a significant improvement in the language by making assignment a
sequence point?

Quote:

>(2) Or better yet, in this case, it would *seem* straightforward that
>the array subscript operator [] has a higher precedence than the simple
>assignment operator =. So shouldn't array[x] in this case resolve to a
>pointer to array[x], then the next operator would be increment, ++,
>(after lvalue is resolved already) and LASTLY the low precedence simple
>assign operator = ?   This seems like based on simple precedence it
>should be the same as: array[x]=(x+1); x+=1;

That's just not how precedence works in C.  The fact that the []
operator has higher precedence than assignment means that [] is
done before assignment; it says nothing about whether the
operands of [] are evaluated before both operands of assignment.

Let's look at a simpler example.  Consider A + B * C where A, B,
and C are expressions.  Precedence requires that the
multiplication will take place before the addition, but says
nothing about the order in which A, B, and C will be evaluated.
It would be correct for them to be evaluated in the order B, A,
C, the result of B and C multiplied and added to the result of A.

Quote:

>I'm sure there is an ambiguous expression that would break in these
>cases, if simple assign were deemed a sequence point like I said. Can
>somebody think of an example or explain why it couldn't be a sequence
>point "atomic modification of rvalue before assign" as in (1). I realize
>it would execute differently than (2), but either one seems better than
>saying "don't even try".

I suspect that specifying the order of evaluation (as do some
other languages like Java) would not be harmful to the language.
But that's not the real question now.  We live with our history
and there is a long history in C of not specifying the order of
evaluation; it would almost certainly be a significant expense
for implementors to change this now and the question is whether
it is worth the expense, or more accurately, can you convince the
standards committee that it is worth the expense.

Certainly a change this large could not be done before the next
standard.  Even if it could be incorporated in C0x (something I
very much doubt), the practical effect would be nil until a large
percentage of implementations supported the new standard.  In
order to get this change into the standard you're going to have
to convince the committee that the change is valuable enough to
work out the details with no real benefit until (optimistically)
around 2012.

This is not impossible -- certainly the members of the committee
did make changes in C99 knowing that we won't see any real hope
of using them portably for years, but it won't be easy.
--
Michael M Rubenstein
--



Tue, 02 Dec 2003 05:39:33 GMT  
 Sequence Points & Side Effects

Quote:

> The C FAQ mentions in section 3.1 that the statement "array[x]=x++" does
> not work.

> Ostensibly, this is because sequence points in compilation occur at
> certain places like loop control expressions, semicolons, after the
> first operand of the conditional operator, and so on.

> (1) My question is, why not just make an assign '=' a sequence point
> which binds from right to left, so the lvalue can be modified by the
> operators in the rvalue?

Why?

Quote:
> (2) Or better yet, in this case, it would *seem* straightforward that
> the array subscript operator [] has a higher precedence than the simple
> assignment operator =. So shouldn't array[x] in this case resolve to a
> pointer to array[x], then the next operator would be increment, ++,
> (after lvalue is resolved already) and LASTLY the low precedence simple
> assign operator = ?   This seems like based on simple precedence it
> should be the same as: array[x]=(x+1); x+=1;

Precedence of operations does not imply precedence of evaluating those
operations' parameters.

For example, in the expression "a + b * c", the multiplication has a
higher precedence than the addition.  However, that does not guarantee
that "b" and "c" will be evaluated before "a".  If the expression were
"foo() + bar() * baz()", there is no guarantee (AFAIK) that bar() and
baz() will be called before foo().
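
To make this concrete, here is a minimal sketch that records the order in which the three functions are actually called. The names note(), call_log, evaluate() and report(), and the return values 1, 2 and 3, are made up for illustration. The value is always 7, but the recorded order is whatever the compiler happened to choose.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical log: each function appends its own name when it is called. */
char call_log[32];

int note(const char *name, int value)
{
    strcat(call_log, name);   /* append "foo ", "bar " or "baz " */
    return value;
}

int foo(void) { return note("foo ", 1); }
int bar(void) { return note("bar ", 2); }
int baz(void) { return note("baz ", 3); }

/* Computes 1 + 2 * 3.  Precedence fixes the grouping, not the call order. */
int evaluate(void)
{
    call_log[0] = '\0';
    return foo() + bar() * baz();
}

/* Prints the value and the call order this particular compiler chose. */
void report(void)
{
    int value = evaluate();
    printf("value = %d, call order = %s\n", value, call_log);
}
```

Calling report() from main() on two different compilers (or optimization levels) may print different call orders, and both are conforming.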

Quote:
> I'm sure there is an ambiguous expression that would break in these
> cases, if simple assign were deemed a sequence point like I said. Can
> somebody think of an example or explain why it couldn't be a sequence
> point "atomic modification of rvalue before assign" as in (1). I realize
> it would execute differently than (2), but either one seems better than
> saying "don't even try".

The real question is: what does that get you?

And it still doesn't help in situations where the side effects are all on
the same side of the "=".

--

+---------+----------------------------------+-----------------------------+

|    J.   |                                  |  herein are not necessarily |
|  Brody  | http://www.bestweb.net/~kenbrody |  those of fP Technologies." |
+---------+----------------------------------+-----------------------------+
GCS (ver 3.12) d- s+++: a C++$(+++) ULAVHSC^++++$ P+>+++ L+(++) E-(---)

    DI+(++++) D---() G e* h---- r+++ y?
--



Tue, 02 Dec 2003 20:15:54 GMT  
 Sequence Points & Side Effects
^^^^

[You sure you really want to post to Usenet from your root account???]

Quote:
> The C FAQ mentions in section 3.1 that the statement "array[x]=x++" does
> not work.

That's not quite what it says. It may work, in your compiler, if you
happen to try on the Tuesday after Easter. What the FAQ says is that
this causes undefined behaviour --- which does include "works as I
thought it would".

Quote:
> (1) My question is, why not just make an assign '=' a sequence point
> which binds from right to left, so the lvalue can be modified by the
> operators in the rvalue?

Because that would introduce a performance bottleneck on certain
hardware platforms, or a restriction to compiler writers, for no good
reason.

As it is, any given C compiler is allowed to treat this situation
however its author sees fit --- choose one evaluation order at random,
issue a diagnostic, crash the compiler or the program --- whatever.
But code making assumptions about such peculiarities of a given compiler
is still causing undefined behaviour, so it's not portable to any
other compiler.  Caveat user.

Quote:
> (2) Or better yet, in this case, it would *seem* straightforward that
> the array subscript operator [] has a higher precedence than the simple
> assignment operator =.

Operator precedence has nothing to do with this. The evaluation
ordering question here is between [] and the ++ side-effect, not
between [] and assignment.
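
As a sketch of the usual way out, the two things a programmer might have meant by "array[x]=x++" can each be written with the side effect in its own statement. The function names below are invented for illustration, and which of the two the original poster intended is an assumption either way.

```c
/* Reading 1: store the old value of *x at the old index, then increment. */
void store_then_increment(int *array, int *x)
{
    array[*x] = *x;
    *x += 1;
}

/* Reading 2: increment first, then store the old value at the new index. */
void increment_then_store(int *array, int *x)
{
    int old = *x;
    *x += 1;
    array[*x] = old;
}
```

Both versions have fully defined behaviour because every statement ends in a sequence point, so the index and the increment can no longer race.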

--

Even if all the snow were burnt, ashes would remain.
--



Thu, 04 Dec 2003 01:01:59 GMT  
 Sequence Points & Side Effects

Quote:

> The real question isn't why not, but why do this.  Would there be
> a significant improvement in the language by making assignment a
> sequence point.

In a more general context: The reason C implementations aren't
required to evaluate everything in strict step by step accordance
with the syntax is that significantly faster code can be generated
when "side effects" are deferred.  This is particularly true when
there are several side effects that modify the same object's value,
or when a side effect would change a value that has already been
cached into a fast register.  However, without reasonable controls
over the degree of optimization, it would be impossible to write
reliable programs.  C's "sequence point" rules are part of the
treaty between the implementation and the programmer; because all
side effects are guaranteed to be enforced at each sequence point
(but not before), the programmer can be sure that many constructs
will have well-defined behavior, while there is still enough
latitude for the compiler to produce efficient generated code.
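
A small sketch of constructs that are well defined precisely because the language puts a sequence point in the middle of them: the comma operator, &&, and ?:. The function names are invented for illustration.

```c
/* The comma operator has a sequence point after its left operand. */
int comma_demo(void)
{
    int i = 1;
    return (i++, i * 10);            /* i is already 2 here: yields 20 */
}

/* && has a sequence point after its left operand. */
int logical_and_demo(void)
{
    int i = 0;
    return (i++ == 0) && (i == 1);   /* yields 1 */
}

/* ?: has a sequence point after its first operand. */
int conditional_demo(void)
{
    int i = 5;
    return (i++ > 4) ? i : -1;       /* yields 6 */
}
```

In each case the increment is guaranteed to be complete before the second read of i, which is exactly the guarantee that plain = does not give.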

Quote:
> That's just not how precedence works in C.  The fact that the []
> operator has higher precedence than assignment means that [] is
> done before assignment; it says nothing about whether the
> operands of [] are evaluated before both operands of assignment.

"Before" has several meanings, two different ones being used here.
The *parsing* of the expression is such that the [] operation is
deeper in the parse tree than the = operation.  But no particular
time sequencing is implied for order of evaluation (as you said).

Quote:
> I suspect that specifying the order of evaluation (as do some
> other languages like Java) would not be harmful to the language.
> But that's not the real question now.  We live with our history
> and there is a long history in C of not specifying the order of
> evaluation; it would almost certainly be a significant expense
> for implementors to change this now and the question is whether
> it is worth the expense, or more accurately, can you convince the
> standards committee that it is worth the expense.

One reason for resistance is that in order to attain current
levels of code efficiency, optimizers would have to become even
smarter if such rules were imposed.  Since the programmer can
always obtain the same effect by exploiting explicit sequence
points, this change is a hard sell.
        a[i++] = i++;   // ambiguous (undefined behavior)
can be rewritten as
        a[i] = i;
        i += 2;
or
        a[i + 1] = i + 1;
        i += 2;
or whatever else the programmer intended, which would be clearer
and less likely to contain a bug.
--



Thu, 04 Dec 2003 01:02:15 GMT  
 Sequence Points & Side Effects


Quote:
>Precedence of operations does not imply precedence of evaluating those
>operations' parameters.

>For example, in the expression "a + b * c", the multiplication has a
>higher precedence than the addition.  However, that does not guarantee
>that "b" and "c" will be evaluated before "a".  If the expression were
>"foo() + bar() * baz()", there is no guarantee (AFAIK) that bar() and
>baz() will be called before foo().

You're completely correct.  And to hit this home, given:

int i = 99;

int foo() { return ++i; }
int baz() { return ++i; }
int bar() { return ++i; }

then with:

    ... foo() + baz() * bar() ...

you don't know what values are going to be used.
People may want to think it's 100 + 101 * 102,
but it need not be.
--
Greg Comeau                 Countdown to "export": December 20, 2001
Comeau C/C++ ONLINE ==>     http://www.comeaucomputing.com/tryitout
NEW: Try out libcomo!       NEW: Try out our C99 mode!

--



Thu, 04 Dec 2003 01:02:52 GMT  
 Sequence Points & Side Effects

Quote:

> For example, in the expression "a + b * c", the multiplication has a
> higher precedence than the addition.  However, that does not guarantee
> that "b" and "c" will be evaluated before "a".  If the expression were
> "foo() + bar() * baz()", there is no guarantee (AFAIK) that bar() and
> baz() will be called before foo().

Thanks (both of you who responded) for the insight. I would have thought since
* had higher precedence it would consume two arguments, the one on the left
being the + operator with foo() and bar() as its leaves. I guess what you are
saying is that the order in which two operands of a binary operator are
evaluated is not known, so if both functions modify the same object the order
in which this occurs is not defined. Recursive descent though you would think
would follow the left-hand nodes in an expression tree.

I guess I wasn't suggesting that = become a sequence point in the future, as
much as wondering why it never was when they were throwing the list together.
If it is so trivial, why does it get special mention as a possible source of
bugs in the FAQ and almost every single tutorial on C?

There's just this mental habit of thinking of assignment almost as two
operations: evaluate the rvalue, then assign to the lvalue.
--



Thu, 04 Dec 2003 01:06:40 GMT  
 Sequence Points & Side Effects
#include <stdio.h>

void* P[5];                /* create a stack */
void** p = &P[4];          /* and stack pointer */
#define PUSH(V) *--p = (V) /* the push operation */

int main ()
{
    P[4] = (void*) 1;    /* init one value */
    P[3] = (void*) 2;    /* and a second value */
    /* do something unimportant, p still at P[4] */
    PUSH (*p);           /* DUP the top of stack */
    printf ("P[4] = %p ; P[3] = %p  \n", P[4], P[3]);
    return 0;

}

result:
   P[4] = 0x1 ; P[3] = 0x2  

the PUSH expanded to:
  *--p = *p;
and the decrement is done before the value was fetched on the rhs.
 (with `gcc -O0` ...)
 804841a:       83 05 e8 94 04 08 fc    addl   -4,p          ; dec p
 8048421:       a1 e8 94 04 08          mov    p,%eax        ; lhs p
 8048426:       8b 15 e8 94 04 08       mov    p,%edx        ; rhs p
 804842c:       8b 0a                   mov    (%edx),%ecx   ; load
 804842e:       89 08                   mov    %ecx,(%eax)   ; store

This is completely non-obvious, and I am sure that some parts of
one of my biggest applications contain this very bug. (It's
a Forth interpreter in C, implementing a virtual machine based
on a stack philosophy (Java is not unlike it), and many
things are done with macros to speed things up and let the compiler
see the variables involved, so the cc can do some optimization,
including with aliases of memory positions.)

counter-example:

#include <stdio.h>

void* P[5];                /* create a stack */
void** p = &P[4];          /* and stack pointer */
static inline void* PUSH(void* V) { return *--p = (V); }

int main ()
{
    P[4] = (void*) 1;    /* init one value */
    P[3] = (void*) 2;    /* and a second value */
    /* do something unimportant, p still at P[4] */
    PUSH (*p);           /* DUP the top of stack */
    printf ("P[4] = %p ; P[3] = %p  \n", P[4], P[3]);
    return 0;

}

result:
  P[4] = 0x1 ; P[3] = 0x1  

the PUSH expanded to:
  V=*p, *--p = V;
and the decrement is done *after* the value was fetched at arg-building.
 (with `gcc -O1` ...)
 804841a:       a1 f8 94 04 08          mov    p,%eax
 804841f:       8b 10                   mov    (%eax),%edx
 8048421:       83 05 f8 94 04 08 fc    addl   -4,p
 8048428:       a1 f8 94 04 08          mov    p,%eax
 804842d:       89 10                   mov    %edx,(%eax)

More interestingly, there is no way to express the correct code
series in a C environment without "inline"-functionality. You
cannot make a local temporary and return a value, as the former needs
a declaration-block, and blocks have no value in plain C.

Even more, I did not find an option to make the compiler warn about
the unknown side-effect - does any compiler support that?



-- guido                         Edel sei der Mensch, hilfreich und gut

--



Fri, 05 Dec 2003 00:14:17 GMT  
 Sequence Points & Side Effects

Quote:

> > For example, in the expression "a + b * c", the multiplication has a
> > higher precedence than the addition.  However, that does not guarantee
> > that "b" and "c" will be evaluated before "a".  If the expression were
> > "foo() + bar() * baz()", there is no guarantee (AFAIK) that bar() and
> > baz() will be called before foo().

> Thanks (both of you who responded) for the insight. I would have thought since
> * had higher precedence it would consume two arguments, the one on the left
> being the + operator with foo() and bar() as its leaves. I guess what you are
> saying is that the order in which two operands of a binary operator are
> evaluated is not known, so if both functions modify the same object the order
> in which this occurs is not defined. Recursive descent though you would think
> would follow the left-hand nodes in an expression tree.

Well, the multiplication will be performed before the addition, but the
values of the three arguments can be evaluated in any order.  Consider an
RPN implementation that results in:

    PUSH foo()
    PUSH bar()
    PUSH baz()
    MULT
    ADD

which is certainly different than

    temp = bar() * baz();
    result = foo() + temp;
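
To show that the two orders really can give different answers, here is a sketch with every call placed in its own statement, so each order is forced rather than left to the compiler. The ++i bodies are borrowed from the foo/baz/bar example elsewhere in the thread. The two wrapper function names are invented for illustration.

```c
int i = 99;   /* shared counter modified by all three functions */

int foo(void) { return ++i; }
int bar(void) { return ++i; }
int baz(void) { return ++i; }

/* Operands fetched left to right, as in the PUSH/PUSH/PUSH/MULT/ADD listing. */
int operands_left_to_right(void)
{
    int stack[3];
    i = 99;
    stack[0] = foo();                 /* 100 */
    stack[1] = bar();                 /* 101 */
    stack[2] = baz();                 /* 102 */
    stack[1] = stack[1] * stack[2];   /* MULT */
    return stack[0] + stack[1];       /* ADD: 100 + 101 * 102 = 10402 */
}

/* Product computed first, foo() called last. */
int product_first(void)
{
    int b, c, temp;
    i = 99;
    b = bar();                        /* 100 */
    c = baz();                        /* 101 */
    temp = b * c;
    return foo() + temp;              /* 102 + 100 * 101 = 10202 */
}
```

In both versions the multiplication happens before the addition, yet the results differ. A compiler evaluating "foo() + bar() * baz()" is free to behave like either one, or like some other ordering of the three calls.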

--

+---------+----------------------------------+-----------------------------+

|    J.   |                                  |  herein are not necessarily |
|  Brody  | http://www.bestweb.net/~kenbrody |  those of fP Technologies." |
+---------+----------------------------------+-----------------------------+
GCS (ver 3.12) d- s+++: a C++$(+++) ULAVHSC^++++$ P+>+++ L+(++) E-(---)

    DI+(++++) D---() G e* h---- r+++ y?
--



Fri, 05 Dec 2003 00:14:38 GMT  
 Sequence Points & Side Effects

Quote:

> #define PUSH(V) *--p = (V) /* the push operation */
>     PUSH (*p);           /* DUP the top of stack */
>     printf ("P[4] = %p ; P[3] = %p  \n", P[4], P[3]);
> More interestingly, there is no way to express the correct code
> series in a C environment without "inline"-functionality.

Sure there is -- use a comma operator to interpose a s.p.
--



Fri, 05 Dec 2003 22:43:08 GMT  
 Sequence Points & Side Effects

Quote:


> > #define PUSH(V) *--p = (V) /* the push operation */
> >     PUSH (*p);           /* DUP the top of stack */
> >     printf ("P[4] = %p ; P[3] = %p  \n", P[4], P[3]);
> > More interestingly, there is no way to express the correct code
> > series in a C environment without "inline"-functionality.

> Sure there is -- use a comma operator to interpose a s.p.

*grin*, okay Doug, show me...

specification: rhs: *sp, lhs: --sp, *sp,
  where rhs-fetch-deref must be before decrease which must be
  before lhs-store-deref. Express without '{}' so that the
  assigned value is returned, just as it is in the inline-func
  version. Your turn, Doug, show me... *grin*

-- guido                                  http://pfe.sf.net

--



Sat, 06 Dec 2003 09:16:00 GMT  
 Sequence Points & Side Effects

Quote:

> (1) My question is, why not just make an assign '=' a sequence point
> which binds from right to left, so the lvalue can be modified by the
> operators in the rvalue?

The longer I think about it... well, what about a ",=" operator,
as just another binary-operator-with-assignment *grin* it would
be well/known/obvious to the reader, and it just needs to get
another token during parse-stage of the C compiler *blink* and
there's no chance that this syntax is already in use anywhere
else, I guess *raisebrow*

oh, btw, there's a nice added-value here, as the value is the rhs
which is possibly an lvalue, *yeeahoooo*, what tricks we can play
with that *laughter*

int a = 1, b = 2, c = 3;
a ,= b ,= c;
... will result in a = 2, b = 3, c = 3;

cheers, guido *noiamnotseriousbutitlooksdamncool*
--



Sat, 06 Dec 2003 09:16:19 GMT  
 Sequence Points & Side Effects


Quote:


>> > #define PUSH(V) *--p = (V) /* the push operation */
>> >     PUSH (*p);           /* DUP the top of stack */
>> >     printf ("P[4] = %p ; P[3] = %p  \n", P[4], P[3]);
>> > More interestingly, there is no way to express the correct code
>> > series in a C environment without "inline"-functionality.

>> Sure there is -- use a comma operator to interpose a s.p.

>*grin*, okay Doug, show me...

>specification: rhs: *sp, lhs: --sp, *sp,
>  where rhs-fetch-deref must be before decrease which must be
>  before lhs-store-deref. Express without '{}' so that the
>  assigned value is returned, just as it is in the inline-func
>  version. Your turn, Doug, show me... *grin*

Well, I'm not Doug, and your "specification" confuses me more than
the original code, but I think this definition of PUSH should do the
trick:

#define PUSH(V) ((*(p-1) = (V)), p-- )

Unfortunately even this version will invoke undefined behavior if called
        PUSH(++p);

-andy
--



Mon, 08 Dec 2003 00:07:08 GMT  
 Sequence Points & Side Effects

Quote:



> > > #define PUSH(V) *--p = (V) /* the push operation */
> > >     PUSH (*p);           /* DUP the top of stack */
> > >     printf ("P[4] = %p ; P[3] = %p  \n", P[4], P[3]);
> > > More interestingly, there is no way to express the correct code
> > > series in a C environment without "inline"-functionality.
> > Sure there is -- use a comma operator to interpose a s.p.
> *grin*, okay Doug, show me...

whatever_t t;
#define PUSH(v) (t = (v), *--p = t)

Actually, however, it would be better to encapsulate t:

#define PUSH(v) do { whatever_t t = (v); *--p = t; } while (0)

although that loses the property of returning a value.
I don't much like the interface anyway, since it in effect
treats p as a global.

Quote:
> ... Express without '{}' so that the assigned value is returned,
> just as it is in the inline-func version.

See my first example.

Actually your macro definition didn't do that, quite apart
from issues of sequence points; it should have been enclosed
in parentheses.
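
For completeness, a sketch of the comma-operator version applied to the DUP case from earlier in the thread. The names stack, sp, push_tmp, reset_stack() and dup_top() are invented for illustration, and void * stands in for whatever_t. The initial values 1 and 2 mirror the test program posted above.

```c
void *stack[5];            /* hypothetical stack, grows downward     */
void **sp = &stack[4];     /* points at the current top of stack     */
void *push_tmp;            /* file-scope temporary used by PUSH      */

/* Fetch the value first; the comma supplies a sequence point before --sp.
   The whole expansion is parenthesized and yields the pushed value.      */
#define PUSH(v) (push_tmp = (v), *--sp = push_tmp)

/* Put the stack into the state used by the test program in the thread. */
void reset_stack(void)
{
    stack[4] = (void *)1;
    stack[3] = (void *)2;
    sp = &stack[4];
}

/* DUP: push a copy of the current top of stack and return the pushed value. */
void *dup_top(void)
{
    return PUSH(*sp);
}
```

After reset_stack(), dup_top() leaves stack[3] equal to stack[4], which is the result the inline-function version gave. The price is the shared temporary, which makes the macro unsafe to nest or to use from more than one thread.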
--



Mon, 08 Dec 2003 00:08:05 GMT  
 Sequence Points & Side Effects

Quote:

> #define PUSH(V) ((*(p-1) = (V)), p-- )

> Unfortunately even this version will invoke undefined behavior if called
>         PUSH(++p);

hmmm, interesting example, even though it does not return the value
but the stackpointer... ;-) ... however, the actual machine code
does not get the idea to combine the *(p-1) and the p--, thereby
omitting the superfluous x-1 operation. So the code is not the
same as the inline code... but a good try :-)))

to complete the thing, along with Doug's example, I seem to have
to surrender and use the old do-while trick in here, thereby
making the example behave as if it were a void inline function. In
the actual code that needs the push-operation, it is an
approximation that is good enough. Or maybe I'll go on to
require a compiler that understands inline-funcs, but I have to
think a while about that given the user base of the project and
all their cross-cc-to-embedded-target compilers.

cheers,
-- guido                         Edel sei der Mensch, hilfreich und gut

--



Mon, 08 Dec 2003 23:18:08 GMT  
 