Comma operator in #define's (was Re: Usage of comma operator) 

        Hello, netters!

Some time ago I saw in a posting:

Quote:
># define foobar(a)     (foo(a), bar(a))

and many flames about possible misuse: foobar(i++), etc.

Well, consider the following C code fragment:

static int __;
#define predicat(a)     (.......)
#define foobar(a)       (__ = a, predicat(__) ? foo(__) : bar(__))

and then:

        if (foobar(a) == foobar(b)) .....

Do you know how this works under some old C compilers (PCC based ones)?
Like this:

        __ = a; __ = b;
        if ((predicat(__) ? foo(__) : bar(__)) ==
                (predicat(__) ? foo(__) : bar(__))) ....

GCC et al. produce correct code, but I don't know whether this (correct)
behavior is standardized.

        The X/OPEN Portability Guide says (2.6.15):

"A pair of expressions separated by a comma is evaluated left to
right, and the value of the left expression is discarded."  That is
all it says about evaluation order!

        Is it legal to evaluate other expressions in the "middle" of
comma operator evaluation?

        May I expect the correct results when using comma
operator in such a way?

        Regards,

--
Leonid A. Broukhis | 89-1-95 Liberty St. | "BROUKHIS" is Hebrew for
+7 095 494 6241 (h)| Moscow 123481 USSR  |       "BENEDICTAE"



Tue, 15 Mar 1994 22:26:01 GMT  

Quote:
>static int __;
>#define predicat(a)     (.......)
>#define foobar(a)       (__ = a, predicat(__) ? foo(__) : bar(__))
>    if (foobar(a) == foobar(b)) .....

>[some compilers generate code like this:]

>    __ = a; __ = b;
>    if ((predicat(__) ? foo(__) : bar(__)) ==
>            (predicat(__) ? foo(__) : bar(__))) ....

>GCC et al. produce correct code, but I didn't know if this (correct)
>behavior was standardized.

>    Is it legal to evaluate other expressions in the "middle" of
>comma operator evaluation?

First off, this really has nothing to do with the comma expression.
The issue is evaluation order with respect to the == operator.  Examples
that don't use comma expressions are at the end.

From Harbison & Steele's 'C: A Reference Manual, 2nd edition' (a
book written in a style appropriate for use by a compiler writer):

Section 7.12, page 189, 'Order of evaluation':

Quote:
> In general, the compiler is free to rearrange the order in
> which an expression is evaluated with the following restrictions.
...
> When evaluating the actual arguments of a function call, the order
> in which the arguments are evaluated is not specified; but the
> program must behave as if it chose one argument, evaluated it
> fully, then chose another argument, evaluated it fully, and so
> on, until all arguments were evaluated. [...] their computations
> may not appear to be interleaved.

Note that 'appear' basically means the compiler may evaluate in
any order it likes, but in terms of side-effects the result must come
out as if each argument had been evaluated fully, one after another.

Quote:
> A similar restriction holds for binary operators: [must behave as if
> one operand was fully evaluated before the other was started].
>     The original description of C specified that subexpressions
> may be evaluated in any order [...]  The matter of interleaving was
> not discussed [...]  We advise implementors to adhere rigidly to
> the restrictions outlined here (which actually are quite sensible
> and not terribly restrictive).

Harbison and Steele's sample expression is:

        char **p;
        if (!strcmp(p++,p++)) ...

which they argue must have the same result as:

        p+=2;
        if (!strcmp(p-1,p-2)) ...
or
        p+=2;
        if (!strcmp(p-2,p-1)) ...

Because we're comparing the strings for equality, this "works" either
way.  (They do not actually recommend this style of programming.)

An alternative is to use arguments with other sorts of side
effects, for example function side effects.  We can in fact
have a simple function that tells us what order things are being
evaluated in; this function takes two values, one to print (to show
where we are in the expression tree), and one to return as its value.
Each call to the function then has two arguments, each of which can
itself be a sub-expression.

Here's the expression tree I'm using.  In each node 'L s R', the value on
the left (L) is what the node prints when it finishes evaluating that
sub-expression, and the value on the right (R) is the value it returns.

                                == 1

                        /                   \
                1 s 0                           2 s 0
             /         \                      /        \
        10 s 1          11 s 0          20 s 2          21 s 0
                                        /     \
                                  200 s 20  201 s 2

Note that each 's' expression prints the value returned by its
left child, and returns the value returned by its right child.
This was the constraint used to number the above graph, and
to guide construction of the program below.

Note that the definition of C doesn't even say you have to fully evaluate
elements further down the tree.  Here, we've encapsulated the side-effects
in function calls so the compiler can't move them around, so all compilers
must obey the constraint that 10 and 11 are evaluated before 1.  If we were
using increment-type side-effects, under the original definition of
C not even this constraint would need to be met [e.g., !strcmp(p++,p++)
could be evaluated as !strcmp(p,p),p+=2; now imagine p to be a global
variable and replace strcmp with a user-defined function].

The additional Harbison & Steele constraint basically says you're not
allowed to evaluate (side-effects-wise) across the tree; I'm not sure
but it looks like you're constrained to a post-order traversal with
random (free) choice between left and right children.

The program (pre-squished for ease of follow-up):

#include <stdio.h>
s(x,y){printf("%d ",x);return y;}
main(){s(s(10,1),s(11,0))==s(s(s(200,20),s(201,2)),s(21,0));}

Sun OS cc:      200 201 20 21 2 10 11 1

Sun OS cc -O:   10 11 1 200 201 20 21 2

gcc:            10 11 1 200 201 20 21 2

gcc -O:         10 11 1 200 201 20 21 2

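Notice that in every run the labels from the '2' subtree (200, 201, 20,
21, 2) are printed un-interleaved with those from the '1' subtree (10, 11,
1), and that '21' is not interleaved with the '20' group (200, 201, 20).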

It would be better to make a much much much more complicated expression,
complicated enough to cause serious register complications, instead of
using such a simple expression.  Furthermore, this doesn't isolate
compiler-mobile side-effects; I'll write a follow-up one that does,
and see what happens.

People may follow this up and talk about sequence points,
but as far as I know sequence points are an ANSI C notion,
and this constraint is something that non-ANSI C compilers
should probably meet.




Sun, 20 Mar 1994 11:42:38 GMT  
Quote:


>>static int __;
>>#define predicat(a)     (.......)
>>#define foobar(a)       (__ = a, predicat(__) ? foo(__) : bar(__))
>>        if (foobar(a) == foobar(b)) .....

>>[some compilers generate code like this:]

>>        __ = a; __ = b;
>>        if ((predicat(__) ? foo(__) : bar(__)) ==
>>                (predicat(__) ? foo(__) : bar(__))) ....

>>GCC et al. produce correct code, but I didn't know if this (correct)
>>behavior was standardized.

>>        Is it legal to evaluate other expressions in the "middle" of
>>comma operator evaluation?
> [ A long discussion of order of evaluation in C which:

    1. is intended to apply only to pre-ANSI C
    2. quotes passages from Harbison & Steele's 'C: A Reference Manual,
       2nd edition' which were apparently intended as a suggestion to
       compiler implementers, not an interpretation of the language.
    3. uses results from actual machines, which are never useful for
       answering what should be done.  (How does one even know if the
       machine always prints the same output?)]

To answer the original question: some might argue otherwise, but the sensible
interpretation is that A , B means A must be completely evaluated before B,
and no more than that.  In particular, (A,B) == (C,D) may be evaluated so that
the evaluation of A mixes with that of C or D, and that of C with that of A
or B.
Also, there is a very important ANSI rule which states that if a location is
written twice, or read and independently written (i.e. a += a is okay), without
a sequence point intervening, then the expression is undefined.  In the above,
the same variable (location) may appear in A and B, or in C and D, but not on
both sides of the '=='.
(In K&R I there is an implication that C operators represent atomic operations,
with side-effect operators representing more than one.  Evaluation of
subexpressions can be interleaved, yielding a set of permuted orders of
evaluation, each of which may be defined, depending on what it actually does.)

Jon.



Mon, 21 Mar 1994 02:27:03 GMT  
 