ANSI aliasing rules 
 ANSI aliasing rules

I'm trying to determine why two compilers seem to think an int and a float
array are aliased (and therefore addresses of elements must be constantly
reloaded), even when my (perhaps naive) interpretation of ANSI rules says
the arrays cannot be aliased.

In particular, I'd like to understand why:
-------------------------------------------------
float test2d(float ****ff, int ***solid, int ni, int nz, int ny, int nx)
{
 int i,z,y,x;
 float fsum;
 for(fsum=0.0f, i=0; i<ni; i++){
  for(z=0; z<nz; z++){
   for(y=0; y<ny; y++){
    for(x=0; x<nx; x++){
     fsum += ff[i][z][y][x]*solid[z][y][x];
    }
   }
  }
 }
return(fsum);
}

----------------------------------------------
generates an efficient inner loop:
----------------------------------------------
$B1$10:                         ; Preds $B1$9 $B1$10
        fild      DWORD PTR [edi+ebp*4]                         ;19.29
        fmul      DWORD PTR [esi+ebp*4]                         ;19.29
        faddp     st(1), st                                     ;19.6
        fstp      DWORD PTR [esp+80]                            ;19.6
        cvtsi2ss  xmm1, DWORD PTR [edi+ebp*4+4]                 ;19.29
        mulss     xmm1, DWORD PTR [esi+ebp*4+4]                 ;19.29
        addss     xmm1, DWORD PTR [esp+80]                      ;19.6
        cvtsi2ss  xmm0, DWORD PTR [edi+ebp*4+8]                 ;19.29
        mulss     xmm0, DWORD PTR [esi+ebp*4+8]                 ;19.29
        addss     xmm1, xmm0                                    ;19.6
        movss     DWORD PTR [esp+80], xmm1                      ;19.6
        fld       DWORD PTR [esp+80]                            ;19.6
        add       ebp, 3                                        ;18.20
        cmp       ebp, ecx                                      ;18.5
        jle       $B1$10        ; Prob 97%                      ;18.5
----------------------------------------------
(no loads other than floating point loads; pointers to ff are dereferenced
in the outer loop, and the offsets incremented, rather than reloading and
reconstructing the total ff[][][][] address at each x value), whereas:
----------------------------------------------
float test2(float ****ff, int ***solid, int ni, int nz, int ny, int nx)
{

 int i,z,y,x;
 float sum;
 for(sum=0.0f, i=0; i<ni; i++){
  for(z=0; z<nz; z++){
   for(y=0; y<ny; y++){
    for(x=0; x<nx; x++){
     sum += ff[i][z][y][x];
     solid[z][y][x] <<= 1;
    }
   }
  }
 }
return(sum);
}

----------------------------------------------
generates an inner loop with:
----------------------------------------------
$B1$9:                          ; Preds $B1$8 $B1$9
        mov       eax, DWORD PTR [esp+68]                       ;19.13
        mov       eax, DWORD PTR [eax]                          ;19.13
        mov       esi, DWORD PTR [esp+72]                       ;19.13
        mov       eax, DWORD PTR [eax+esi*2]                    ;19.13
        mov       eax, DWORD PTR [eax+ebp*2]                    ;19.13
        fadd      DWORD PTR [eax+edx*4]                         ;19.6
        fstp      DWORD PTR [esp+48]                            ;19.6
        fld       DWORD PTR [esp+48]                            ;19.6

;;;      solid[z][y][x] <<= 1;

        mov       eax, DWORD PTR [ecx+ebx*4]                    ;20.6
        mov       eax, DWORD PTR [eax+ebp*2]                    ;20.6
        mov       esi, DWORD PTR [eax+edx*4]                    ;20.6
        add       esi, esi                                      ;20.6
        mov       DWORD PTR [eax+edx*4], esi                    ;20.6
        inc       edx                                           ;18.20
        cmp       edx, edi                                      ;18.5
        jl        $B1$9         ; Prob 97%                      ;18.5
----------------------------------------------
...that is, the higher-level pointers in ff get reloaded from memory on
every iteration of the inner loop, apparently because solid[][][] is stored
back, even though by ANSI rules the ff and solid arrays cannot be aliased.

The problem arises with both Intel C v5 and Microsoft VC 6, with pure C
code.  Neither compiler mentions a switch to turn off the ANSI rules, so I
naively assumed that ANSI must be the default.  I can overcome the problem
with manual dereferencing, but I'd still like to understand what is going
on.
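
For reference, the manual dereferencing workaround mentioned above might look
roughly like the sketch below. The name test2_hoisted and the local row
pointers frow and srow are illustrative assumptions, not code from the
original program; the idea is only that the row addresses are copied into
locals once per row, so the inner loop no longer needs the pointer tables.

```c
/* Sketch only: same computation as test2, with the row pointers
   hoisted by hand (hypothetical name, not from the original code). */
float test2_hoisted(float ****ff, int ***solid, int ni, int nz, int ny, int nx)
{
    int i, z, y, x;
    float sum = 0.0f;

    for (i = 0; i < ni; i++) {
        for (z = 0; z < nz; z++) {
            for (y = 0; y < ny; y++) {
                /* Fetch the row addresses once per row. They live in
                   locals, so a store through srow cannot modify them. */
                float *frow = ff[i][z][y];
                int *srow = solid[z][y];

                for (x = 0; x < nx; x++) {
                    sum += frow[x];
                    srow[x] <<= 1;
                }
            }
        }
    }
    return sum;
}
```

Whether a given compiler then keeps frow and srow in registers is up to its
register allocator, but it no longer has to assume the store changes them.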
--



Thu, 25 Dec 2003 12:58:37 GMT  
 ANSI aliasing rules
With the Intel compiler, the standard C aliasing rules are not assumed by
default. The -Qansi option invokes the standard C aliasing rules. The -Oa option
invokes Fortran-like aliasing rules, which would be the same as the standard C
rules in this case.  The compiler doesn't even take advantage of restrict
qualifiers unless -Qrestrict is set.  Drop the Q's if you are on Linux.
Those options are mentioned in the [Windows version] Compiler User's Guide,
although not in the command line -help.

Of those options, MSVC6 has only -Oa.  I would think most compilers which
do not have a switch to invoke or disable ANSI aliasing analysis would not
take advantage of it, since so much code violates those rules.  Since Intel
C aims to be MSVC compatible by default, you must specifically turn on
aliasing analysis if you want it.
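
As an illustration of the restrict qualifiers mentioned above, a minimal
sketch follows. It assumes a C99 compiler (or one that accepts the restrict
keyword as an extension), and the helper name sum_and_shift_row is
hypothetical. The inner loop of test2 is moved into a function that receives
one row of each array, with both row pointers declared restrict:

```c
/* Sketch only: inner-row kernel for test2 (hypothetical helper name).
   restrict is the programmer's promise that, inside this function,
   the two rows are not accessed through any other pointer. */
float sum_and_shift_row(const float *restrict frow, int *restrict srow, int nx)
{
    float sum = 0.0f;
    int x;

    for (x = 0; x < nx; x++) {
        sum += frow[x];   /* read-only row of ff */
        srow[x] <<= 1;    /* in-place update of the solid row */
    }
    return sum;
}
```

The caller would pass ff[i][z][y] and solid[z][y] for each row. The promise is
only valid if the rows really do not overlap, and a compiler is free to ignore
the qualifier.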


Quote:
> I'm trying to determine why two compilers seem to think an int and a float
> array are aliased (and therefore addresses of elements must be constantly
> reloaded), even when my (perhaps naive) interpretation of ANSI rules says
> the arrays cannot be aliased.

--



Fri, 26 Dec 2003 05:11:51 GMT  
 ANSI aliasing rules

Quote:

> I'm trying to determine why two compilers seem to think an int and a float
> array are aliased (and therefore addresses of elements must be constantly
> reloaded), even when my (perhaps naive) interpretation of ANSI rules says
> the arrays cannot be aliased.

> [...]
> ....that is, the higher-level pointers in ff get reloaded from memory with
> every iteration of the inner loop, apparently because solid[][][] is stored
> back, even though by ansi rules the ff and solid arrays cannot be aliased.

Might it be concerned that the modification through solid could kill x,
y, z, or i?
--



Sun, 28 Dec 2003 23:25:32 GMT  
 ANSI aliasing rules


Quote:

> > I'm trying to determine why two compilers seem to think an int and a float
> > array are aliased (and therefore addresses of elements must be constantly
> > reloaded), even when my (perhaps naive) interpretation of ANSI rules says
> > the arrays cannot be aliased.
[...]
> Might it be concerned that the modification through solid could kill x,
> y, z, or i?

Local variables on the stack are pretty safe -- they can't be aliased to any
address that we pass from the outside world (unless something is horribly
amiss with the compiler).
--



Mon, 29 Dec 2003 23:58:20 GMT  
 ANSI aliasing rules

Quote:




> > > I'm trying to determine why two compilers seem to think an int and a float
> > > array are aliased (and therefore addresses of elements must be constantly
> > > reloaded), even when my (perhaps naive) interpretation of ANSI rules says
> > > the arrays cannot be aliased.
> [...]
> > Might it be concerned that the modification through solid could kill x,
> > y, z, or i?

> Local variables on the stack are pretty safe -- they can't be aliased to any
> address that we pass from the outside world (unless something is horribly
> amiss with the compiler).

Well, yes, I'll allow that it would be a mistake.

A possibility is that the code generator ran out of registers.

How does it do if you omit the sum accumulation? Maybe two passes, each
with a single assignment, will come out better. Though if that does work,
it still won't shed much light on your original question.
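
A rough sketch of that two-pass idea, with hypothetical function names
sum_pass and shift_pass (not from the original code): the first pass only
reads ff, so there is no store that could force the pointers to be reloaded;
the second pass only updates solid. The shift pass keeps the loop over i so
that solid is shifted the same number of times as in test2.

```c
/* Pass 1 (sketch): read-only sum over ff; no stores through pointers. */
float sum_pass(float ****ff, int ni, int nz, int ny, int nx)
{
    int i, z, y, x;
    float sum = 0.0f;

    for (i = 0; i < ni; i++)
        for (z = 0; z < nz; z++)
            for (y = 0; y < ny; y++)
                for (x = 0; x < nx; x++)
                    sum += ff[i][z][y][x];
    return sum;
}

/* Pass 2 (sketch): shift every element of solid once per value of i,
   matching the total effect of test2 on solid. */
void shift_pass(int ***solid, int ni, int nz, int ny, int nx)
{
    int i, z, y, x;

    for (i = 0; i < ni; i++)
        for (z = 0; z < nz; z++)
            for (y = 0; y < ny; y++)
                for (x = 0; x < nx; x++)
                    solid[z][y][x] <<= 1;
}
```

The cost is a second sweep over memory, so this is an experiment to see what
the code generator does rather than an obvious improvement.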
--



Wed, 31 Dec 2003 04:18:55 GMT  
 ANSI aliasing rules


[...]

Quote:
> > Local variables on the stack are pretty safe -- they can't be aliased to any
> > address that we pass from the outside world (unless something is horribly
> > amiss with the compiler).
> Well, yes, I'll allow that it would be a mistake.

> A possibility is that the code generator ran out of registers.

When it runs out of registers (which happens a lot on the x86),
the generator uses temps on the stack.

Quote:
> How does it do if you omit the sum accumulation? Maybe two passes, each
> with a single assignment, will come out better. Though if that does work,
> it still won't shed much light on your original question.

Any calculation that ends up in a local is OK; anything that writes back
to a (possibly aliased) memory address in the outside (non-stack) world
generates all the extra traffic.
--



Thu, 01 Jan 2004 03:44:07 GMT  
 