Problems with for speed optimized code 

-----BEGIN PGP SIGNED MESSAGE-----

Hi,

  I have a problem with a short piece of code when I choose
 the "Optimize Speed" option when compiling.
 The code is an implementation of the RC4 encryption/decryption
 algorithm. If I do not optimize the program for speed, I get a
 different (correct) output than when I optimize for speed (though
 the speed optimization would be VERY important for this code!).

 Here is the code, and an example of the output for both
 optimized and not optimized versions:

int clsRC4::crypt(char *buffer, unsigned long length)
{
 unsigned long i = 0, j = 0, t, x;

 if (!(buffer && length)) { return ERR_NODATA; }

 for (x = 0; x < length; x++)
 {
  i = (i + 1) % 256;
  j = (j + state[i]) % 256;
  swap8(state[i], state[j]);
  t = (state[i] + state[j]) % 256;
  *((unsigned char *)buffer+x) ^= state[t];
 }

 return 0;
}

If I encrypt this plaintext:     00 00 00 00
with this en/decryption key:     00 00 00 00 00 00 00 00
I get this (correct) ciphertext: DE 18 89 41

If I use speed optimization and encrypt the same plaintext with
the same key as above I get this (different and incorrect)
ciphertext: 0A 0F 11 1A
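
For comparison, here is a minimal self-contained sketch of RC4 that can be built with and without optimization to check the output. The names `Rc4State`, `rc4_init` and `rc4_crypt` are hypothetical and are not taken from the class in this thread. The key schedule is the standard RC4 one, which is an assumption because the post does not show init(). With the all-zero 8-byte key and all-zero plaintext quoted above, the first four output bytes should be DE 18 89 41.

```cpp
#include <cstddef>

// Hypothetical stand-in for the poster's clsRC4; not the original class.
struct Rc4State {
    unsigned char s[256];
    unsigned char i, j;   // 8-bit unsigned char assumed, so these wrap at 256
};

// Standard RC4 key schedule (assumed; the thread does not show init()).
void rc4_init(Rc4State &st, const unsigned char *key, std::size_t keylen)
{
    for (int n = 0; n < 256; ++n)
        st.s[n] = static_cast<unsigned char>(n);
    unsigned char j = 0;
    for (int n = 0; n < 256; ++n) {
        j = static_cast<unsigned char>(j + st.s[n] + key[n % keylen]);
        unsigned char tmp = st.s[n];
        st.s[n] = st.s[j];
        st.s[j] = tmp;
    }
    st.i = 0;
    st.j = 0;
}

// XOR the keystream into buffer, using the same steps as the crypt() loop above.
void rc4_crypt(Rc4State &st, unsigned char *buffer, std::size_t length)
{
    for (std::size_t x = 0; x < length; ++x) {
        st.i = static_cast<unsigned char>(st.i + 1);
        st.j = static_cast<unsigned char>(st.j + st.s[st.i]);
        unsigned char tmp = st.s[st.i];
        st.s[st.i] = st.s[st.j];
        st.s[st.j] = tmp;
        unsigned char t = static_cast<unsigned char>(st.s[st.i] + st.s[st.j]);
        buffer[x] ^= st.s[t];
    }
}
```

Unlike the crypt() shown above, this sketch keeps i and j in the state, so several calls continue the same keystream.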

I'd highly appreciate any help which would help me to solve this problem!

(If anyone would like the full source code for testing or whatever just send
me an email and I'll send you a copy.)

 thanks in advance! -Max

~~~
This PGP signature only certifies the sender and date of the message.
It implies no approval from the administrators of nym.alias.net.
Date: Mon Mar 22 18:28:33 1999 GMT

-----BEGIN PGP SIGNATURE-----
Version: 2.6.2

iQEVAwUBNvaL005NDhYLYPHNAQFlQQf6A+yIH21gHT7k5ECzYM6VFeETGN+oF9HH
NL0fl+YD76vHN+4BG9gyzrs/MjF70z3LxKbrHhQ6vx+HyVva15BWVaMJmk6AG7/H
j4pLyd0rs4ey4+zbG8AedESovJ0GpWwDN0q5091JNvvYgMlE/lAliNdr1HWZ3+Lt
gT7OFPq3Yeqq9Zu6nVaMq0pIMsvC5u2j9ngP/RP4vT/TbUfrrWjfu/hmDt6NboyK
xmy+gDYcLhFJk7A00TybuCZsiCi8R4JGlk6dR5aofXtHmkpGa26XHTTzmo//9MJC
9C3n+7UUfZDGBf5Vv53YfHgaQHubM8i9sgTDNq2OMUj0cbpia/mkZg==
=9Og1
-----END PGP SIGNATURE-----



Fri, 07 Sep 2001 03:00:00 GMT  

Quote:
>I'd highly appreciate any help which would help me to solve this problem!

Try optimizing for SIZE rather than speed. That may seem
counter-intuitive, but the size optimization includes most if not all
of the "safe" speed optimizations anyway. You could then try adding in
the other optimizations by hand to see which of them causes the
problem.
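
One way to narrow this down without changing the project-wide setting, offered as an assumption rather than something tried in this thread: Visual C++ has `#pragma optimize`, which can switch optimization off around a single function. `fill_state` is a hypothetical stand-in for the suspect routine. If the output becomes correct with the pragma in place, the optimizer's handling of that function is the likely culprit.

```cpp
// Hypothetical stand-in for the routine under suspicion.
#ifdef _MSC_VER
#pragma optimize("", off)   // disable all optimizations from here on
#endif
void fill_state(unsigned char *state)
{
    for (int i = 0; i < 256; i++)
        state[i] = static_cast<unsigned char>(i);
}
#ifdef _MSC_VER
#pragma optimize("", on)    // restore the settings given on the command line
#endif
```

The `_MSC_VER` guard only keeps other compilers from complaining about an unknown pragma.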

Bob Moore [MVP]
http://www.mooremvp.freeserve.co.uk



Fri, 07 Sep 2001 03:00:00 GMT  
-----BEGIN PGP SIGNED MESSAGE-----

Quote:
>>I'd highly appreciate any help which would help me to solve this problem!

>Try optimizing for SIZE rather than speed. That may seem
>counter-intuitive, but the size optimization includes most if not all
>of the "safe" speed optimizations anyway. You could then try adding in
>the other optimizations by hand to see which of them causes the
>problem.

I tried optimizing for size, and it works fine. But it's still too slow:
with speed optimization it takes exactly half as long as with size optimization
(with no optimization it takes 4 times as long as with speed optimization).

I took a look at the assembler code of the routine. I'm not sure, but to me
it seems that the compiler simply left out one statement: i = (i + 1) % 256;

If anybody is interested in taking a look, the full source code is available
on a friend's server (outside the USA because of crypto export restrictions):
http://www.ooenet.at/user/spooky/index.html
(Please don't criticize the C++ code; I have just started converting it from C to C++,
and optimizations will be made soon ;-)

 TIA
 //iLLusIOn

~~~
This PGP signature only certifies the sender and date of the message.
It implies no approval from the administrators of nym.alias.net.
Date: Wed Mar 24 00:17:00 1999 GMT

-----BEGIN PGP SIGNATURE-----
Version: 2.6.2

iQEVAwUBNvgvF05NDhYLYPHNAQEhWgf8DexnSFpnNi69DYyntTXlswcSbIwFoL2p
us23fDb1Hq1QsS2kD/J+YW/mJj0nIM4H23yTdEK4RXBSbbF19hq8yooUGG1zYSKT
UzGf5yBY7KsNdI+tFAjTFqWbyVHEz2o406fZ8+wgdKW/GxeqsdvKEWDlD9px/NZP
xc7L54epftvrdGMhb21WYOcBXz2/SF60EvClFbhIafvZtgwaCUs8G3utVtYQwTEp
RP6FK0LN4hv/c4TamfsVsekJJUZ8k5U4cQFYCCEWuhY5fjAy9lhy6um9erd0mpm+
pAUZNQd/wsvJtEYTWBVhEkPHmJJX4JzQSFkOX+pDtvg6VxmsqK09TA==
=ZewL
-----END PGP SIGNATURE-----



Sun, 09 Sep 2001 03:00:00 GMT  
-----BEGIN PGP SIGNED MESSAGE-----

  Thanks for all the answers I got. I have now (just now)
 solved the problem, though it's not a very good solution.
 Here is the (assembler/C) code. Any comments/suggestions?

This is the old version, the one which didn't compile correctly
when optimized for speed. It should fill an array of 256 unsigned
chars with the numbers from 0 to 255. Instead, it fills
it with the numbers 1, 2, 3, ..., 255, 0.

C:
           unsigned char state[256];
           for (i = 0; i < 256; i++)
             state[i] = i;

Assembler:
004014ED   xor         eax, eax                      // i = 0;
004014EF   mov         ecx,dword ptr [ebp]           // ecx = state;
004014F2   inc         eax                           // i++ (TOO EARLY!)
004014F3   cmp         eax,100h                      // i < 256 ?
004014F8   mov         byte ptr [ecx+eax-1],al       // state[i] = i (TOO LATE!)
004014FC   jb          clsRC4::init(0x004014ef)+2Fh  // i < 256 ? yes, next for
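
Read back into C++, the listing above behaves roughly like the following loop. This is a hand translation for illustration (the function name `fill_like_bad_listing` is made up), not compiler output. It shows why the array ends up as 1, 2, ..., 255, 0: the counter is incremented before the store, and the store writes the low byte of the already incremented counter into the previous slot.

```cpp
// Hand translation of the faulty listing; illustrative only.
void fill_like_bad_listing(unsigned char *state)
{
    unsigned int i = 0;                                // xor eax,eax
    do {
        i++;                                           // inc eax (too early)
        state[i - 1] = static_cast<unsigned char>(i);  // mov [ecx+eax-1],al
    } while (i < 0x100);                               // cmp eax,100h / jb
}
```

On the last pass i is 256, whose low byte is 0, which is where the trailing 0 in state[255] comes from.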

And here is the version which compiles correctly, even with
speed optimization. What I did here was add a do-nothing statement,
j = i + j;
Look at the assembler code: it solved the problem of incrementing 'i' too
early. I do not like this method, though. It is short, yet it wastes some
speed and makes the code look very bad.

C:
           unsigned char state[256];
           for (i = 0; i < 256; i++)
           {
               state[i] = i;
               j = i + j;
           }

Assembler:
004014ED   mov         ecx,dword ptr [esp+18h]       // ecx = j
004014F1   xor         eax,eax                       // i = 0
004014F3   mov         edx,dword ptr [ebp]           // edx = state
004014F6   add         ecx,eax                       // j = j + i;
004014F8   mov         byte ptr [edx+eax],al         // state[i] = i; (right, now it stores 'i'
                                                     //  in state[i] before it increments i!)
004014FB   inc         eax                           // i++
004014FC   cmp         eax,100h                      // i < 256?
00401501   jb          clsRC4::init(0x004014f3)+33h  // yes, next for

  Is this a known bug? I have not installed any patches for VC++ 5 (are there any? Where
 exactly could I get them?).
 Maybe someone could try to reproduce this bug in VC++ 6? If it still exists in
 version 6, I'd send a bug report to MS.
 In any case, any ideas on how I could get rid of this j = i + j while keeping the routine
 working with speed optimization?

  thanks!
  //iLLusIOn

~~~
This PGP signature only certifies the sender and date of the message.
It implies no approval from the administrators of nym.alias.net.
Date: Thu Mar 25 19:43:38 1999 GMT

-----BEGIN PGP SIGNATURE-----
Version: 2.6.2

iQEVAwUBNvqR7E5NDhYLYPHNAQGFOwf8D3qK7K99SekVr6uFexVojecJ4B7szsH8
5KJWxN99fdqA9Vre3YXEmNEIJyC65LMCuFOoNKWXRsZ0c9BL9sVvgTSWz/5CBjLp
8Ekq1xyLNHfIf2IsBMtMmSOl1/akqZmXRaNuwJN54HX886KQJnWSfVDiBv+adQEx
aeSJwE6BO/mBgaavNkewAM+TGddhBD5bYv3jRmbMuhkJH6IYCZMUVbtE7mbfq0QQ
xjdonaH90ljpJAQoaF8nj1ar0Iraolkol4/jBDpeowv/C08azecHy8aNXq1nk3F0
Qi6K8gKybrg3SNFETZ5M3YlGPO/OQNDxTJZHONaqfcP/aw2a7tR6Xw==
=BVEa
-----END PGP SIGNATURE-----



Mon, 10 Sep 2001 03:00:00 GMT  
Here's what I got on VC6 SP2, Optimize for speed.

; 5    :  int i;
; 6    :    unsigned char state[256];
; 7    :       for (i = 0; i < 256; i++)

 mov DWORD PTR _i$[ebp], 0
 jmp SHORT $L220
$L221:
 mov eax, DWORD PTR _i$[ebp]
 add eax, 1                    ; this is jumped over the first time
 mov DWORD PTR _i$[ebp], eax
$L220:
 cmp DWORD PTR _i$[ebp], 256   ; 00000100H
 jge SHORT $L222

; 8    :              state[i] = i;

 mov ecx, DWORD PTR _i$[ebp]
 mov dl, BYTE PTR _i$[ebp]
 mov BYTE PTR _state$[ebp+ecx], dl
 jmp SHORT $L221
$L222:

It looks fine to me. I am amazed that VC5 could have such a bug; as far as I know, the
behaviour you expected is correct.
There are (at least) three service packs for VC5. Get them all. I heard VC5 was basically
unusable without them.

int i=0;
unsigned char state[256];
while(i<256)
{
    state[i]=i;
    i++;
}
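
Whichever form of the loop is used, two further things might help; both are assumptions and have not been tested on VC++ 5. The first is a fill with a byte-sized counter in a do/while loop, which gives the optimizer a differently shaped loop to work on. The second is a one-time check of the table after init, so that a miscompiled build is caught immediately. `fill_state_bytewise` and `state_is_identity` are hypothetical names.

```cpp
// Hypothetical helper: fill with a byte-sized counter in a do/while loop.
void fill_state_bytewise(unsigned char *state)
{
    unsigned char i = 0;
    do {
        state[i] = i;
    } while (++i != 0);   // wraps to 0 after 255 (8-bit unsigned char assumed)
}

// Hypothetical helper: returns true if state[n] == n for all 256 entries.
bool state_is_identity(const unsigned char *state)
{
    for (int n = 0; n < 256; n++)
        if (state[n] != static_cast<unsigned char>(n))
            return false;
    return true;
}
```

Calling `state_is_identity` right after the fill, before the key is mixed in, costs 256 comparisons once per init and does not touch the speed of the encryption loop.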

Nick

