--- Instruction speed --- 
 --- Instruction speed ---

I'm new to the group, what was the fastest way to clear a register?

Quote:

>A while ago, there was a discussion as to which was the fastest way to clear a
>register: SLR, SR, or XR.



Sun, 03 Jun 2001 03:00:00 GMT  
 --- Instruction speed ---
A while ago, there was a discussion as to which was the fastest way to clear a
register: SLR, SR, or XR.  This time I'd like some advice on compares.  I want to
compare a byte against the bytes in a table and find the identical one.  The bytes
to be tested are not contiguous, so I can't use a TRT to find the byte in question.
So, which is faster: CLC, CLM, or IC followed by CR?  Assume R3 points to the byte
to be tested, and R2 points to a table of addresses, each of which might point to
the matching byte:

LOOP     DS    0H
         ICM   R1,15,0(R2)
         BZ    NOTFOUND         an address of zero indicates end-of-table
         CLC   0(1,R1),0(R3)
         BE    FOUND
         LA    R2,4(R2)
         B     LOOP

Or would it be faster to pre-load R15 before the label LOOP:
         SR    R15,R15
         IC    R15,0(R3)
and then, instead of the 1-byte CLC, use:
         CLM   R15,1,0(R1)

Or would it be faster to clear R0 and do the compare this way:
         IC    R0,0(R1)
         CR    R0,R15

What if the byte count is two or three (instead of one)?  Does that change
anything?

And while we're on the discussion of speed, is the ICM of a full word faster
than a L(oad) and then an LTR?



Mon, 04 Jun 2001 03:00:00 GMT  
 --- Instruction speed ---

Quote:
>I'm new to the group, what was the fastest way to clear a register?

On modern S/390s it doesn't matter.

--
No electrons were injured in the preparation of this message.



Mon, 04 Jun 2001 03:00:00 GMT  
 --- Instruction speed ---

[snip]

Quote:

>Or would it be faster to pre-load R15 before the label LOOP
>         SR    R15,R15
>         IC     R15,0(R3)
>and then instead of the CLC of 1 byte:
>         CLM   R15,1,0(R1)

You don't need to pre-clear R15 in this usage: CLM with a mask of 1 compares
only the low-order byte of R15, which is the byte the IC sets.  Also, on some
early low-end S/360 models,
          IC    R15,0(,R3)
ran faster than
          IC    R15,0(R3)
I doubt this is true for anything faster than a P390 nowadays.

Quote:

>And while we're on the discussion of speed, is the ICM of a full word faster
>than a L(oad) and then an LTR?

On current systems, ICM and L/LTR are probably the same, or the ICM
is faster.

As for the field size question: if you load 2, 3 or 4 bytes into a register,
then

         CLM   R15,B'bbbb',0(R1)

is almost certainly faster than

         CLC   0(n,R1),0(R3)

For anything longer than that, you are probably better off staying with the CLC.
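
As a rough sketch of the multi-byte case (not from the original posts): for a
2-byte field, the bytes to be found can be loaded once with ICM and then
compared with CLM inside the loop.  The mask B'0011' and the register choices
are assumptions for illustration; the loop otherwise follows the
zero-terminated address table from the question.

```hlasm
         ICM   R15,B'0011',0(R3)   load 2 bytes to find, once
LOOP     ICM   R1,15,0(R2)         next address from the table
         BZ    NOTFOUND            zero address = end-of-table
         CLM   R15,B'0011',0(R1)   R15 low 2 bytes vs 0(2,R1)
         BE    FOUND
         LA    R2,4(,R2)           step to the next table entry
         B     LOOP
```

For 3 bytes the same pattern would use B'0111' in both the ICM and the CLM.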

If the table is large, and the compares frequent, you might look into
pre-sorting the table and doing a binary search, rather than the
sequential lookup you are doing.

For the single byte test, if the table is stable, then you might use the
table to pre-load a TRT data area, and test the byte using TRT.
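
A minimal sketch of that TRT idea, assuming the table really is stable and
that only a found/not-found answer is needed (it does not say which table
entry matched).  TRTTAB and the labels BUILD and BUILT are hypothetical names;
R2 and R3 are used as in the original question.

```hlasm
* One-time setup: mark every byte value the address table points to.
         XC    TRTTAB,TRTTAB       all 256 entries = 0 (not present)
         LR    R15,R2              R15 -> first table entry
BUILD    ICM   R1,15,0(R15)        next address; zero ends the table
         BZ    BUILT
         SR    R14,R14
         IC    R14,0(,R1)          R14 = byte value at that address
         LA    R14,TRTTAB(R14)     R14 -> TRT entry for that value
         MVI   0(R14),X'FF'        nonzero = value is present
         LA    R15,4(,R15)         step to the next table entry
         B     BUILD
BUILT    DS    0H
* Each lookup: a single TRT replaces the search loop.
* Note that TRT alters R1 and the low-order byte of R2 on a hit.
         TRT   0(1,R3),TRTTAB      test the byte that R3 points to
         BZ    NOTFOUND            CC=0: function byte was zero
         B     FOUND
*
TRTTAB   DS    XL256               hypothetical TRT table
```

Because a successful TRT changes R1 and the low-order byte of R2, the table
pointer in R2 would have to be saved first if it is still needed afterwards.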

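Finally, if you know where the address table ends, you don't have to depend
on 0s to mark the end of the table; BXLE can control the loop:

         LR    R15,R2              R15 -> current table entry
         LA    R0,4                BXLE increment: entry length
         L     R1,addr-of-last-address   BXLE comparand (last entry)
LOOP     L     R14,0(,R15)         R14 = address held in this entry
         CLC   0(1,R14),0(R3)      compare 1 byte with tested byte
         BE    FOUND
         BXLE  R15,R0,LOOP         next entry; loop while R15 <= R1
         B     NOTFOUND  (or put the NOTFOUND code here)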

-- Steve Myers

The E-mail addresses in this message are private property.  Any use of them
to send unsolicited E-mail messages of a commercial nature will be
considered trespassing, and the originator of the message will be sued in
small claims court in Camden County, New Jersey, for the maximum penalty
allowed by law.



Mon, 04 Jun 2001 03:00:00 GMT  
 --- Instruction speed ---
You mean I have been wasting time plugging in that comma in
LA   r1,1(,r1)

all these years?

Quote:
>  Also, on some early
>model low end S/360s,
>          IC    R15,0(,R3)
>worked faster than
>          IC    R15,0(R3)
>I doubt this is true for anything faster than a P390 nowadays.



Mon, 04 Jun 2001 03:00:00 GMT  
 --- Instruction speed ---


Quote:
>You mean I have been wasting time plugging in that comma in
>LA   r1,1(,r1)

>all these years?

>>  Also, on some early
>>model low end S/360s,
>>          IC    R15,0(,R3)
>>worked faster than
>>          IC    R15,0(R3)
>>I doubt this is true for anything faster than a P390 nowadays.

Well, no. It just doesn't make any difference, but there are other good
reasons for using the "correct" base/index register notation.


Mon, 04 Jun 2001 03:00:00 GMT  
 --- Instruction speed ---
How long does it take to find a " L Rx,0(Ry)" problem in AR mode in
somebody else's code?

Bob

Quote:

<snip>

> Well, no. It just doesn't make any difference, but there are other good
> reasons for using the "correct" base/index register notation.



Mon, 04 Jun 2001 03:00:00 GMT  
 --- Instruction speed ---
Ok, I'll bite, what is AR mode?


Mon, 04 Jun 2001 03:00:00 GMT  
 --- Instruction speed ---

Quote:



>> >You mean I have been wasting time plugging in that comma in
>> >LA   r1,1(,r1)
>> >all these years?
<snip>

>> Well, no. It just doesn't make any difference, but there are other good
>> reasons for using the "correct" base/index register notation.

>You can really get in a bind if you don't include the index register omission
>in an RX instruction (or a 0 if you wish) and you're running in AR mode. If
>for example you're doing a LOAD instruction with the intention of loading an
>address-space address into register operand 1 and you've not omitted the
>index register (L Rx,0(Rx)), the result in register operand 1 will not be the
>address you're expecting. Instead, it will be the contents of 0(Rx) from your
>home address space. I learned that one real quick.

Well actually you can get in a hell of a bind if you even think for a
nanosecond that such an error would just occur out of the ozone. When you
write AR mode code, you have to write it carefully so that addresses are
resolved to the correct space. However, any symbolic reference is always
going to be resolved using whatever base register the symbol was mapped off.

Hence my comment above; you always have to pay attention to base and index
registers. I have a personal rule that I *never* use the lazyman's (index
only) form. I write enormous amounts of AR mode code, so you can pretty much
assume I know what I'm doing. Customer systems would be dropping like flies
if I screwed up in AR mode.

As for the possibility that your program might just happen to be called in
AR mode unexpectedly, har har har. That'd work about never. Most standard
entry linkage would crash and burn long before you got as far as worrying
about whether your load was being fetched from the right place.

Chris



Mon, 04 Jun 2001 03:00:00 GMT  
 --- Instruction speed ---

Quote:

> You can really get in a bind if you don't include the index register
> omission in an RX instruction (or a 0 if you wish) and you're running in AR
> mode. If for example you're doing a LOAD instruction with the intention of
> loading an address-space address into register operand 1 and you've not
> omitted the index register (L Rx,0(Rx)), the result in register operand 1
> will not be the address you're expecting. Instead, it will be the contents
> of 0(Rx) from your home address space. I learned that one real quick

Primary address space rather than home?


Tue, 05 Jun 2001 03:00:00 GMT  
 --- Instruction speed ---
AR == Access-Register Mode.

Each of the general registers is shadowed by an Access Register, which holds
a value that (ultimately) designates an address space.  When running in
AR-mode, one can directly access storage in *any* address space of the
system (assuming that one has appropriate authority) by pointing the AR at
the target address space and the general register at the location within the
address space.

Of course, if you are in AR mode but some of the access registers are *not*
properly initialized to point to the address spaces that you are intending
to access, the results are, as they say, unpredictable.
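
A minimal sketch of what that looks like in code, and of why the base/index
distinction discussed earlier in the thread matters.  It assumes a
hypothetical fullword TGTALET holding a valid ALET for the target space, a
hypothetical fullword TGTADDR holding an address within that space, and that
the access register paired with the program's own base register contains zero
(which designates the primary space):

```hlasm
         SAC   512                 switch to access-register mode
         LAM   4,4,TGTALET         AR4 = ALET of the target space
         L     R4,TGTADDR          R4 = address within that space
         L     R5,0(,R4)           R4 is the base: AR4 picks the space
         L     R6,0(R4)            R4 is the index: base field is 0,
*                                  so the fetch is from primary
         SAC   0                   back to primary-space mode
```

Only the base-register field selects an access register, so the two loads
above can fetch from different address spaces even though they name the same
general register.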
--

      IBM Research, Yorktown Heights



Tue, 05 Jun 2001 03:00:00 GMT  
 --- Instruction speed ---

Quote:

> Ok, I'll bite, what is AR mode?

Bank switching; in AR mode, the base-register field selects an access
register as well as a base register, and the access register selects an
address space.

--

Shmuel (Seymour J.) Metz
Reply to host nsf (dot) gov, user smetz



Fri, 08 Jun 2001 03:00:00 GMT  
 