--- Instruction speed --- 

A while back there was a discussion of how to clear a register,  SR v. SLR v.
XR.  No one mentioned some of my favorites, like shifting 32 bits, but that's
another story.

What brings me here is my desire to search an entire record looking for a
specific character (for example, a X'61' = C'/').  The records are large (over
20k) and there is a high probability (over 80%) that the number of characters
between each '/' is less than 256.  So, as I search for my special character,
should I use a loop with TRT instructions, or the simple-to-code loop:

*        R2 = addr of first byte, R3 = max byte count
LOOP1    CLI   0(R2),C'/'
         BE    EXITLOOP
         LA    R2,1(R2)
         BCT   R3,LOOP1

Also: the code has to be reentrant, so any TRT less than 256 characters would
be the object of an EX.
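
For reference, a minimal sketch of what a TRT under EX could look like for one short piece of the record. It is not taken from any poster's program: the register choices (R3 for the data address, R5 for the byte count) and the names SLASHTAB, SCANTRT, GOTSLASH and NOSLASH are assumptions made for illustration, and the usual register equates are assumed. EX ORs the length into its own copy of the TRT and leaves storage alone, so the code stays reentrant.

```hlasm
*        Sketch only. Assumed: R3 -> first byte to scan, R5 = number
*        of bytes (1 to 256). SLASHTAB, SCANTRT, GOTSLASH and NOSLASH
*        are hypothetical names. TRT alters R1 and the low byte of R2.
         BCTR  R5,0                Length code = byte count - 1
         EX    R5,SCANTRT          Execute TRT with length from R5
         BZ    NOSLASH             CC 0: no '/' in these bytes
         B     GOTSLASH            CC 1 or 2: R1 -> the '/'
SCANTRT  TRT   0(*-*,R3),SLASHTAB  Executed only, never modified
SLASHTAB DC    256X'00'            All bytes: keep scanning
         ORG   SLASHTAB+C'/'       Back up to the entry for '/'
         DC    X'FF'               Nonzero: stop on '/'
         ORG   ,                   Resume after the table
```

Note that TRT puts the function byte into the low-order byte of R2, so the data pointer should not be kept in R2 across the scan.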

Discuss.



Sat, 21 Jul 2001 03:00:00 GMT  
 --- Instruction speed ---


Quote:

> What brings me here is my desire to search an entire record looking for a
> specific character (for example, a X'61' = C'/').  The records are large
> (over 20k) and there is a high probability (over 80%) that the number of
> characters between each '/' is less than 256.  So, as I search for my
> special character, should I use a loop with TRT instructions, or the
> simple to code:

>               (LOAD R2 with beginning byte, R3 with max byte count
> LOOP1   CLI   0(R2),C'/'
>               BE   EXITLOOP
>               LA   R2,1(R2)
>               BCT R3,LOOP1

> Also: the code has to be reentrant, so any TRT less than 256 characters
> would be the object of an EX.

If your hardware is an S/390 and the string-instruction facility is
installed, you should do well using the SRST (SEARCH STRING) instruction.
It works for strings of arbitrary length, though it can stop early with
condition code 3, in which case it has to be reissued.


Sat, 21 Jul 2001 03:00:00 GMT  
 --- Instruction speed ---
I did it with both TRT and SRST, as an exercise.  I suspect SRST is faster,
because it makes fewer storage references than TRT, which has to fetch a
table byte for every byte it scans.

WOB's idea of using a BCT counter loop in the TRT is probably faster than
my approach.  You have the high cost of a D up front, but my SR's add
up through the loop.

ChatNoir's CLI/BCT loop is good, too.  It might be faster than TRT when the
distance between the delimiters is very small, usually somewhere
between 2 and 5 bytes.  It can be optimized a little more, too:

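*        R2 = addr of first byte of the record
*        R0 = 1 (BXLE increment)
*        R1 = addr of last byte of the record (BXLE comparand)
LOOP1    CLI   0(R2),C'/'          Is this byte the delimiter?
         BE    EXITLOOP            Yes, R2 points at it
         BXLE  R2,R0,LOOP1         Add 1 to R2, loop while R2 <= R1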

TESTASM  CSECT                     Define program CSECT
         USING *,12                Establish addressability
         SAVE  (14,12),,*          Save registers
         LR    12,15               Copy EP address to R12
         LA    15,S                Load addr of the new save area
         ST    15,8(,13)           Add new save area
         ST    13,4(,15)            to the save area chain
         LR    13,15               Establish new save area pointer
         OPEN  MF=(E,OPENPARM)     Open SYSPRINT
         L     3,=A(REC)           Load addr of start of data
         L     4,=A(RECEND)        Load addr of end of data
         LA    6,OUTLINE+4         Load start of the output area
         NI    SWITCH,255-MATCH    Reset match switch
         PUT   PRINT,HDR1          Write the first header
TRT0100  LR    15,4                Copy end of data to R15
         SR    15,3                Compute remaining bytes in data
         BNP   TRT0300             Br if remaining bytes <= 0
         C     15,=F'256'          Compare remaining bytes w/ 256
         BNH   TRT0200             Br if last section in data
         LA    1,255(,3)           Compute addr of last byte in section
         TRT   0(256,3),TRTTAB     Locate character
         LR    2,1                 Copy character addr to R2
         LA    3,1(,1)             Compute resume location
         BZ    TRT0100             Br if nothing found
         BAL   14,FMTMATCH         Format offset of found character
         B     TRT0100             And continue
TRT0200  BCTR  15,0                Reduce section length by 1 for hdw
         EX    15,EXTRT            Scan for character
         BZ    TRT0300             Br if nothing found
         LR    2,1                 Copy addr of character to r2
         LA    3,1(,1)             Compute resume location
         BAL   14,FMTMATCH         Format offset of found character
         B     TRT0100             And continue
TRT0300  TM    SWITCH,MATCH        Test if match
         BZ    TRT0400             Br if not
         LA    1,OUTLINE           Load addr of the output line
         SR    6,1                 Compute length of line
         STH   6,0(,1)             Store line length in the RDW
         PUT   PRINT,OUTLINE       Write the line
         B     TRT0500             And continue
TRT0400  PUT   PRINT,EMSG          Write no matches
TRT0500  NI    SWITCH,255-MATCH    Reset match switch
         L     3,=A(REC)           Load addr of start of data
*        L     4,=A(RECEND)        Load addr of end of data
         LA    6,OUTLINE+4         Load start of the output line
         PUT   PRINT,HDR2          Write the second header
SRST0100 LA    0,C'/'              Load search character into R0
SRST0200 SRST  4,3                 Scan for character
         BC    B'0001',SRST0200    CC = 3, resume
         BC    B'0100',SRST0300    CC = 1, hit
         BC    B'0010',SRST0400    CC = 2, no hits
         DC    H'0'                CC = 0, should not happen
SRST0300 LR    2,4                 Copy addr of found character to R2
         LA    3,1(,4)             Compute resume address
         BAL   14,FMTMATCH         Format match offset
         L     4,=A(RECEND)        Reload end of data
         B     SRST0100            FMTMATCH may zap R0, must reset
SRST0400 TM    SWITCH,MATCH        Test if match
         BO    SRST0500            Br if so
         PUT   PRINT,EMSG          Write no match record
         B     SRST0600
SRST0500 LA    1,OUTLINE           Load addr of the output record
         SR    6,1                 Compute data length in record
         STH   6,0(,1)             Store data length in the RDW
         PUT   PRINT,OUTLINE       Write the data record
SRST0600 CLOSE MF=(E,OPENPARM)     Close SYSPRINT
         L     13,4(,13)           Load addr of the caller's save area
         RETURN (14,12),T,RC=0     Restore regs & return to caller
EXTRT    TRT   0(*-*,3),TRTTAB  ** Execute only **
         SPACE 5
         CNOP  0,8
FMTMATCH SR    2,12                Compute offset from start of TESTASM
         STH   2,DWORK             Save offset
         MVI   0(6),C' '           Add a blank to the output line
         UNPK  1(5,6),DWORK(3)     Convert offset to hex digits
         TR    1(4,6),HEXTAB
         LA    6,5(,6)             Compute addr of next output location
         OI    SWITCH,MATCH        Indicate match encountered
         BR    14                  Return to caller
         EJECT
S        DC    (2*9)D'0'           Define 2 save areas
DWORK    DC    D'0'                Double word work area
HEXTAB   EQU   *-C'0'              Translate table to convert data to
         DC    C'0123456789ABCDEF'  hexadecimal digits
OPENPARM OPEN  (PRINT,OUTPUT),MF=L Open/Close parameter list
         PUSH  PRINT
         PRINT NOGEN
PRINT    DCB   DSORG=PS,MACRF=PM,DDNAME=SYSPRINT,  SYSPRINT            ?
               RECFM=VBA,LRECL=125,BLKSIZE=4096     DCB
         POP   PRINT
OUTLINE  DC    AL2(*-*,0),CL121' ' Output line work area
HDR1     DC    AL2(HDR1L,0),C'-TRT test --'
HDR1L    EQU   *-HDR1
HDR2     DC    AL2(HDR2L,0),C'-SRST test --'
HDR2L    EQU   *-HDR2
EMSG     DC    AL2(EMSGL,0),C' No match found'
EMSGL    EQU   *-EMSG
         LTORG ,
SWITCH   DC    AL1(0)
MATCH    EQU   B'01000000'
         DC    0D'0'
TRTTAB   DC    0XL256'0',256X'00'  Table used with TRT to
         ORG   TRTTAB+C'/'          locate just a
         DC    X'01'                 / character
         ORG   ,
REC      DC    CL150' ',C'//',CL235' ',C'/',C' ',C'/',CL31' ',C'/'
         DC    CL256' ',C'/',CL256' '
ALLCHARS DC    256AL1(*-ALLCHARS)
RECEND   EQU   *
         END   TESTASM

-- Steve Myers

The E-mail addresses in this message are private property.  Any use of them
to send unsolicited E-mail messages of a commercial nature will be
considered trespassing, and the originator of the message will be sued in
small claims court in Camden County, New Jersey, for the maximum penalty
allowed by law.



Sat, 21 Jul 2001 03:00:00 GMT  
 --- Instruction speed ---
Hi,

Gunnar Opheim mentioned the SRST instruction.  Here is an example
for you:

* 3 REGISTERS ARE NEEDED:
* R0 = must contain the byte to be searched for in the low order byte;
*      bytes 0 to 2 must be x'00'
* Rx = any register, should point 1 byte behind the last byte
*      in string to be examined
*      (R2 in the example)
* Ry = any register, should point to the first byte in string to be
*      examined (R1 in the example)
*
         LA    R2,BUFFEND+1      POINT BEHIND BUFFER STRING
         LR    R3,R2             SAVE FOR RESTART
         LA    R1,BUFFER         POINT AT START OF BUFFER
         LA    R0,LOOKFOR        CHAR TO FIND, SRST CAN'T LOOK FOR MORE
SRST_ST  DS    0H
         SRST  R2,R1
         BO    SRST_ST           CC 3: STOPPED EARLY, REISSUE
         BH    NOTFOUND          CC 2: END REACHED, CHAR NOT FOUND
* -- FOUND THE FIRST OCCURRENCE OF CHAR IN STRING (CC 1)
*    R2 = AT THE CHAR, R1 = UNCHANGED, R0 = UNCHANGED
*
*       CODING GOES HERE
*
         B     DONE  <----   REMOVE IF MORE THAN ONE CHAR MAY BE FOUND
* -- PREPARE RESTART
         LA    R1,1(0,R2)        BUMP TO NEXT BYTE AFTER FOUND CHAR
         LR    R2,R3             POINT BEHIND BUFFER AGAIN
         B     SRST_ST           RESTART SEARCH
*
NOTFOUND DS    0H
*
*   CODING FOR NOT FOUND GOES HERE
*
         B     NEXT
*
LOOKFOR  EQU   C'/',1,C'C'       CHAR TO BE FOUND

--------------------------------------------

Klaus
++EOM++



Sat, 21 Jul 2001 03:00:00 GMT  
 --- Instruction speed ---
After some thought, I realized WOB's BCT loop idea won't work.  The problem
is that the number of times the TRT executes is not predictable, because it
must be restarted after every hit, except for the last one.  In other words,
the method in my code sent out yesterday may well be the best available
method.

[snip]

Quote:
>WOB's idea of using a BCT counter loop in the TRT is probably faster than
>my approach.  You have the high cost of a D up front, but my SR's add
>up through the loop.

[snip]

-- Steve Myers




Sun, 22 Jul 2001 03:00:00 GMT  
 --- Instruction speed ---
WOB -- Is the idea of your code to return the addr of the last "hit" in
FWORD?  Or is the idea to return the addr of the last 256 (or maybe 255)
byte block that contains the last "hit"?

I'm not sure after reading the code.

[snip]

-- Steve Myers




Mon, 23 Jul 2001 03:00:00 GMT  
 --- Instruction speed ---


[snip]

Quote:

> If your hardware is a S/390 and the string-instruction facility is
> installed, you should do well using the SRST instruction (string search).
> It works for strings of arbitrary length, though it has to be restarted if
> interrupted.

SRST is the instruction which first made me use the BC instruction
(instead of the extended mnemonics). None of the extended mnemonics
"sounded" right when used with SRST.
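
To illustrate the point, here is a rough sketch of the SRST condition codes tested with explicit BC masks, with the extended mnemonics that share each mask noted in the remarks. The labels FOUND and NOTFOUND and the register assignments are assumptions for the example only, and the usual register equates are assumed.

```hlasm
*        Sketch only: FOUND and NOTFOUND are hypothetical labels.
*        Assumed: R0 = search byte, R3 -> start, R4 -> end + 1.
SCAN     SRST  R4,R3               Search for the byte in R0
         BC    1,SCAN              CC 3 (same mask as BO): resume
         BC    4,FOUND             CC 1 (BL/BM mask): R4 -> the byte
         BC    2,NOTFOUND          CC 2 (BH/BP mask): not in string
```

"Branch on low" for a hit and "branch on high" for a miss assemble to the same masks, but they say nothing about what SRST actually reported.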

--
Robert Ngan
CSC Financial Services Group



Sun, 29 Jul 2001 03:00:00 GMT  
 