Next instr of the week - CS please 
Author Message
 Next instr of the week - CS please

The title says it all
I'm trying to maintain someoneelse's code!
John Wood                      
Teamco Systems R&D



Mon, 18 Dec 2000 03:00:00 GMT  
 Next instr of the week - CS please

Well, from the PoP:  
CS  R1,R3,D2(B2)

The first and second operands are compared. If they are
equal, the third operand is stored at the second operand
location. If they are unequal, the second operand is loaded
into the first operand location. The result of the comparison
is indicated in the condition code.

The first and third operands are 32 bits in length, with
each operand occupying a general register. The second
operand is a word on a word boundary in storage.

Simplified: The operation cannot be interrupted by task(s)
even from another CPU.

What happened:

CC=0 First and second operands were equal, second operand was
replaced by third operand

CC=1 First and second operand were unequal, first operand was
replaced by second operand

Example: To set a single bit (in shared storage) to 1

 LA  6,X'80' Put bit to be ORed into WORD into GR6
 SLL 6,24 and shift the bit to the desired relative position
 L   7,WORD Fetch current flag bits
RETRY EQU *
 LR  8,7 Copy flag bits into GR8
 OR  8,6 OR in new bit to be set (from GR6)
 CS  7,8,WORD Store new flags if current flags unchanged
* or refetch current flags if they were changed
 BC  4,RETRY Loop back to retry if new flags were not stored

The same technique can be used to update a common counter from
several concurrent tasks etc. etc.

There is also a CDS operation identical to CS except that it uses
64 bit operands doubleword aligned in memory and contained in
even-odd register pairs.

regards Sven

Quote:

> The title says it all
> I'm trying to maintain someoneelse's code!
> John Wood
> Teamco Systems R&D



Mon, 18 Dec 2000 03:00:00 GMT  
 Next instr of the week - CS please

Quote:

> The title says it all
> I'm trying to maintain someoneelse's code!
> John Wood
> Teamco Systems R&D

Excellent topic.

Check out the back of the Poo, page A-43 in my hard copy (btw this is
ONE book I really want in hard copy). It contains examples of:

Setting a single bit
Updating counters
Bypassing POST and WAIT
Lock and Unlock both LIFO and FIFO
Free Pool management

Also check out the MVS macros OIL and NIL. It was a great laugh the
first time (1978) that I heard someone refer to OILing a lock. OIL
stands for Or Immediate Lock. NIL is aNd Immediate Lock.

I am actually working on implementing a lock routine now that supports
shared and exclusive attributes. Similar to the MVS Latch Manager,
except I am running unauthorized and therefore can't use the Latch
Manager (poor design choice in my opinion). Can't use ENQ/DEQ because of
performance. This will actually use the CDS instruction instead of the
CS instruction, so that I can maintain two linked lists, those holding
it shared and those waiting for it exclusive.

--

Beyond Software, Inc.      http://www.beyond-software.com
"Transforming Legacy Applications"



Mon, 18 Dec 2000 03:00:00 GMT  
 Next instr of the week - CS please

Hi Wayne,

I agree about the problems of the latch manager, but the latch manager
exists precisely because it is a difficult problem - look how big and ugly
IRLM is. I think you will find it impossible to serialize more than one
linked list with just CS or CDS unless you transform the problem into a
simpler one.

I thought your code ran authorized anyway, or are you not talking about the
web server?

Anyway, you will probably need to define some higher level lockword for the
entire structure and chain in ECBs for the requestors. The first guy who
gets the lock owns the entire structure until he's done with it. Then he
passes ownership to the next requestor in line and posts his ECB. Of course
they all have to be key 8 requestors within the same address space... ah
details.

regards,
Chris.

Quote:

>I am actually working on implementing a lock routine now that supports
>shared and exclusive attributes. Similar to the MVS Latch Manager,
>except I am running unauthorized and therefore can't use the Latch
>Manager (poor design choice in my opinion). Can't use ENQ/DEQ because of
>performance. This will actually use the CDS instruction instead of the
>CS instruction, so that I can maintain two linked lists, those holding
>it shared and those waiting for it exclusive.



Mon, 18 Dec 2000 03:00:00 GMT  
 Next instr of the week - CS please



Quote:
> The title says it all
> I'm trying to maintain someoneelse's code!
> John Wood                      
> Teamco Systems R&D

CS/CDS is not required for some types of updates to shared storage,
i.e. you just want to replace the contents of the word/doubleword
without regard to its original contents.
In this case, I use ST/STM as the stores are block concurrent (data
alignment must be the same as for CS/CDS).

--
Robert Ngan
CSC Financial Services Group



Mon, 18 Dec 2000 03:00:00 GMT  
 Next instr of the week - CS please

Quote:

> Hi Wayne,

> I agree about the problems of the latch manager, but the latch manager
> exists precisely because it is a difficult problem - look how big and ugly
> IRLM is. I think you will find it impossible to serialize more than one
> linked list with just CS or CDS unless you transform the problem into a
> simpler one.

> I thought your code ran authorized anyway, or are you not talking about the
> web server?

snip

Actually, Chris, our web server does not require APF authorization,
except for certain optional functions. It is entirely possible to
install the web server on MVS using simple JCL and run it as a batch
job. This allows users to install it for evaluation purposes without
involving the systems programming staff. Until you approach production
there is no reason to involve the sysprogs. That is why I want to avoid
the latch manager.

IMHO, the installation of a webserver should take no more than an hour
and should require no skills beyond knowledge of basic JCL and basic
utilities like IEBCOPY. I hate building products that require sysprogs
to install them. Far better to let the end user install it, at least for
evaluation purposes. Sysprogs have enough to do with Y2k, I don't want
to rock their boat unnecessarily.

--

Beyond Software, Inc.      http://www.beyond-software.com
"Transforming Legacy Applications"



Mon, 18 Dec 2000 03:00:00 GMT  
 Next instr of the week - CS please

Ah, non-blocking synchronization, a topic dear to my heart.  I attach
some code we wrote for fine-grained locking in a large, high
performance database server system.  It uses CS to maintain a chain of
lock requestors and standard WAIT/POST.  This is a little different
from the example in the PoOP in that it does not need to have special
storage allocated to hold the chain (see "LOCK/UNLOCK with FIFO
Queueing for Contentions" in Appendix A).

Michael Greenwald presented a very interesting paper on non-blocking
synchronization at the 1996 Usenix OSDI conference.  It describes
using a slightly different primitive than CDS: "compare and swap
double".  This primitive is like CDS except the two words do not have
to be adjacent in memory.  This small architectural change makes for a
much easier to use instruction (too bad the S/390 architecture doesn't
include it!).

  author = "Michael Greenwald and David Cheriton",
  title = "The Synergy Between Non-blocking Synchronization and
        Operating System Structure",
  booktitle = OSDI96,
  year = 1996,
  month = oct,
  address = "Seattle, WA",
  organization = "USENIX Assoc.",
  pages = "123--136",
  abstract = "
  Non-blocking synchronization has significant advantages over
  blocking synchronization: however, it has not been used to a
  significant degree in practice. We designed and implemented a
  multiprocessor operating system kernel and run-time library for
  high-performance, reliability and modularity. We used non-blocking
  synchronization, not because it was an objective in itself, but
  because it became the approach of choice. It was an attractive
  approach because of the synergy between other structuring techniques
  we used to achieve our primary goals and the benefits of
  non-blocking synchronization.

  This paper describes this synergy: the structuring techniques we
  used which facilitated non-blocking synchronization and our
  experience with this implementation.
"

Quote:
}

---

         MACRO                                                          
&LABEL   GETLOCK &LOCK,&AREA=LOCKAREA                                  
.*                                                                      
.*   Obtain the lock designated. Wipes out R14 -> R1                    
.*                                                                      
.*   A lock is a two word area.  The first word is a pointer to        
.*   the ownership chain (0 indicates the lock is free).  The          
.*   second word counts the number of collisions (where one            
.*   task had to wait for another that already held the lock            
.*   to release it).                                                    
.*                                                                      
.*   A Lockarea is a two word area. The first word is used for          
.*   chaining, the second word is the ECB to be waited on if the        
.*   lock is held.                                                      
.*                                                                      
&LABEL   IHBINNRA &AREA                 Point at our area              
         SR    R0,R0                    Get a zero                      
         ST    R0,4(R1)                 Clear the ECB                  
         L     R14,&LOCK                Get current chain value        
G&SYSNDX ST    R14,0(R1)                Put in our work area            
         CS    R14,R1,&LOCK             Chain us in                    
         BNE   G&SYSNDX                                                
         LTR   R14,R14                  Are we the first?              
         BZ    L&SYSNDX                 Yes, go do it                  
         WAIT  ECB=4(R1)                Wait for us to be posted        
         CSINC 4+&LOCK                  Bump collision count            
L&SYSNDX EQU   *                        We have the lock                
         MEND                                                          
---
         MACRO                                                          
&LABEL   FREELOCK &LOCK,&AREA=LOCKAREA                                  
.*                                                                      
.*       Release the lock                                              
.*                                                                      
&LABEL   IHBINNRA &AREA                                                
         SR    R0,R0                                                    
         LR    R15,R1                   Save pointer to our area        
         CS    R1,R0,&LOCK              Is it the simple case?          
         BE    L&SYSNDX                 Yes, it is free                
G&SYSNDX LR    R14,R1                   Is this what points to us?      
         LR    R1,R15                   Get our work area              
         CS    R1,R0,0(R14)             Clear if end of list            
         BNE   G&SYSNDX                                                
         POST  4(R14)                   Post the next one              
L&SYSNDX EQU   *                                                        
         MEND                                                          
---
         MACRO
&LBL     CSINC &CTR
.*
.*       Atomically increment fullword counter
.*
&LBL     L     R0,&CTR
L&SYSNDX LR    R1,R0
         AL    R1,=F'1'
         CS    R0,R1,&CTR
         BNZ   L&SYSNDX
         MEND



Mon, 18 Dec 2000 03:00:00 GMT  
 Next instr of the week - CS please

Quote:

> Ah, non-blocking synchronization, a topic dear to my heart.  I attach
> some code we wrote for fine-grained locking in a large, high
> performance database server system.

Thanks for the code segment. I'll study it carefully RSN (Real Soon
Now).

I have always used LIFO because it is so simple. Just a simple push down
stack. My boss is concerned that it isn't "fair" and wants me to use
FIFO for a particular situation. I don't like the FIFO example in the
POO because of the requirement for changing ownership of storage.

--

Beyond Software, Inc.      http://www.beyond-software.com
"Transforming Legacy Applications"



Mon, 18 Dec 2000 03:00:00 GMT  
 Next instr of the week - CS please

Quote:

>The title says it all

Here's a few tidbits on CS/CDS that I've learned over the years.

 - Often they are overused, particularly by those who don't know
   the POP rules governing "storage operand consistency".

 - Operand alignment is required to guarantee that each occupies
   a single cache line, the amount that is actually serialized
   on most processor models.  This means that having an array of
   CS or CDS targets in contiguous memory is not a good idea.
   Cache lines are on the order of 64 to 256 bytes or more.

 - The POP doesn't prohibit excess serialization in order to
   make the instruction work.  For example, serializing a much
   larger unit (4K) and/or stopping all other processors in
   the CEC is legal, if inefficient.

 - A reliable LIFO list (stack) cannot be managed with CS.  The
   POP explains this and shows an alternative technique using
   CDS.

 - Anyone who tells you they have come up with a generalized
   linked list management scheme based on CS/CDS probably has
   not done their homework or is relying on special factors
   such as being non-preemptable, never freeing list element
   memory, and/or tolerating leakage.

 - IBM has developed techniques for managing certain types of
   linked lists with CDS.  They are patented, and you can view
   the patents on IBM's patents webserver.

 - The Perform Locked Operation (PLO) instruction introduced
   on recent processor models provides a quantum advance in this
   area if you want to manage lists easily.  It is an elegant
   design but shares the CS/CDS characteristic that some model
   implementations may be less efficient than others.

/b
--
Bill Manry  -  IBM Products Division  -  Oracle Corporation
These are my opinions, not necessarily Oracle's.
Remove "." from "B.Manry" to email me.



Mon, 18 Dec 2000 03:00:00 GMT  
 Next instr of the week - CS please

Quote:

>Michael Greenwald presented a very interesting paper on non-blocking
>synchronization at the 1996 Usenix OSDI conference.  It describes
>using a slightly different primitive than CDS: "compare and swap
>double".  This primitive is like CDS except the two words do not have
>to be adjacent in memory.  This small architectural change makes for a
>much easier to use instruction (too bad the S/390 architecture doesn't
>include it!).

Its a bit of a nit, but the S/390 does indeed have such an instruction -
check out Perform Locked Operation (PLO) in POPS. In fact, it allows for
even more complex atomic operations. Unfortunately its only available on
fairly recent (G3 and up?) 9672  systems, but it does exist!

Chris.



Mon, 18 Dec 2000 03:00:00 GMT  
 Next instr of the week - CS please


[snip]

: Michael Greenwald presented a very interesting paper on non-blocking
: synchronization at the 1996 Usenix OSDI conference.  It describes
: using a slightly different primitive than CDS: "compare and swap
: double".  This primitive is like CDS except the two words do not have
: to be adjacent in memory.  This small architectural change makes for a
: much easier to use instruction (too bad the S/390 architecture doesn't
: include it!).

What about PLO?  You can use it to update non-contiguous [double]words
with integrity.  It's great for handling double-threaded lists!

--
| Edward E. Jaffe                | Voice:      (310) 338-0400 x318     |
| Mgr., Research & Development   | Facsimile:  (310) 338-0801          |

| 9841 Airport Blvd, Suite 700   | IBM Mail:   USS24J24 at IBMMAIL     |        
| Los Angeles, CA 90045          | Web page:   www.phoenixsoftware.com |



Mon, 18 Dec 2000 03:00:00 GMT  
 Next instr of the week - CS please

Quote:

>Well, from the PoP:  
>CS  R1,R3,D2(B2)

>The first and second operands are compared. If they are
>equal, the third operand is stored at the second operand
>location. If they are unequal, the second operand is loaded
>into the first operand location. The result of the comparison
>is indicated in the condition code.

>The first and third operands are 32 bits in length, with
>each operand occupying a general register. The second
>operand is a word on a word boundary in storage.

>Simplified: The operation cannot be interrupted by task(s)
>even from another CPU.

>What happened:

>CC=0 First and second operands were equal, second operand was
>replaced by third operand

>CC=1 First and second operand were unequal, first operand was
>replaced by second operand

>Example: To set a single bit (in shared storage) to 1

> LA  6,X'80' Put bit to be ORed into WORD into GR6
> SLL 6,24 and shift the bit to the desired relative position
> L   7,WORD Fetch current flag bits
>RETRY EQU *
> LR  8,7 Copy flag bits into GR8
> OR  8,6 OR in new bit to be set (from GR6)
> CS  7,8,WORD Store new flags if current flags unchanged
>* or refetch current flags if they were changed
> BC  4,RETRY Loop back to retry if new flags were not stored

I think the

        L       7,WORD

should be inside the retry loop, otherwise the code can't handle cases
where other processes are successfully changing the word. The result
would be a loop. Since it only happens during contention,debugging would
be tough. If the change by the other process is finally reversed, the
result would be occaisional long loops consuming considerable CPU
resource.

john alvord



Tue, 19 Dec 2000 03:00:00 GMT  
 Next instr of the week - CS please



[snip]
: >Example: To set a single bit (in shared storage) to 1
: >
: > LA  6,X'80' Put bit to be ORed into WORD into GR6
: > SLL 6,24 and shift the bit to the desired relative position
: > L   7,WORD Fetch current flag bits
: >RETRY EQU *
: > LR  8,7 Copy flag bits into GR8
: > OR  8,6 OR in new bit to be set (from GR6)
: > CS  7,8,WORD Store new flags if current flags unchanged
: >* or refetch current flags if they were changed
: > BC  4,RETRY Loop back to retry if new flags were not stored

: I think the

:       L       7,WORD

: should be inside the retry loop, otherwise the code can't handle cases
: where other processes are successfully changing the word. The result
: would be a loop. Since it only happens during contention,debugging would
: be tough. If the change by the other process is finally reversed, the
: result would be occaisional long loops consuming considerable CPU
: resource.

No.  He's coded it to spec.  CS reloads R7 automatically if the comparison
fails.

--
| Edward E. Jaffe                | Voice:      (310) 338-0400 x318     |
| Mgr., Research & Development   | Facsimile:  (310) 338-0801          |

| 9841 Airport Blvd, Suite 700   | IBM Mail:   USS24J24 at IBMMAIL     |        
| Los Angeles, CA 90045          | Web page:   www.phoenixsoftware.com |



Tue, 19 Dec 2000 03:00:00 GMT  
 
 [ 15 post ] 

 Relevant Pages 

1. Instruction for next week: BC

2. Linux Expo next week

3. C5.5.01 - next week?

4. FYI, I'll be offline for the next week

5. Dylan Course in Northern VA, USA next week

6. USENIX Very High Level Languages Symposium - NEXT WEEK!

7. Eiffel at OOPSLA (next week in Atlanta)

8. USENIX Very High Level Languages Symposium - NEXT WEEK!

9. Determing dates for next week

10. WorldconVR Event Next Week-Starbase C3 Invite!

11. CosmoCode 2.5 for NT due next week

 

 
Powered by phpBB® Forum Software