Pentium Pro & Pentium II instruction decomposition (uops) 


I've got a question regarding uops on the Pentium Pro and Pentium II. It's
especially about instructions that have implicit word registers as output,
for example CWD, IDIV reg16, etc.

They are decomposed into these uop sequences:

HDR: "CWD/CDQ":                         ( 10011001 -------- -------- )

1 FLOW: EDX = sar.Port_0.latency_1(EAX, CONST)

HDR: "IDIV eAX,rm16/32":                ( 1111011.1 11.111.sss -------- )

1 FLOW: TMP2 = Port_0.latency_1(EDX, EAX)
2 FLOW: TMP0 = int_div.Port_0.Latency_99(TMP2, REG_sss)
3 FLOW: EAX = move.Port_01.latency_1(TMP0)
4 FLOW: EDX = Port_0.latency_1(TMP0, CONST)
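
For readers who want to see what these uops compute, here is a minimal C sketch of the 32-bit forms (CDQ and IDIV r/m32). It is only a model of the architectural result, not of timing or ports. Several things are assumptions, because the listing does not spell them out: CONST is taken to be 31 for CDQ, uop 1 of IDIV is taken to concatenate EDX:EAX into TMP2, and TMP0 is taken to carry both quotient and remainder. The names regs_t, cdq_model and idiv32_model are hypothetical.

```c
#include <stdint.h>

/* Hypothetical register pair used only for this illustration. */
typedef struct { uint32_t eax, edx; } regs_t;

/* CDQ, uop 1: EDX = sar(EAX, CONST).
   Assumption: CONST = 31, so every bit of EDX is a copy of EAX's sign bit.
   Written without >> on a signed value, which is implementation-defined in C. */
static regs_t cdq_model(regs_t r) {
    r.edx = (r.eax & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    return r;
}

/* IDIV r/m32, following the four uops of the listing.
   The caller must make sure divisor != 0 and that the quotient fits in
   32 bits; real hardware raises a divide error (#DE) in those cases. */
static regs_t idiv32_model(regs_t r, int32_t divisor) {
    /* uop 1 (assumed): TMP2 = EDX:EAX as one 64-bit signed dividend */
    int64_t tmp2 = (int64_t)(((uint64_t)r.edx << 32) | r.eax);
    /* uop 2: TMP0 = int_div(TMP2, REG_sss); C truncates toward zero like IDIV */
    int64_t quot = tmp2 / divisor;
    int64_t rem  = tmp2 % divisor;
    /* uop 3: EAX = quotient */
    r.eax = (uint32_t)quot;
    /* uop 4 (assumed): EDX = remainder extracted from TMP0 */
    r.edx = (uint32_t)rem;
    return r;
}
```

With EAX = -7, cdq_model sets EDX to 0xFFFFFFFF, and idiv32_model with a divisor of 2 then leaves EAX = -3 (quotient) and EDX = -1 (remainder).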

Now my question:

When the output registers of these instructions are 32 bits wide (instead
of the implicit 16 bits), do these instructions still suffer a partial
register stall when the 32-bit register is read afterwards? If so, the
documentation is incomplete or incorrect.
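
To make the concern concrete, here is a small C sketch of what the 16-bit form of CWD does architecturally: only DX is written, so a later read of EDX has to combine the new DX with the old upper half of EDX. The sketch shows that merge only. It says nothing about whether the hardware stalls, which is the open question. The function name cwd16_model is hypothetical.

```c
#include <stdint.h>

/* 16-bit CWD: DX becomes 16 copies of AX's sign bit (bit 15 of EAX).
   The upper 16 bits of EDX are not written, so the value seen by a
   following 32-bit read of EDX depends on the old EDX as well. */
static uint32_t cwd16_model(uint32_t eax, uint32_t old_edx) {
    uint32_t dx = (eax & 0x8000u) ? 0xFFFFu : 0x0000u;
    return (old_edx & 0xFFFF0000u) | dx;
}
```

For example, with AX = 0x8000 and an old EDX of 0x12340000, the full register reads back as 0x1234FFFF: the lower half comes from CWD and the upper half from whatever wrote EDX before.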

Who can test it for me?

Wed, 24 Nov 1999 03:00:00 GMT  