Updated peephole optimizer 
Author Message
 Updated peephole optimizer

I've made a new release of my python peephole optimizer available at:

        http://www.*-*-*.com/ ~skip/python/

The patch file was generated against the Python 1.5.1 source.

Here's the README.peep file:

The accompanying patch file (optimizer.patch) implements a peephole
optimizer for Python byte code.  When applied, invoking the Python
interpreter with the -O flag will run the optimizer on the generated byte
code before writing a .pyo file.  The optimizer results in small performance
gains in many areas.  Attribute access and complex number manipulation
appear to be the most enhanced.

The optimizer proper is written entirely in Python although some changes
were made to ceval.c, compile.c and intobject.c.  This means it's rather
slow at what it does but easily modified.

Thanks to Tim Peters for many useful suggestions about how to organize the
optimizer and particular optimizations to try.  Thanks to Marc-Andre Lemburg
for pybench, which allowed me to isolate improvements and problems.

Here is a brief summary of the changes:

    * ceval.c - A number of new instructions were added:

        - LOADI - a fast load instruction for small positive integers (those
          referenced by the small_ints array in intobject.c).

        - LOAD_NONE - a fast load instruction for the constant None.

        - LOAD_ATTR_FAST - a fast load instruction for attributes where
          the object name is in the local scope.

        - STORE_ATTR_FAST - a fast store instruction for attributes where
          the object name is in the local scope.

        - LOAD_TWO_FAST - a fast load instruction for loading two local
          variables at a time.

        - STORE_TWO_FAST - a fast store instruction for storing two local
          variables at a time.

    * dis.py - The disassembler has been modified to accomodate the new

    * opcodes.py - A new module that defines opcode information (values,
      types, etc).  Used by both dis.py and optimize.py.

    * optimize.py - Each optimization is defined as a subclass of the
      OptimizeFilter base class and performs a single optimization (or small
      set of strongly related optimizations).  Optimizers are chained
      together in a pipeline.  A code object is fed in the front of the
      pipeline and emitted out the back.  The user can set the OPTLEVEL
      and/or OPTSTATS environment variables to affect the behavior of the
      optimizer a bit.  See the source code for details.

      Here are the optimizations currently implemented.

        - DeadCodeRemover - deletes code in any basic block that follows an
          unconditional transfer

        - LineNumberRemover - deletes SET_LINENO instructions.  Currently
          not being exercised because the basic optimization in compile.c
          already does this.

        - LoadStoreCompressor - combines pairs of LOAD_FAST or STORE_FAST
          instructions into LOAD_TWO_FAST or STORE_TWO_FAST instructions
          where possible.  (Local variable reference must be < 256.)

        - MultiLoadEliminator - converts n loads of the same item into a
          single load and n-1 DUP_TOP instructions.

        - StoreLoadEliminator - converts a STORE followed immediately by a
          LOAD of the same item into a DUP_TOP/STORE pair.

        - ConstPopEliminator - deletes LOAD_CONST/POP_TOP pairs.

        - JumpNextEliminator - deletes JUMPs to the next block that are the
          last instruction of the current block.

        - JumpOptimizer - optimizes JUMPs that are to other unconditional

        - TupleRearranger - reorganizes BUILD_TUPLE n/UNPACK_TUPLE n/STORE/
          ... sequences to eliminate the BUILD and UNPACK and reorder the

        - ConstantShortcut - maps LOAD_CONST n, where n refers to a small
          positive integer, into LOADI n instructions, whose execution
          directly accesses the small_ints cache in intobject.c.

        - ConstantExpressionEvaluator - precalculates constant expressions.
          This is especially helpful if you use complex numbers.

    * intobject.c - This file has been modified to expose the small_ints
      cache to the interpreter.  The optimizer generates LOADI instructions
      for small positive integers.  LOADI is faster than the equivalent
      LOAD_CONST instruction.  One significant change to its overall
      behavior was to initialize the entire small_ints array in the
      beginning instead of one-by-one as constants are encountered.  This
      code probably bears some scrutiny by another pair of eyes.

    * compile.c - Hooks were added to import and execute the optimizer when
      the -O flag is given.  It is not an error for the optimize module to
      not be located.  Optimization is just not done.  A {*filter*}
      disassembler is available (compile-time flag) to help debug the
      optimizer's actions. (If you{*filter*}up the optimizer you will find it
      difficult to run the regular disassembler.  This disassembler doesn't
      require working byte code.)

If you have any questions or problems or would like to contribute new

Skip Montanaro    | Musi-Cal: http://www.*-*-*.com/

-------- comp.lang.python.announce (moderated) --------

Python Language Home Page:   http://www.*-*-*.com/

Mon, 18 Dec 2000 03:00:00 GMT  
 [ 1 post ] 

 Relevant Pages 

1. Peephole optimizer, wazzat?

2. Peephole optimizers using expert systems

3. Does SwiftForth use Forth-level peephole optimization?

4. Wanted: COBOL and ASM compilers, translators, optimizers

5. Info for PL/I Optimizer

6. app optimizer: macos X extension

7. Optimizer hints

8. Smalltalk global optimizer?

9. Need to disable an optimizer feature

10. RTX2000 optimizer

11. looking for an optimizer which enables me to ...

12. OpenGL Optimizer, Cosmo3D and VRML 2.0


Powered by phpBB® Forum Software