0.01 dragons slain 
Author Message
 0.01 dragons slain

Although I acquired Aho, Sethi and Ullman's 'Compilers - Principles,
Techniques and Tools' (probably better known as the Dragon Book) somewhere
in 1988, I never used any of the discussed optimization strategies for writing
iForth. Their optimizer toolchest didn't look very appropriate to a stack
language.

In the last two weeks I have taken another look and lo and behold, it isn't
half as intimidating as it seemed 13 years ago :-)

Here is a simple phrase that my prototype Forth optimizer can translate right
now.

First a sample of SwiftForth to set a base line.

SwiftForth 2.2.2.9 07May2001
CREATE ape 20 CELLS ALLOT  ok
: tt  


  + 22 +
  ape 3 CELLS + ! ;  ok
see tt
46B09F   4 # EBP SUB                    83ED04
46B0A2   EBX 0 [EBP] MOV                895D00
46B0A5   68F34 [EDI] EBX LEA            8D9F348F0600
46B0AB   4 # EBP SUB                    83ED04
46B0AE   EBX 0 [EBP] MOV                895D00
46B0B1   10 # EBX ADD                   83C310
46B0B4   0 [EBX] EBX MOV                8B1B
46B0B6   4 # EBP SUB                    83ED04
46B0B9   EBX 0 [EBP] MOV                895D00
46B0BC   EBX SHL                        D1E3
46B0BE   0 [EBP] EBX ADD                035D00
46B0C1   4 # EBP ADD                    83C504
46B0C4   0 [EBP] EAX MOV                8B4500
46B0C7   EBX 0 [EBP] MOV                895D00
46B0CA   EAX EBX MOV                    8BD8
46B0CC   20 # EBX ADD                   83C320
46B0CF   0 [EBX] EBX MOV                8B1B
46B0D1   4 # EBP SUB                    83ED04
46B0D4   EBX 0 [EBP] MOV                895D00
46B0D7   EBX SHL                        D1E3
46B0D9   0 [EBP] EBX ADD                035D00
46B0DC   4 # EBP ADD                    83C504
46B0DF   0 [EBP] EBX ADD                035D00
46B0E2   4 # EBP ADD                    83C504
46B0E5   16 # EBX ADD                   83C316
46B0E8   4 # EBP SUB                    83ED04
46B0EB   EBX 0 [EBP] MOV                895D00
46B0EE   68F34 [EDI] EBX LEA            8D9F348F0600
46B0F4   C # EBX ADD                    83C30C
46B0F7   0 [EBP] EAX MOV                8B4500
46B0FA   EAX 0 [EBX] MOV                8903
46B0FC   4 [EBP] EBX MOV                8B5D04
46B0FF   8 # EBP ADD                    83C508
46B102   RET                            C3 ok

33 relevant instructions. Not too bad.

Here is iForth. The optimizer vocabulary doesn't know about parsing
numbers yet, therefore LITERAL must be used explicitly. There is no
object generator, the output is Forth assembler (text).

iForth version 1.12.3230, generated 14:25:06, December 2, 2001.
i6 binary, native floating-point, double precision.
Copyright 1996 - 2002 Marcel Hendrix.

optimizer> CREATE ape 20 4 * ALLOT  ok
optimizer> insRESET


                + 22 LITERAL +
                ape LITERAL 3 LITERAL CELLS + !
           insFLUSH
$004BA300 dword-ptr -> ebx mov,
$004BA310 dword-ptr -> ecx mov,
[ecx ecx*2 0 +] -> ecx lea,
[ebx ebx*2 0 +] -> ebx lea,
[ebx ecx*1 #22 +] -> eax lea,
eax -> $004BA2FC dword-ptr mov, ok

Six instructions.

It will be very interesting to benchmark this against current
C compilers (this stage is still a "few weeks" off :-)

-marcel



Wed, 26 May 2004 03:56:51 GMT  
 0.01 dragons slain

[..]

Quote:
> It will be very interesting to benchmark this against current
> C compilers (this stage is still a "few weeks" off :-)

[..]

No C-compilers yet, but it is already possible to insert the generated code
for basic blocks by hand in Forth benchmarks.

Let's try the ubiquitous Sieve from MPE's benchm.fth:

: DO-PRIME
     #1000 0 DO
                FLAGS SIZE 1 FILL
                0 SIZE 0
                  DO
                    frag1
                       IF frag2
                          BEGIN  DUP SIZE
                       U< WHILE  frag3
                          REPEAT
                          frag4
                    ENDIF
                LOOP
                DROP
           LOOP ;

The fragx's are the basic blocks. Stacks pictures ( IN/OUT ) for these
blocks are found by SEEing the code for the standard iForth (see below).
Given that, the assembly language text of the frags is generated by the
experimental optimizer and inserted in the word with a macro.

Here is a comparison for my two machines (old Pentium/newer Athlon)
between two compilers (VFX 3.3 and the experimental iForth).

Compiler           | VFX 3.30   | iForth 1.12.3230
-------------------+------------+-----------------
Machine P54C-166   |  1832 ms   |      1119 ms
Machine Athlon-900 |   350 ms   |       240 ms

(The current iForth on the Athlon runs this benchmark in 340 ms.)

I'm somewhat amazed that such a speedup is still possible. Evidently,
when the basis blocks become very small one instruction more or less
has a very large influence.

-marcel

PS: Code for the basic blocks above is as follows:

ALSO ASSEMBLER
: frag1 0 1 IN/OUT
        POSTPONE ASM{
        [ebp 0 +] -> ebx mov,
        [ebx FLAGS +] -> ebx movzx,
        }ASM ADJUST-STACK ; IMMEDIATE COMPILE-ONLY

: frag2 0 2 IN/OUT
        POSTPONE ASM{
        [ebp 0 +] -> ebx mov,
        1 b# -> ebx shl,
        [ebp 0 +] -> eax mov,
        [ebx 3 +] -> ecx lea,
        [ebx eax*1 3 +] -> ebx lea,
        }ASM ADJUST-STACK ; IMMEDIATE COMPILE-ONLY

: frag3 2 2 IN/OUT
        POSTPONE ASM{
        0 b# -> [ebx FLAGS +] byte mov,
        [ebx ecx*1 0 +] -> ebx lea,
        }ASM ADJUST-STACK ; IMMEDIATE COMPILE-ONLY

: frag4 2 0 IN/OUT
        POSTPONE ASM{
        [esp 0 +] dword inc,
        }ASM ADJUST-STACK ; IMMEDIATE COMPILE-ONLY
PREVIOUS



Thu, 27 May 2004 23:48:38 GMT  
 0.01 dragons slain
Here is a clip from the The Do's and Don'ts of Posting on Google
Groups

Use descriptive subject lines.

The subject line enables a person with limited time to decide whether
or not to read your article. A title like "Car for Sale" posted to
rec.autos doesn't convey as much as "66 MG Midget for sale: Campbell
CA". Keep your subjects short and to the point.

Thankd



Fri, 28 May 2004 03:49:27 GMT  
 
 [ 4 post ] 

 Relevant Pages 

1. strongForth 0.01 available

2. ANNOUNCEMENT: TOM, version 0.01

3. ANN: TOGA (Timetables Optimised with Genetic Algorithms) v 0.01

4. PyStemmer 0.01 available

5. ANNOUNCE: SCons 0.01 now available

6. TMDA 0.01 - A qmail-based anti-SPAM system

7. TreeWidget 0.01

8. PyKstat 0.01 - Solaris kstat(3k) access

9. mpegCam 0.01

10. pmpp 0.01 - HTML Pre-Processor

11. Rubum-tools version 0.01

12. Ruby-ODE (Open Dynamics Engine Ruby binding) 0.01

 

 
Powered by phpBB® Forum Software