Re(try): A native-code compiler in Forth 

I know this posted right, since I got replies via e-mail, but I can't see it on
my newsserver! Sorry for the inconsistent posting format. I promise to stop
babbling about boring old Java soon.

I wrote a few days ago:

|After reading the Java Virtual Machine specification, it struck me that since both
|Forth and Java are largely stack-based, it should be relatively easy to get Java
|bytecode "converted" on the fly to Forth, and then run in a threaded fashion.
|The main points that I am unsure about are:

|-Garbage Collection. Is there anything out there that uses this already? Forth
|probably isn't the kind of language where this is really needed, but Java
|requires it.

|-Exceptions. Is the ANS Exception word set pretty common?
|I.e., would a generic C-coded PC version be hard to come by?

|-Fixups. The VM needs to ensure type safety, which can largely be done at load time, but
|certain operations can only be checked at run time. Sun, in their VM spec, mentions
|optimizations that involve overwriting the opcodes with (transparent) less costly ones.
|Can somebody give me a working example of this?
|I.e., I want to do:

\ this code would be hand coded

: invokevirtual ( _constant_ref -- item ) \ load constant data
  ( check that constant[_constant_ref] is a class )
  ( check that the class is loaded )
  ( check that the virtual method exists )
  ;

: invokevirtual_quick ( _constant_ref -- item )
  ( do normal processing )
  ;

\ this code would be machine generated from Java .class file (bytecode)

: java-method ( -- )  \ method parameters are passed into local variables
  <op> <op> invokevirtual <op> ...
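For what it's worth, here is a minimal sketch of the opcode-overwriting idea, assuming the bytecode stays in memory and is run by a small table-driven inner loop rather than compiled into colon definitions. The opcode numbers and the names `handlers`, `operand`, `checks`, `calls` and `demo-method` are made up for illustration; they are not the real JVM encodings, and the "checks" are just a counter standing in for the class and method tests.

```forth
\ Hypothetical opcode numbers, NOT the real JVM encodings.
0 CONSTANT OP-HALT
1 CONSTANT OP-INVOKE          \ slow form: checks, then rewrites itself
2 CONSTANT OP-INVOKE-QUICK    \ fast form: no checks

VARIABLE ip                   \ address of the next bytecode to fetch
VARIABLE running              \ true while the inner loop should continue
VARIABLE checks  0 checks !   \ how many times the slow path ran
VARIABLE calls   0 calls !    \ how many times the call itself ran

CREATE handlers 3 CELLS ALLOT \ opcode -> execution token

: handler! ( xt opcode -- )  CELLS handlers + ! ;
: operand ( -- u )  ip @ C@  1 ip +! ;   \ fetch a one-byte operand

: do-halt ( -- )  FALSE running ! ;
: do-invoke-quick ( -- )  operand DROP  1 calls +! ;
: do-invoke ( -- )
  1 checks +!                   \ stands in for the class/method checks
  OP-INVOKE-QUICK ip @ 1- C!    \ overwrite our own opcode byte in place
  do-invoke-quick ;             \ then behave like the quick form

' do-halt          OP-HALT          handler!
' do-invoke        OP-INVOKE        handler!
' do-invoke-quick  OP-INVOKE-QUICK  handler!

: run ( addr -- )
  ip !  TRUE running !
  BEGIN running @ WHILE
    ip @ C@  1 ip +!            \ fetch opcode, advance past it
    CELLS handlers + @ EXECUTE  \ dispatch by table index
  REPEAT ;

CREATE demo-method  OP-INVOKE C,  7 C,  OP-HALT C,

demo-method run  demo-method run  demo-method run
checks @ .  calls @ .           \ prints 1 3
```

The slow handler runs once, patches the byte it was fetched from, and every later pass over the same method goes straight to the quick handler. This only works if the bytecode lives in writable memory, which is the case for data laid down with `CREATE` and `C,`.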

Outline of JVM and why I'm thinking Forth:
[snipped for brevity]
| ...All integers are stored in big-endian order. The stack is 32-bits wide.  
|Floating point arithmetic is done with IEEE-754 numbers. (does this have any
|significance to a Forth implementation? ie Do recent Intel processors use

|My plan is to build a disassembler, then make words for each of the opcodes that
|act as necessary. Current implementations of Java use a switch statement to
|dispatch on opcodes; this should be much faster.

|Only one namespace for the Forth code is necessary, so I can handle that. I
|haven't checked in great detail, but I think objects could be reference-counted

|Jump opcodes are another tricky part I haven't resolved in my mind yet.
Elaboration: is the data put on the R-stack for IF ELSE THEN constructs during
compiling at all portable? Since jump opcodes and their destinations may
overlap, I can't use IF directly... I need a (yecch) goto!
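One way around the goto problem, sketched under the assumption that the bytecode is interpreted from memory instead of being turned into IF ... THEN structures: if the interpreter keeps its own instruction pointer, a jump is just a store into that pointer, and overlapping jumps stop being an issue. (ANS Forth's optional Programming-Tools extension words `AHEAD`, `CS-PICK` and `CS-ROLL` exist for the compile-time case, but I have not tried them here.) The opcodes, the one-byte absolute branch target and the names `jump-to`, `code-base` and `laps` are all invented for the example; real JVM branches use signed 16-bit offsets relative to the branch instruction.

```forth
\ Hypothetical opcodes, NOT the real JVM encodings.
0 CONSTANT OP-HALT
1 CONSTANT OP-PUSH     \ operand: one-byte literal
2 CONSTANT OP-DEC      \ decrement top of stack
3 CONSTANT OP-DUP
4 CONSTANT OP-IFNE     \ operand: one-byte target, offset from method start

VARIABLE ip            \ address of the next bytecode to fetch
VARIABLE code-base     \ start address of the running method
VARIABLE running
VARIABLE laps          \ counts executed OP-DEC, just for the demo

CREATE handlers 5 CELLS ALLOT
: handler! ( xt opcode -- )  CELLS handlers + ! ;
: operand ( -- u )  ip @ C@  1 ip +! ;
: jump-to ( offset -- )  code-base @ + ip ! ;   \ the "goto"

: do-halt ( -- )  FALSE running ! ;
: do-push ( -- n )  operand ;
: do-dec  ( n -- n-1 )  1-  1 laps +! ;
: do-dup  ( n -- n n )  DUP ;
: do-ifne ( n -- )  operand SWAP IF jump-to ELSE DROP THEN ;

' do-halt OP-HALT handler!   ' do-push OP-PUSH handler!
' do-dec  OP-DEC  handler!   ' do-dup  OP-DUP  handler!
' do-ifne OP-IFNE handler!

: run ( addr -- )
  DUP code-base !  ip !  TRUE running !
  BEGIN running @ WHILE
    ip @ C@  1 ip +!
    CELLS handlers + @ EXECUTE
  REPEAT ;

\ offset: 0 PUSH 3 / 2 DEC / 3 DUP / 4 IFNE 2 / 6 HALT
CREATE countdown
  OP-PUSH C, 3 C,  OP-DEC C,  OP-DUP C,  OP-IFNE C, 2 C,  OP-HALT C,

0 laps !
countdown run .  laps @ .      \ prints 0 3
```

The backward branch at offset 4 lands on the `OP-DEC` at offset 2 three times before the counter reaches zero. The Forth data stack doubles as the Java operand stack here, which is a simplification.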

|Any help appreciated,

|-Tony Lownds

Here's the response I got from (which was also posted):

|Why is it that everyone who encounters garbage collection for the first
|time thinks that reference counting will better do the trick?
Probably because it's the only solution that's obvious to implement.

|You should
|take a look at a recent thread on c.l.lisp and see why this assumption is
Thanks for the tip, I found some (possibly) helpful papers.

|What's the scope of your project ? You only want to provide the Java VM ?
|Right ?
|For inclusion in what ? A browser of your own ? A general VM to execute
|Java applets within ?
A general VM.

|Obviously, you don't seem to be interested in getting Java source code and
|compile the corresponding byte-codes ?

|Maybe your best bet would be to make a token-threaded forth, where all the
|opcode of Java byte-code would magically happen to be the token of the
|corresponding forth word ?
I'm thinking of that kind of thing, but in two steps: a converter from bytecode
to a Java Assembler language, and then compiling the Java Assembler with a
simple colon definition. The converter will dispatch by opcode into a table by
index rather than looking up an opcode's translator word in the dictionary.
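A rough sketch of the table-indexed converter, collapsing the two steps into one for brevity: each byte indexes a table of execution tokens, and the converter appends the selected word to an anonymous definition with `COMPILE,`. The four opcodes and the names `translators`, `translate`, `src`, `remaining` and `bytecode>xt` are hypothetical, operands and branches are ignored, and it assumes a system that provides `:NONAME` and `COMPILE,` from the Core extension word set.

```forth
\ Hypothetical opcodes, NOT the real JVM encodings.
0 CONSTANT OP-ONE    \ push 1
1 CONSTANT OP-TWO    \ push 2
2 CONSTANT OP-ADD
3 CONSTANT OP-DUP

: op-one ( -- 1 )  1 ;
: op-two ( -- 2 )  2 ;
: op-add ( a b -- a+b )  + ;
: op-dup ( a -- a a )  DUP ;

\ Translator table: entry n holds the xt for opcode n.
CREATE translators  ' op-one ,  ' op-two ,  ' op-add ,  ' op-dup ,

VARIABLE src         \ address of the next bytecode to translate
VARIABLE remaining   \ bytes left to translate

\ Append the word for one opcode to the definition being built.
: translate ( opcode -- )  CELLS translators + @ COMPILE, ;

\ Build an anonymous colon definition from len bytes at addr.
\ src and remaining are variables because :NONAME may leave
\ a colon-sys of unknown size on the data stack.
: bytecode>xt ( addr len -- xt )
  remaining !  src !
  :NONAME
  BEGIN remaining @ WHILE
    src @ C@ translate
    1 src +!  -1 remaining +!
  REPEAT
  POSTPONE ; ;

CREATE bytes  OP-ONE C,  OP-TWO C,  OP-ADD C,  OP-DUP C,  OP-ADD C,

bytes 5 bytecode>xt EXECUTE .   \ prints 6
```

The generated definition is equivalent to `: x op-one op-two op-add op-dup op-add ;`, so after conversion there is no per-opcode dispatch left at run time, and no dictionary search happens during conversion either.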

|Finally, beware that Java does support 64 bit integers, and that making the
|stack 32 bits wide is not so clear-cut a choice... Finally, since Java is
|a more ``traditional'' language than Forth, with respect to algebra, the
|choice of making a unique stack for both integer and floating points or
|separate stacks plus special opcodes to convert back and forth is not so
|clear either... maybe a unique stack with 80 bits cells
|(sizeof(IEEE-754-double)) ?
The JVM specifies a 32-bit stack... and isn't IEEE-754-double 64 bits wide? Long
double is 80 bits wide, at least in C terminology.
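On the 64-bit integer point: the JVM lets a long occupy two 32-bit stack words, which lines up with Forth's double-cell numbers when a cell is 32 bits. A small sketch, where the names `ladd`, `i2l` and `l2i` are borrowed from the JVM opcodes purely as labels, and the Double-Number word set (`D+`, `D>S`) is assumed to be present:

```forth
\ Java-style long operations as thin wrappers over double-cell words.
\ Assumes one cell = one JVM stack word; on a Forth with 64-bit cells
\ a double is 128 bits, so a real VM would need extra masking.
: i2l  ( n -- d )       S>D ;   \ sign-extend one cell to two
: l2i  ( d -- n )       D>S ;   \ keep only the low cell
: ladd ( d1 d2 -- d3 )  D+ ;    \ two-cell add with carry

-1 i2l  5 i2l  ladd  l2i .      \ prints 4

\ Carry out of the low cell: low = all ones, high = 0, plus 1.
-1 0  1 0  ladd  . .            \ prints 1 0  (high cell, then low cell)
```

The second line shows the carry propagating into the high cell, which is the part a single 32-bit cell cannot represent on its own.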

I hope I can get this VM to work - it'll be my first (significant) project in
Forth; all I know about it right now is the syntax ;)

-Tony Lownds

Mon, 06 Jul 1998 03:00:00 GMT  