TRAP - generic compiler prototyping with Python 
Author Message
 TRAP - generic compiler prototyping with Python

Dear "Pythoneers":

A trial version of TRAP, a Python-based generic prototyping system
for translators and compilers, is available at
                    http://www.*-*-*.com/
Feedback is very welcome.

[The remainder of this posting provides background info and is
 largely a text-only extract from that web page.]

Best regards,


TRAP
A Python-based generic compiler prototyping system
______________________________________________________________________

OVERVIEW

TRAP is a generic prototyping system for translators and compilers,
especially suited for medium-complexity special purpose languages.

TRAP is written in python and supports (only) the implementation of
translators/compilers in this language. (More information on Python can
be found at www.python.org. )

TRAP integrates concepts from various other compiler construction
systems under the umbrella of Python. The system eases the construction
of front-ends (i.e. lexical and syntactical analyzers), but beyond that
provides substantial support for complex semantic analysis and
transformation phases in which abstract syntax tree-like intermediate
representation (IR) data structures are manipulated.

Relying on an established, very high level, object-oriented
implementation language, TRAP can avoid "reinventing wheels" in its
compiler description language to a large extent. Instead,  Python is
used directly to express algorithmic aspects and auxilliary data
structures of  the compiler being developed. This provides a unique
level of abstraction, but (as Python also is a mature component
integration framework) in addition allows this compiler to be easily
integrated with other components in a large and complex software
system.

TRAP processes a compiler description, which covers (only) the grammar
of the input language and the structure of the IR node hierarchy (the
compiler's main data structure). From this description, TRAP generates
a ready-to-use Python compiler frame module containing the front-end
and  a hierarchy of IR node classes with prefabricated methods (e.g.
for IR pattern matching) that support the subsequent transformations.
The transformations then are implemented directly in Python, using the
generated compiler frame, but with no further direct involvement of
TRAP - there is no separate "transformation language". This means that
standard Python development tools like 'pdb' can be used and that
Pythons rapid development turnaround is not disturbed when working on
transformations.

MAIN FEATURES

Integrated, high level lexics / grammar / initial semantics description

- EBNF-style grammar description (optional and repetitive constituents
with separators & terminators)
- semantics actions specified as Python code directly "in" the
corresponding grammar rule or token definition
- automatic default semantics easing source-to-source translation
- automatic sequence (list/tuple/string) construction semantics for
repetitions and optionals
- type constraints for nonterminals (semantics values are dynamically
checked on rule reduction)

Automatic front-end generation

- By synthesizing appropriate nontertminals and rules (for repetition
and optionals), TRAP reduces the input grammar to an equivalent LR
grammar
- TRAP then invokes (a marginally modified version of) Aaron Watters'
kwParsing package, an LR parser generator written in Python, and
packages the resulting  front-end for the input language into the
generated compiler frame module

High-level IR description

- concise description of node types ordered in a domain hierarchy
- type constraints for node fields (dynamically checked on node
construction)
- constraint types can be: elementary types, nodes, domains, sequences

Automatic IR code and compiler frame generation

- TRAP generates a Python class hierarchy materializing the node/domain
hierarchy specified; in addition standard methods for dumping, type
checking, pattern matching,  traversal and visiting are synthesized.
- generated IR code can be relied upon in the grammar's semantics
actions (e.g. node constructor invocation, pattern matching)
- IR dump method outputs Python constructor syntax -  which is
human-readable and at the same time provides a basic IR persistence
mechanism
- IR dump method bound to special class method '__repr__', allowing
instant IR inspection (also in the de{*filter*})
- all generated code is packaged into a compiler frame module which can
be used (imported) in arbitrary Python programs, but (by means of a
minimal compiler driver which is generated-in, too) can also be
immediately tried out from the (Unix) shell command line.

EXAMPLES

Here is the TRAP description for a toy language:
-----------------------------------------------------------------
compiler t0

# plain grammar, no semantics specified ...

  token INTCONST  r'[1-9][0-9]*'          #  Python regexes
  token IDENT  r'[A-Za-z][A-Za-z0-9_]*'

  nterm operand
    <- INTCONST
    <- IDENT

  nterm expr
    <- operand "+" operand                #  inline literal tokens
    <- operand

  nterm stmt
    <- IDENT "=" expr
    <- "print"  <"," expr*>               #  repetition* w/ separator

  root nterm prog
    <- <stmt+ ";">  <"end"?>              #  repetition+ w/ terminator
                                          #  and optional
-----------------------------------------------------------------
Just the concrete syntax is specified;
with TRAP's automatic default semantics, an (almost) 1:1
source-to-source translator is generated from it.

B. An extended version of the same example.
An IR description part and semantics actions (mostly invoking the
generated node constructors) were added:
-----------------------------------------------------------------
compiler t3

# from concrete to abstract syntax:
# semantics actions = node constructor invocations

  token INTCONST  r'[1-9][0-9]*' :
      int(str) # convert to integer

  token IDENT  r'[A-Za-z][A-Za-z0-9_]*'

  nterm operand::Expr            # type constraint: domain name
    <- INTCONST=i:
       Int_const(i)              # node construction (nc)
    <- IDENT=s :
       V_ref(s)                  # nc

  nterm expr::Expr               # domain name
   <- operand=X "+" operand=Y:
      Bin_expr(X, Y, "+")        # nc
   <- operand

  nterm stmt::Stmt               # domain name
    <- IDENT=i "=" expr=e :
      Asgn_stmt (V_ref(i), e)    # nc (two nodes!)

    <- "print" ["," expr*]=elist :
      Print_stmt ( elist )       # nc

  root nterm prog::Program       # domain name

    <- [stmt+ ";"]=slist <"end"?>=e:
      Program (slist, e)              # nc

  comment r'/\*[^\*]*\*/' # /* ... */

  init :
    def ERR(str):
      import sys
      print("ERROR: "+str)
      sys.exit(1)

# definition of node types ( => Python classes )

  domain Expr
    Bin_expr ( opd1::Expr, opd2::Expr, op::StringType )
    Int_const (val::IntType)
    V_ref(var::<>)

  domain Stmt
    Print_stmt ( exprs::[Expr] )
    Asgn_stmt  (
      lhs::V_ref,                
      rhs::Expr
    )

  Program ( stmts::[Stmt], end::StringType )
-----------------------------------------------------------------

Here is a Python module using the corresponding generated compiler
frame and implementing an IR "unparser".
-----------------------------------------------------------------
#!/usr/bin/env python

# use generated compiler frame

from t3COMP import *

def Unparse(node):
    # explicit recursive traversal
    if node^Program: # match node type
        res=""
        for s in node.stmts: res=res+Unparse(s) + "; "
        return res+node.end
    elif node^Asgn_stmt:
        return Unparse(node.lhs) +"="+Unparse(node.rhs)
    elif node^Print_stmt:
        return "print "+string.joinfields( map(Unparse, node.exprs), ",")
    elif node^Bin_expr:
        return Unparse(node.opd1)+node.op+Unparse(node.opd2)
    elif node^Int_const:
        return str(node.val)
    elif node^V_ref:
        return node.var
    else:
        ERR("Unparse: node of unknown type: "+str(node))

def test():

  # construct AST using generated stuff
  C=Compiler()
  tree, context = C.CompileFile("t3.in")

  # dump tree
  print "TREE:"
  print tree

  # call Unparse()
  print "UNPARSED:"
  up= Unparse(tree)
  print up

  # now send the unparsed output through the compiler again ...
  tree2=C.CompileString(up)[0]
  up2=Unparse(tree2)
  print "UNPARSED2:"
  print up2

  # ... and compare both outputs
  if up==up2:
      print "Equal."
  else:
          print "Not Equal."

if __name__ == '__main__':
    test()
-----------------------------------------------------------------

MORE INFORMATION

- The following short paper gives an overview:
T. Ernst: TRAPping Modelica with Python. 8th Int'l Conf.on
Compiler Construction (CC'99, Amsterdam), to appear in LNCS.
- A draft paper available at
http://www.*-*-*.com/ 's
compiler description language in detail.

DOWNLOAD, INSTALLATION AND USAGE

A trial version including simple examples can be downloaded at
http://www.*-*-*.com/
the kwParsing package are included.
As Python is cross-platform, there is just one major system
requirement: having a working Python (V1.5.1 or higher) interpreter
installed. (Which is quite easy; visit www.python.org for more
information.) After downloading, refer to the README file in the
distribution.

CONTACT

Opinions, bug reports, proposals for improvement, etc. can be sent to:

T. Ernst, 1999/03/18

<P><A HREF=" http://www.*-*-*.com/ ;>TRAP990318</A>
- Generic Translator Prototyping System.  (18-Mar-99)

[Moderator's note: the package contains the following licensing notice:
"Licensing issues have not been worked out yet.  Until further notice,
this trial distribution (which does not contain all sources) can be
downloaded and used freely for evaluation, research and educational
purposes."]

--
----------- comp.lang.python.announce (moderated) ----------

Python Language Home Page:   http://www.*-*-*.com/
Python Quick Help Index:     http://www.*-*-*.com/
------------------------------------------------------------



Sun, 09 Sep 2001 03:00:00 GMT  
 
 [ 1 post ] 

 Relevant Pages 

1. ANN: TRAP - generic compiler prototyping with Python

2. PROPOSAL: A Generic Python Object Interface for Python C Modules

3. PROPOSAL: A Generic Python Object Interface for Python C Modules

4. PROPOSAL: A Generic Python Object Interface for Python C Modules

5. rapid prototyping/scripting/extension with Python

6. Python for prototyping of the user interface

7. Prototyping -- Yet another reason python rules :-)

8. Python, Prototyping, and... Managers

9. Python, JPython & Prototyping

10. rapid prototyping/scripting/extension with Python

11. Newbie Q : Python & rapid functional prototyping

12. rapid prototyping/scripting/extension with Python

 

 
Powered by phpBB® Forum Software