Type Checking POP-11 
Author Message
 Type Checking POP-11

==========    Static Type Checking for POP-11 ===============================

For some years  I have been  trying to  get static type  checking working  for
POP-11. I have just achieved a substantial measure of success, having  checked
all defined procedures  in a  POP-11 file of  some 1049  lines, including  one
single procedure of  some 79  lines which employs  some fairly  characteristic
open-stack use. Moreover, the type-checker found some actual bugs in what  was
a modified version of  an already debugged program.  Incidentally, one of  the
most useful roles for the typechecker -is- in modifying a complicated  program
to  perform  essentially  the  same   computation  but  with  different   call
conventions, data-structures or whatever.

Declarations are made quite independent of the main program text, using a
single macro "declare". The types of most procedures have to be declared, but
that of their local variables is inferred.

Parametric polymorphism  and  overloading  are  supported.  For  example,  the
checker  will  typecheck  a  definition  of  -applist-,  overloaded  with  two
signatures which encompass many of the uses of that versatile procedure.

Below is an initial stab at documentation for the type checker:

         CONTENTS - (Use <ENTER> g to access required sections)

 --  Introduction
 --  Subsystem required
 --  The concept of type
 --      Monologs
 --      Pseudonyms
 --      Primitive types
 --      Singletons
 --      Unit and Top types
 --  The Syntax of Type Declarations
 --    The Product of Types - type "*" type
 --    The Union of Types - type "+" type
 --    The Kleene Star and related constructs - "{" type "}"
 --    Procedure Types and Grammatical Quotient - type "->" type, type "/" type
 --    Monologs are expressed by the form: "<" type ">"
 --    Parametrically Polymorphic Monologs
 --    Universal Quantification.
 --    Type Functions - Hd and Tl
 --      Recognisers
 --  Overloading
 --  Parametric Polymorphism of Procedures.
 --  The "Int" type versus the "Integer" type.
 --  How the type-checker works
 --    The code predicate analyses a list of VM instructions
 --    An environment holds declarations, productions and constraints
 --    Predicates for basic operations on type-grammars
 --      subtype(Ty1,Ty2,LE)
 --      sum(Ty1,Ty2,Ty)
 --      product(Ty1,Ty2,Ty)  forms the product of grammars.
 --      divide(Ty1,Ty2,Ty_quot,Ty_div,Env_in,Env_out)
 --    solve_grammar solves productions and equational constraints.
 --  Limitations
 --    Limitations of current system - Status report
 --  A Hard Example
 --  References

Introduction
------------
Consider the POP-11 procedure:

    define test_fail(x);
      lvars x; x + 'fred'
    enddefine;

If we have the type-checker loaded, and do

    <enter> lcpt

("load current procedure checking type") the type-checker will print an error
message.

    Error in type-checker:
    In procedure " test_fail "
    Can't resolve constraints  .ty14 = <_2>*(String) / (Int)*(Int)

This means "I had on the stack something of undetermined type (<_2>) and
a string, and I needed a pair of integers.

However, if your POP-11 is OK, you will get something more reassuring
In response to:

    define fact(n); lvars n;
      if n=0 then 1 else n*fact(n-1) endif
    enddefine;

you will get, after a commentary whose verbosity you can limit:

    Type -- fact  :  ((Int) -> (Int))

Here the (...) brackets around "Int" indicate simply that "Int", which is an
alias for the actual type representation, has been expanded using a
type-equation.

Another type for -fact- is also correct, and can be verified  by making the
declaration:

    declare fact:Number->Number;

before invoking the type-checker. We say that -fact- is -overloaded-.

We can specify the type of lists with more precision than is available in SML:

    declare solve_quadratic  :
          Number*Number*Number-> (List of (Number*Number));

    define solve_quadratic(a,b,c);
      lvars a,b,c,d = sqrt(b**2-4*a*c);
      [% (-b + d)/(2*a), (-b -d)/(2*a) %]
    enddefine;

That is to say "solve_quadratic" takes three numbers and returns a list
of -exactly- two numbers.

The types -inferred- for local variables may also be printed out. For
-solve_quadratic- they are:

  Variable declarations in procedure
    d  :  (Number)
    c  :  (Number)
    b  :  (Number)
    a  :  (Number)

An annotated version of the Virtual Machine Code can also be output.
In the extract which follows, the "CODE" column is the VM instruction, the
"Type of STACK" specifies what is on the stack, and the "Val" column is a
limited symbolic representation of the value of the top of the stack. This is
used primarily for conditionals. The forms "_1", "_2" ... are unbound Prolog
variables.

5   pop(c)              Number*Number
6   pop(b)              Number
7   pop(a)              ()
8   push(b)             (Number)                         b
9   pushq(2)            (Number)*(Int)                   b, 2
10  callw(**)           Number                           b ** 2
11  pushq(4)            (Number)*(Int)                   b ** 2, 4
12  push(a)             (Number)*(Int)*(Number)          .., 4, a
13  callw(*)            (Number)*(Number)                b ** 2, 4 * a
14  push(c)             (Number)*(Number)*(Number)       .., 4 * a, c

15  callw(*)            (Number)*(Number)                b ** 2, 4 * a * c
16  callw(-)            Number                           b ** 2 - 4 * a * c
17  callw(sqrt)         Number                           sqrt(b ** 2 - 4 *
                                                         a * c)

The "open stack" aspects of POP-11 are supported. Given just the definition:

    define oneton(n);
      lvars i=0,n;
      [% while i<n do i,i+1->i endwhile %]
    enddefine;

the type-checker will infer:

    Type -- oneton         : (Number) -> <List of {(Int)}>

that is to say "-oneton- takes a number and produces a list of integers".
The {....} brackets mean the Kleene-star operation, in this case "zero or more
short integers". The <....> brackets enclose "monologs", which will be
explained later.

Subsystem required
------------------
The type checker requires Prolog to be loaded. If Prolog is not loaded, all
declarations will be ignored.

The concept of type
-------------------
A -type- is a  sequence of zero  or more POP-11 objects.  (The word "item"  is
used in 2 ways in the POP-11  literature, either as something that is read  in
by the -itemiser- or as single  thing that exists internally. We use  "object"
to mean something  that exists internally  to POP-11, while  using "token"  to
mean something read in by the itemiser.)

Such sequences can  be regarded formally  as a -language-,  whose alphabet  is
POP-11 objects (see Hopcroft and Ullman [1979] for a definition of  language).
A language is  described by a  -grammar-. The descriptive  apparatus that  the
type-checker provides  for  users  to  describe types  is  based  on  -regular
expressions-. For example, the type-grammar:

    Int * Number

characterises a sequence of two objects, the first of which is a short integer
and the second of which is a number.

    Monologs
------------
A monolog is a type all of whose members are sequences of length one. In
POP-11, the type of any  (non-active) variable is a monolog. Thus -Int- is
a monolog. Angle brackets <...> are used to enclose monologs in certain
circumstances.

    Pseudonyms
--------------
A pseudonym is an identifier which is equated to a type. Thus -Integer- is
a pseudonym for objects whose dataword  is either "integer" and "biginteger".

    Primitive types
-------------------
A primitive type is a set of objects which have the same -datakey-. For
example -Int- is a built-in pseudonym for the primitive type of short
integers.

    Singletons
--------------
A singleton is a type which consists of a monolog  of just one object. For
example -False- is a singleton which consists of a sequence of just the
<false> object.

    Unit and Top types
----------------------
-Unit- is a synonym  for the type  which consists just  of the null  sequence.
-Top- is the -maximum  legal monolog-. That  is to say it  is a monolog  which
contains all monologs that can legally be bound to variables. The type-checker
regards it as -illegal- to bind a stack-mark to a variable. -Top- is -not- the
least upper bound of the type-grammar.

The Syntax of Type Declarations
--------------------------------

In the syntax definitions below, we use the convention that non-terminals  are
written without distinguishing marks, while  terminal symbols are quoted.  The
only extension to POP-11  is a -declare-  statement. This may  be added to  an
existing program file, or placed in a separate file.

declaration  -> "declare" decls  ";"

decls        -> decl "," decls

decl         -> var_list ":" type
               |  var "?" type

var_list      -> var "," var_list

type -> typeid
       |  "(" type ")"
       |  type "*" type
       |  type "+" type
       |  "{" type "}"
       |  "{" type "}" integer
       |  type "->" type
       |   "<" var ">"
       |  type "of" type
       |  "All" var_list ";" type
       |  type_function "(" type ")"
       |  type "/" type

The final, -quotient- form is not supported on input, but may appear in
output.

The non-terminal -var- is a POP-11 word, as is -typeid-. The system uses words
with a prepended ".".

The form  "(" type ")" is used in input to override operator precedences.

  The Product of Types - type "*" type
---------------------------------------

The form type "*" type is a -product- of regular expressions, and corresponds
roughly to the cartesian product of types in SML. It differs from the SML form
in that it is -associative-, so that (Int*Number)*Word is the same type as
Int*(Number*Word).

  The Union of Types - type "+" type
------------------------------------
The union of types is expressed by:

    type "+" type

Many built-in types are unions, for ...

read more »



Wed, 08 May 1996 00:48:48 GMT  
 Type Checking POP-11
Having pored over Robin Popplestone's report on static type-checking
I have to admit that I am impressed.  Although it is not (yet) integrated
into Pop11 in a satisfactory way, his results show that it is surprisingly
effective.

To turn this work into a product, I think there are several points for
improvement.  Here are the points I think are important:

(1) Types must become values rather than names.  Otherwise this work
    will be impossible to extend to sections and lexical types.  

    However, this raises the problem of how type-names in declarations
    get translated into type-values.  (I met this problem in method
    declarations in the ObjectClass work.)  I think that the right
    resolution is to have a separate name space for values and types.

    This suggestion may seem like heresy to those of us who have
    grown up with Pop11 having a single name space.  But there is an
    established and (to my mind) reasonable convention that the
    dataword is a valid variable name i.e.  The local variable -vector-
    is a perfectly sensible name for a typical vector.

        define revv( vector ); .... enddefine;

    There are many parts of Pop11 that would benefit from the ability
    to create new name spaces.  The best examples to date are
    -p_typespecs-, -l_typespecs-, and the Prolog subsystem.  There are
    three obvious implementation techniques:

        (a) Use a special prefix/suffix for names that are conceptually
            in the "other" namespace.  This is the solution adopted
            currently.

        (b) Implement a way of creating new, unpopulated top-level
            sections.  (This is an especially clean solution, in my view.)

        (c) Simply create a hash table from idents to values i.e.
            adds a virtual field to idents.

(2) The scheme for introducing type variables is too clumsy.  I would
    prefer a syntactic marker for type variables, such as ?x for monologs
    and ??x for polylogs.  e.g.

    Instead of
        All a, b; List of {<a>} * (<a> -> <b>) -> List of {<b>}
    one would write
        list of { ?a } * ( ?a -> ?b ) -> list of { ?b }

(3) Instead of using an unfamiliar infix syntax ( "of" ) for type
    parameterisation I would prefer simple brackets.  This makes the extent
    of the parameter much clearer.  In the above example, the type could
    be read in (at least) two ways

        ( List of {<a>} ) * (<a> -> <b>) -> List of {<b>}
    or
         List of ( {<a>} * (<a> -> <b>) -> List of {<b>} )

    Using brackets for type parameterisation, it might look like
    this

         list( ?a ) * ( ?a -> ?b ) -> list( ?b )
    or, to distinguish the brackets,
         list[ ?a ] * ( ?a -> ?b ) -> list[ ?b ]

(4) Some of the limitations in the system would be overcome by employing
    the "high-level" code-planting instructions that I have previously
    described.  For example, instead of writing the "#| ... |#" syntax
    in terms of explicit labels, variables, and arithmetic, one would
    create the instructions sysCOUNT( dummy ) ... sysENDCOUNT( dummy ).

    Defining these high level instructions is reasonably straightforward
    provided sysCOMPILE (and friends) are defined to localise a state
    variable.  This state variable allows the high-level instructions
    to keep a stack of the ``expected'' instructions.  Other examples
    would be high-level loop instructions (sysWHILE/DO/ENDWHILE) and
    conditinals (sysIF/THEN/ELSEIF/ELSE/ENDIF).  These are not especially
    difficult to define -- see the pop9x demo code for an example.

    This would mean that the type-system would be able to reason much
    more effectively about a variety of common constructs.

    Alternatively, adding parse tree creation to Pop11 would be a
    rather superior approach.



Wed, 08 May 1996 23:58:20 GMT  
 
 [ 2 post ] 

 Relevant Pages 

1. Type Checking for POP-11 - a partial reply to Steve K.

2. POP-11 types

3. Type checking in POP (Oops)

4. A POP-11 Book Available via FTP

5. calling prolog from pop-11

6. Pop-11 Procedures for encoding and decoding BASE64 files (mimencode)

7. Development of Pop-11 for Windows

8. GSL interface: another Pop-11 development task?

9. vectors in pop-11

10. Running Pop-11

11. forwarding message from Steve Leach re Running Pop-11

12. Pop-11 XML tools by Steve Leach

 

 
Powered by phpBB® Forum Software