========== Static Type Checking for POP-11 ===============================

For some years I have been trying to get static type checking working for

POP-11. I have just achieved a substantial measure of success, having checked

all defined procedures in a POP-11 file of some 1049 lines, including one

single procedure of some 79 lines which employs some fairly characteristic

open-stack use. Moreover, the type-checker found some actual bugs in what was

a modified version of an already debugged program. Incidentally, one of the

most useful roles for the typechecker -is- in modifying a complicated program

to perform essentially the same computation but with different call

conventions, data-structures or whatever.

Declarations are made quite independent of the main program text, using a

single macro "declare". The types of most procedures have to be declared, but

that of their local variables is inferred.

Parametric polymorphism and overloading are supported. For example, the

checker will typecheck a definition of -applist-, overloaded with two

signatures which encompass many of the uses of that versatile procedure.

Below is an initial stab at documentation for the type checker:

CONTENTS - (Use <ENTER> g to access required sections)

-- Introduction

-- Subsystem required

-- The concept of type

-- Monologs

-- Pseudonyms

-- Primitive types

-- Singletons

-- Unit and Top types

-- The Syntax of Type Declarations

-- The Product of Types - type "*" type

-- The Union of Types - type "+" type

-- The Kleene Star and related constructs - "{" type "}"

-- Procedure Types and Grammatical Quotient - type "->" type, type "/" type

-- Monologs are expressed by the form: "<" type ">"

-- Parametrically Polymorphic Monologs

-- Universal Quantification.

-- Type Functions - Hd and Tl

-- Recognisers

-- Overloading

-- Parametric Polymorphism of Procedures.

-- The "Int" type versus the "Integer" type.

-- How the type-checker works

-- The code predicate analyses a list of VM instructions

-- An environment holds declarations, productions and constraints

-- Predicates for basic operations on type-grammars

-- subtype(Ty1,Ty2,LE)

-- sum(Ty1,Ty2,Ty)

-- product(Ty1,Ty2,Ty) forms the product of grammars.

-- divide(Ty1,Ty2,Ty_quot,Ty_div,Env_in,Env_out)

-- solve_grammar solves productions and equational constraints.

-- Limitations

-- Limitations of current system - Status report

-- A Hard Example

-- References

Introduction

------------

Consider the POP-11 procedure:

define test_fail(x);

lvars x; x + 'fred'

enddefine;

If we have the type-checker loaded, and do

<enter> lcpt

("load current procedure checking type") the type-checker will print an error

message.

Error in type-checker:

In procedure " test_fail "

Can't resolve constraints .ty14 = <_2>*(String) / (Int)*(Int)

This means "I had on the stack something of undetermined type (<_2>) and

a string, and I needed a pair of integers.

However, if your POP-11 is OK, you will get something more reassuring

In response to:

define fact(n); lvars n;

if n=0 then 1 else n*fact(n-1) endif

enddefine;

you will get, after a commentary whose verbosity you can limit:

Type -- fact : ((Int) -> (Int))

Here the (...) brackets around "Int" indicate simply that "Int", which is an

alias for the actual type representation, has been expanded using a

type-equation.

Another type for -fact- is also correct, and can be verified by making the

declaration:

declare fact:Number->Number;

before invoking the type-checker. We say that -fact- is -overloaded-.

We can specify the type of lists with more precision than is available in SML:

declare solve_quadratic :

Number*Number*Number-> (List of (Number*Number));

define solve_quadratic(a,b,c);

lvars a,b,c,d = sqrt(b**2-4*a*c);

[% (-b + d)/(2*a), (-b -d)/(2*a) %]

enddefine;

That is to say "solve_quadratic" takes three numbers and returns a list

of -exactly- two numbers.

The types -inferred- for local variables may also be printed out. For

-solve_quadratic- they are:

Variable declarations in procedure

d : (Number)

c : (Number)

b : (Number)

a : (Number)

An annotated version of the Virtual Machine Code can also be output.

In the extract which follows, the "CODE" column is the VM instruction, the

"Type of STACK" specifies what is on the stack, and the "Val" column is a

limited symbolic representation of the value of the top of the stack. This is

used primarily for conditionals. The forms "_1", "_2" ... are unbound Prolog

variables.

5 pop(c) Number*Number

6 pop(b) Number

7 pop(a) ()

8 push(b) (Number) b

9 pushq(2) (Number)*(Int) b, 2

10 callw(**) Number b ** 2

11 pushq(4) (Number)*(Int) b ** 2, 4

12 push(a) (Number)*(Int)*(Number) .., 4, a

13 callw(*) (Number)*(Number) b ** 2, 4 * a

14 push(c) (Number)*(Number)*(Number) .., 4 * a, c

15 callw(*) (Number)*(Number) b ** 2, 4 * a * c

16 callw(-) Number b ** 2 - 4 * a * c

17 callw(sqrt) Number sqrt(b ** 2 - 4 *

a * c)

The "open stack" aspects of POP-11 are supported. Given just the definition:

define oneton(n);

lvars i=0,n;

[% while i<n do i,i+1->i endwhile %]

enddefine;

the type-checker will infer:

Type -- oneton : (Number) -> <List of {(Int)}>

that is to say "-oneton- takes a number and produces a list of integers".

The {....} brackets mean the Kleene-star operation, in this case "zero or more

short integers". The <....> brackets enclose "monologs", which will be

explained later.

Subsystem required

------------------

The type checker requires Prolog to be loaded. If Prolog is not loaded, all

declarations will be ignored.

The concept of type

-------------------

A -type- is a sequence of zero or more POP-11 objects. (The word "item" is

used in 2 ways in the POP-11 literature, either as something that is read in

by the -itemiser- or as single thing that exists internally. We use "object"

to mean something that exists internally to POP-11, while using "token" to

mean something read in by the itemiser.)

Such sequences can be regarded formally as a -language-, whose alphabet is

POP-11 objects (see Hopcroft and Ullman [1979] for a definition of language).

A language is described by a -grammar-. The descriptive apparatus that the

type-checker provides for users to describe types is based on -regular

expressions-. For example, the type-grammar:

Int * Number

characterises a sequence of two objects, the first of which is a short integer

and the second of which is a number.

Monologs

------------

A monolog is a type all of whose members are sequences of length one. In

POP-11, the type of any (non-active) variable is a monolog. Thus -Int- is

a monolog. Angle brackets <...> are used to enclose monologs in certain

circumstances.

Pseudonyms

--------------

A pseudonym is an identifier which is equated to a type. Thus -Integer- is

a pseudonym for objects whose dataword is either "integer" and "biginteger".

Primitive types

-------------------

A primitive type is a set of objects which have the same -datakey-. For

example -Int- is a built-in pseudonym for the primitive type of short

integers.

Singletons

--------------

A singleton is a type which consists of a monolog of just one object. For

example -False- is a singleton which consists of a sequence of just the

<false> object.

Unit and Top types

----------------------

-Unit- is a synonym for the type which consists just of the null sequence.

-Top- is the -maximum legal monolog-. That is to say it is a monolog which

contains all monologs that can legally be bound to variables. The type-checker

regards it as -illegal- to bind a stack-mark to a variable. -Top- is -not- the

least upper bound of the type-grammar.

The Syntax of Type Declarations

--------------------------------

In the syntax definitions below, we use the convention that non-terminals are

written without distinguishing marks, while terminal symbols are quoted. The

only extension to POP-11 is a -declare- statement. This may be added to an

existing program file, or placed in a separate file.

declaration -> "declare" decls ";"

decls -> decl "," decls

decl -> var_list ":" type

| var "?" type

var_list -> var "," var_list

type -> typeid

| "(" type ")"

| type "*" type

| type "+" type

| "{" type "}"

| "{" type "}" integer

| type "->" type

| "<" var ">"

| type "of" type

| "All" var_list ";" type

| type_function "(" type ")"

| type "/" type

The final, -quotient- form is not supported on input, but may appear in

output.

The non-terminal -var- is a POP-11 word, as is -typeid-. The system uses words

with a prepended ".".

The form "(" type ")" is used in input to override operator precedences.

The Product of Types - type "*" type

---------------------------------------

The form type "*" type is a -product- of regular expressions, and corresponds

roughly to the cartesian product of types in SML. It differs from the SML form

in that it is -associative-, so that (Int*Number)*Word is the same type as

Int*(Number*Word).

The Union of Types - type "+" type

------------------------------------

The union of types is expressed by:

type "+" type

Many built-in types are unions, for
...

**read more »**