Crash-proofing 
Author Message
 Crash-proofing

Can I return to the idea of crash-proofing Forth?
I mean an optional layer of code on top of an ANS Forth which will catch
the errors which all Forth users (and especially beginners) make.

Some of the catchable errors are:
        mismatching >R and R>
        executing a value which is not an execution token
        writing to a code address
        stack underflow
        divide by zero

Can you add to this list?  What other traps can you fall into?

Bye for now                                                      
                                           ____/  /  __  /   /   /   /   /
Chris Jakeman                             /      /  /  _/   /   /   /  /
                                         __/    /  / __    /   /   /_/
at Peterborough, a cathedral            /      /  /   /   /   /   /  \
city 80 miles north of London.       __/    __/  ____/    ___/ __/   _\  
Where do you come from?          
                              Forth Interest Group United Kingdom Chapter
Voice +44 (0)1733 753489      at http://www.*-*-*.com/



Sun, 16 Jan 2000 03:00:00 GMT  
 Crash-proofing

On Wed, 30 Jul 1997 06:26:40 +0100, Chris Jakeman

Quote:

>Can I return to the idea of crash-proofing Forth?
>I mean an optional layer of code on top of an ANS Forth which will catch
>the errors which all Forth users (and especially beginners) make.

>Some of the catchable errors are:
>        mismatching >R and R>
>        executing a value which is not an execution token
>        writing to a code address
>        stack underflow
>        divide by zero

Hi, I've written a crash proof Forth, called 4tH. But that is driven
by a very different kind of engine than usual. These are the errors it
catches. Indefinite loops are not considered to be an error, although
you might time them out if you wish (like some databases do with
queries). In order to do this there are different segments for code,
integers, strings and stringconstants. That makes it easy to limit
access of certain operators. These are the errors I catch:

  "Out of memory",                    \ no comment needed
  "Bad object",                               \ when loading an image
  "Bad string",                               \ stringconstant address out
of bounds
  "Bad variable",                             \ variable address out
of bounds
  "Bad token",                                \ code address out of bounds
  "Bad address",                              \ string addres out of
bounds
  "Bad radix",                                \ BASE not between 2 and 36
  "Bad pointer",                              \ invalid exception
frame
  "Stack overflow",                   \ no comment needed
  "Stack empty",                              \ no comment needed
  "Return stack overflow",            \ no comment needed
  "Return stack empty",                       \ no comment needed
  "Divide by zero",                   \ no comment needed
  "I/O error",                                \ no comment needed
  "Assertion failed" ,                        \ this version got assertions,
like C and Gforth
  "Unhandled exception" ,             \ THROW without CATCH
  "Bad literal",                              \ 4tH specific
  "Undefined name",                   \ word not in dictionary
  "Nesting too deep",                 \ special stack for program
flow
  "No program",                               \ 4tH specific
  "Incomplete declaration",           \ 4tH specific
  "Unmatched conditional",            \ program flow stack underflow
or mismatched reference
  "Unterminated string",                      \ runaway string
  "Null string",                              \ ." " not allowed in
4tH
  "Duplicate name"                    \ warning in Forth

Note that this is hardly an optional layer. This one is hard-coded.
4tH is ANS-Forth compatible up to a point, but not compliant. It is
however, excellently suited for educational purposes. Take a look at
ftp.taygeta/pub/Forth/Applications. Both Unix and DOS/Win versions are
available, lots of docs including a 75 page primer and best of all:
it's free!

Hans



Sun, 16 Jan 2000 03:00:00 GMT  
 Crash-proofing

Quote:

> Can I return to the idea of crash-proofing Forth?
> I mean an optional layer of code on top of an ANS Forth which will catch
> the errors which all Forth users (and especially beginners) make.

> Some of the catchable errors are:
>         mismatching >R and R>
>         executing a value which is not an execution token
>         writing to a code address
>         stack underflow

And overflow! Simply getting the stack effect wrong. E.g. forgetting a
drop somewhere between
IF ... ELSE ... THEN

Quote:
>         divide by zero

Another ulgy mistake (which I make always after 10 years of forth):

cu, Jens

--------------------------------------------------------
JW-Datentechnik GmbH    Tel +49 89 897689-0
Brunhamstr. 21          Fax +49 89 8714548

Germany



Mon, 17 Jan 2000 03:00:00 GMT  
 Crash-proofing


Quote:
>> Some of the catchable errors are:
>>         mismatching >R and R>
>>         executing a value which is not an execution token
>>         writing to a code address
>>         stack underflow

>And overflow! Simply getting the stack effect wrong. E.g. forgetting a
>drop somewhere between
>IF ... ELSE ... THEN

>>         divide by zero

>Another ulgy mistake (which I make always after 10 years of forth):


Having thought some more, I would separate:
        1. preventing a crash
        2. detecting errors which might lead to a crash

1.PREVENTING A CRASH
To take your example,
if you write    SOME-VARIABLE EXECUTE

then the value executed is the address of data, not an execution token,
and will probably lead to a crash.

We might prevent this by re-defining EXECUTE to include some sort of check.

2.DETECTING ERRORS
This might be done by checking the stack action of each word and also by
tracking the type of value (eg address or execution token).
I've done this sort of thing once before, but I don't want to pursue it until we've
exhausted PREVENTING A CRASH.

I suggested:

Quote:
>>         mismatching >R and R>
>>         executing a value which is not an execution token
>>         overwriting code (causing problems when executed later)
>>         stack underflow

Hans has suggested
         Out of memory
and undesired loops are just as troublesome as a crash.

But are there any other ways to crash Forth?

Bye for now                                                      
                                           ____/  /  __  /   /   /   /   /
Chris Jakeman                             /      /  /  _/   /   /   /  /
                                         __/    /  / __    /   /   /_/
at Peterborough, a cathedral            /      /  /   /   /   /   /  \
city 80 miles north of London.       __/    __/  ____/    ___/ __/   _\  
Where do you come from?          
                              Forth Interest Group United Kingdom Chapter
Voice +44 (0)1733 753489      at http://www.forth.org/fig/uk/homepage.htm



Tue, 18 Jan 2000 03:00:00 GMT  
 Crash-proofing

Quote:


> >> Some of the catchable errors are:
> >>         mismatching >R and R>
> >>         executing a value which is not an execution token
> >>         writing to a code address
> >>         stack underflow

Ok, just how do you tell whether a value is an execution token or a code
address?
On some systems code is in a known region of address space, but on many
(possibly
most) it's intermingled with other things.

Quote:

...
> 1.PREVENTING A CRASH
> To take your example,
> if you write    SOME-VARIABLE EXECUTE

> then the value executed is the address of data, not an execution token,
> and will probably lead to a crash.

> We might prevent this by re-defining EXECUTE to include some sort of check.


also
checks that the fetched value is non-zero (all in code, of course).
Zero values are treated as no-ops.  In our practice, that's the only way
we execute vectors of that sort.

Quote:
> 2.DETECTING ERRORS
> ...
> I suggested:
> >>         mismatching >R and R>
> >>         executing a value which is not an execution token
> >>         overwriting code (causing problems when executed later)
> >>         stack underflow
> Hans has suggested
>          Out of memory
> and undesired loops are just as troublesome as a crash.

We do routinely check for "out of memory" in defining/allotting words,
and for stack underflow.  But how could you detect "overwriting code"?

--
===============================================
Elizabeth D. Rather              1-800-55-FORTH
FORTH Inc.                         310-372-8493
111 N. Sepulveda Blvd.        Fax: 310-318-7130
Manhattan Beach, CA 90266
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
===============================================



Tue, 18 Jan 2000 03:00:00 GMT  
 Crash-proofing

Quote:

>Having thought some more, I would separate:
>        1. preventing a crash
>        2. detecting errors which might lead to a crash

>1.PREVENTING A CRASH
>To take your example,
>if you write    SOME-VARIABLE EXECUTE

>then the value executed is the address of data, not an execution token,
>and will probably lead to a crash.

>We might prevent this by re-defining EXECUTE to include some sort of check.

>2.DETECTING ERRORS
>This might be done by checking the stack action of each word and also by
>tracking the type of value (eg address or execution token).
>I've done this sort of thing once before, but I don't want to pursue it until we've
>exhausted PREVENTING A CRASH.

>I suggested:
>>>         mismatching >R and R>
>>>         executing a value which is not an execution token
>>>         overwriting code (causing problems when executed later)
>>>         stack underflow
>Hans has suggested
>         Out of memory
>and undesired loops are just as troublesome as a crash.

>But are there any other ways to crash Forth?

I see this goes into larger detail. May be I should elaborate on that. I don't
think that a single layer can tackle all problems. I think though that it is
possible to tackle this problem by using the same approach I took:
- Segment your Forth
  o Code
  o Data
     o Different datatypes
        o Constants
        o Variables
- Restrict acces to the proper segments
- Do bounds checking


since they can only operate within a single segment. When you make some
segments (generally) read-only you can narrow access even more. Which is not

quickly get out of bounds when accessing data. I'm convinced you could use this
approach even in classical implementations. May be different vocubularies could
be used that do more or less strict checking. Thus giving the developer a
choice between compiling in 'safe' or 'fast' mode, even with different levels.

I don't think you can balance e.g. >R and R>, especially with words like
CS-PICK and CS-ROLL. You CAN check whether the value EXIT pops is within bounds
and if not, ABORT. May be that means ripping apart the Forth dictionary, but
who cares.. In my experience Forth can accept quite a lot, without sourcecode
changes, as long as you do your programming properly. E.g. like changing the
parameter field to a parameter pointer and a separate segment for parameter
fields (code is a separate segment too). Just an idea Ben Hoyt and I have been
playing with. Any comment? Too wild an idea?

Hans



Tue, 18 Jan 2000 03:00:00 GMT  
 Crash-proofing


Quote:

>> We might prevent this by re-defining EXECUTE to include some sort of check.
>Ok, just how do you tell whether a value is an execution token or a code
>address?
>On some systems code is in a known region of address space, but on many
>(possibly most) it's intermingled with other things.

I can remember a DTC FORTH implementation, where the first machine
opcode was either a NOOP, or a specific CALL instruction for high level
words. You can check if the "address" given to EXECUTE points to such an
opcode.

This gives some speed penalty in normal execution. So an alternative can
be that any valid FORTH word MUST be preceded by a magic number. The two
bytes "OK" seem quite appropriate...  ;-)

        Bart Lateur



Wed, 19 Jan 2000 03:00:00 GMT  
 Crash-proofing

Just a quick question.

Does any know if there is an example of the ANS forth floating point
word set released anywhere under the GNU GPL or similar. I want to do a
set for the Arm and need somewhere to start.



Wed, 19 Jan 2000 03:00:00 GMT  
 Crash-proofing

Quote:

> I don't think you can balance e.g. >R and R>, especially with words like
> CS-PICK and CS-ROLL. You CAN check whether the value EXIT pops is within bounds
> and if not, ABORT.

For checking >R and R> balancing at run-time, you simply have another
stack on which you push the current return stack depth when entering a
colon definition. On EXIT, you compare the current return stack depth
with the top of this special stack, and then you pop this value from
this stack.

Static return stack depth checking is not possible in general; the
checker would complain about legal programs; however, in practice most
code has a statically determinable data stack depth (WRT the start of
the definition), and even more code has a statically determinable
return stack depth (WRT the start of the definition). It is quite
feasible to do such a check, and neither CS-PICK nor CS-ROLL pose any
problems for this. I can elaborate if you want.

- anton
--
M. Anton Ertl                    Some things have to be seen to be believed

http://www.complang.tuwien.ac.at/anton/home.html



Fri, 21 Jan 2000 03:00:00 GMT  
 Crash-proofing


Quote:


Bezemer) writes:
>> I don't think you can balance e.g. >R and R>, especially with words
>> like CS-PICK and CS-ROLL. You CAN check whether the value EXIT pops
>> is within bounds and if not, ABORT.
>For checking >R and R> balancing at run-time, you simply have another
>stack on which you push the current return stack depth when entering a
>colon definition. On EXIT, you compare the current return stack depth
>with the top of this special stack, and then you pop this value from
>this stack.

That sounds like a good idea.  Why not go farther and actually push the
values onto the extra stack?  Then if you fail to balance it, you get
to notice that it's not balanced but you don't crash.

Quote:
>Static return stack depth checking is not possible in general; the
>checker would complain about legal programs;

I've lost you here.  Why isn't it possible?
Quote:
>however, in practice most
>code has a statically determinable data stack depth (WRT the start of
>the definition), and even more code has a statically determinable
>return stack depth (WRT the start of the definition). It is quite
>feasible to do such a check, and neither CS-PICK nor CS-ROLL pose any
>problems for this. I can elaborate if you want.



Sat, 22 Jan 2000 03:00:00 GMT  
 Crash-proofing

Quote:



> >For checking >R and R> balancing at run-time, you simply have another
> >stack on which you push the current return stack depth when entering a
> >colon definition. On EXIT, you compare the current return stack depth
> >with the top of this special stack, and then you pop this value from
> >this stack.

> That sounds like a good idea.  Why not go farther and actually push the
> values onto the extra stack?  Then if you fail to balance it, you get
> to notice that it's not balanced but you don't crash.

You mean: keep the return address on the extra stack? That's certainly
possible.

BTW, the return stack also has to be balanced within the body of a
DO..LOOP. The technique could be extended for this application.

Quote:
> >Static return stack depth checking is not possible in general; the
> >checker would complain about legal programs;

> I've lost you here.  Why isn't it possible?

: foo ( n -- )
 dup begin
   dup 0>=
 while
   0 >r 1-
 repeat
 drop
 begin
   dup 0>=
 while
   r> drop 1-
 repeat
 drop ;

This is a legal program for all input values. It would be pretty hard
to check statically that it is balanced for all input words. In
general I think the problem is undecidable.

- anton
--
M. Anton Ertl                    Some things have to be seen to be believed

http://www.complang.tuwien.ac.at/anton/home.html



Sun, 23 Jan 2000 03:00:00 GMT  
 Crash-proofing

Quote:

> NNTP-Posting-Host: lax-ca38-54.ix.netcom.com
> X-NETCOM-Date: Wed Aug 06  8:29:27 AM PDT 1997


> (Anton Ertl) writes:


> >> That sounds like a good idea.  Why not go farther and actually push
> >>the values onto the extra stack?  Then if you fail to balance it, you
> >>get to notice that it's not balanced but you don't crash.

> >You mean: keep the return address on the extra stack? That's certainly
> >possible.

> My thought was to leave the real return address on the real return
> stack, and put a nest-sys token on the extra stack to represent it.

I'm not sure my original point got across, which is: There's an extra
stack that contains one item per active call. This item is the return
stack depth on entry to the definition; the EXIT (whether explicit or
implicit through ";") compares the current return stack depth depth
with the old return stack depth on the extra stack. If the depths are
equal, everything is ok, otherwise, you get an error message (not a
crash).

Quote:
> Ah.  Good point.  Suppose that the static type-checker warns when a
> loop leaves an unbalanced return stack as well as when a definition
> does.  When you really need this construction you could ignore the
> warning.  Are there other situations where it can't work?

This can happen with any control structure of sufficient complexity
(not just pairs of loops), and the solution is the same: issue a
warning; warnings are probably not frequent enough to be a bother (at
least for return stack depth checking).

- anton
--
M. Anton Ertl                    Some things have to be seen to be believed

http://www.complang.tuwien.ac.at/anton/home.html



Mon, 24 Jan 2000 03:00:00 GMT  
 Crash-proofing


[..]

Quote:
> OK.  My thought was to have an extra stack that held the actual values,
> and held dummies for actual returns.  You can not only check the depth,
> you can check that the top item at return is really a return.  (At
> least within error 1/MAXNUM.)  You don't have to CATCH for every user
> word.  It could have some advantages although it's extra overhead,

[ etcetera ]

Quote:
> I've done this and it works, but it's extra complexity and extra
> overhead.

Is it possible to give data which errors this catches, for whom under
which circumstances? I do not recognize my problems with Forth _at
all_ in this discussion.

Parameter and return stackbugs may happen when I enter a {*filter*} at the
keyboard (the OS and hardware exception handlers catch these immediately),
but IMO that's not a problem to rethink Forth for.

I've never found a stackbug in old code, Forth with a stackbug is almost
sure not to work at all. Occasionally I encounter out of bounds memory
accesses through pointers (like in Anton's BIND,). These are more difficult
to find, because you need intimate knowledge of the datastructures involved,
and the documentation is sometimes inadequate (which field is an offset,
which an address, is it initialized now or later or never, alignment...).

I can imagine it is nice to have a very safe system when compiles are slow
or the program is enormous. I remember being very frustrated with Charles
Moore's CMForth because it didn't have compiler security for control
structures. It happily compiled 32K through a serial 9600 baud line, then
crashed with the very first word I tried.

I've spend many, many days deciphering other people's Forth programs. The
biggest problems are:
- blocks (no room for documentation, _DON'T_ tell me this
  is a lie, just read through the older FD's, even Forth Inc's
  programmers squeeze[d])
- stack gymnastics -- 6 PICK 4 ROLL, with the old offset-by-one problem
- In the old days I also had constant fights with COMPILE [COMPILE] [ ]
  IMMEDIATE and tricks with the Forth vocabularies.
- lack of input/output test data with which to verify a word or the
  complete program. This is the biggest problem of all! The FSL has done
  away with this problem...

The differences between ." and .( , ASCII, IF not being able to be used
interactively, it being ": name" not "name :" -- no problem. Using CREATE
DOES> never bothered me, implementing it is another story.

-marcel



Mon, 24 Jan 2000 03:00:00 GMT  
 
 [ 24 post ]  Go to page: [1] [2]

 Relevant Pages 

1. crash-proofing -- stack-effect checking

2. Crash-proofing - execution tokens

3. Crash proofing a clip applic's input

4. Eiffel and Crash-Proof Computing

5. Crashproofing - test cases

6. Ariane Crash (Was: Adriane crash)

7. Where is your proof Ermine Todd?

8. "Death of Proof": reports exagerated

9. Collection API's VI (was: OO-Process - Proof [T])

10. Cool Proofs + Refactoring (Re: Scenes)

11. Speed proof

12. Blowfish algorithm - Is it full proof?

 

 
Powered by phpBB® Forum Software