Forth lecture 10. Scripting.(Suite) 

Subject: Forth lecture 10. Scripting.

This is part of an attempt to formulate my Forth ideas in
lectures that are supposed to improve over time and are present
on my web site. Forth Scripting is lecture 10.

General disclaimer. My lectures are intended to convey insight and
show techniques that are hopefully useful. The code is in general
not ISO, nor portable, unless explicitly said so.

The first part of lecture 10 on scripting was about getting rid of
the start-up message, prompting etc. that are inherent
to an interactive system like Forth. Before going on with
environment strings I have a small addition, triggered by a comment
from Anton Ertl.

---------------------------------------------------------------
Simple scripting.

Let's say we have a Forth that shuts up if it senses that we
are talking to it through a channel rather than an interactive
terminal. Then on a Unix system we already have a practical
scripting system, in combination with the powers of the Unix
command interpreters (called "shells").
For example, a script to add 1 to 2 and print the result:

    forth << 'THEEND'
    1 2 + .
    BYE
    THEEND

This uses a feature called a here document. The lines up to
"THEEND" are passed to the forth program as its input.

Of course it is more useful to have a script called add, and pass
it the parameters 2 and 3:

    add 2 3
    5

The script would now look like:

    forth << THEEND
    $1 $2 + . CR
    BYE
    THEEND

The quotes are missing from THEEND. To the shell this means that it must
interpret the lines before passing them on.
In this case $1 gets replaced by 2 and $2 by 3.
The shell will also make the Unix environment available, a set of strings
with information about the environment a program is running in. An
environment variable is identified by a name, not a number.
It is likewise preceded by $, for example "$HOME", and is expanded by
the shell to what it was set to.
Environment variables contain such things as the current
directory, the user's name, and all sorts of information you
care to pass to programs, such as library names, or the
preferred place for video editing and CD writer programs to
write huge scratch files. The most famous is undoubtedly PATH.
It is a list of directories where the shell looks for programs.

Of course passing Forth code through a shell
is dangerous. Unix shells are the kind of tool shown in
that picture of Brodie's. (On my page I will show the
hammer-screwdriver-whatnot if I can get permission.) A shell does so many
things that at least one is unexpected, causing problems.
(Careful people can put all lines between single quotes by
default, but that is ugly.)

As an aside, the command interpreters on MSDOS systems are
plain bad in comparison. The default ones are all called
COMMAND, they change without notice, they are not powerful
and they are not sufficiently documented. There seems to
be an official Korn shell for WINDOWS, but it does not conform
to the specification (says a man named Korn. 1) ) However, that
being said, the above techniques apply to MSDOS mutatis
mutandis and can achieve useful results.
[ 1) I hope that is no urban legend. Even if it is, it is the kind of
anecdote that is true, even if it isn't. ]

The environment

A Forth running on a host operating system needs access to the
information available to all programs running there, called the
environment. This is especially true for scripts, because they
are mostly parts of a large body of cooperating small programs.
We have seen that even a simple Forth can do such scripting,
because a shell can give us the content of parameters and
environment variables. But to get serious, we must be able to
access them directly.

The Unix system, the Bourne shell and the Kernighan&Ritchie
c-compiler were all designed together. No wonder that they cooperate
well. A shell passes the command line arguments and the environment
variables to C as you can see in the declaration of main :

int main(int argc, char *argv[], char *env[]);

A c-program has nothing to translate; the parameters are just
there because the shell is expecting a c-program. On operating
systems oriented towards other languages, such as MSDOS where
the systems programming language is BASIC, a c-program needs a
preamble to analyse data areas, and is in that respect no
better off than Forth.

You see that a program also passes an int back. A zero indicates
a successful completion; any other number identifies an error condition,
comparable with a throw code.
It is a pity that Forth has no provision in BYE to pass
information back. (The standard could stipulate that "BYE takes
a system defined number of parameters". I don't think this
would break any existing code.)

What hook do we need in a Forth system to get at this information?
Under a Unix system this is typically extremely simple. On a Forth
that relies on C for the connection with the operating system,
such as gForth, it is both simple and portable. On a Forth defined
in assembler it is still quite simple, but system dependent.

A c-function gets its arguments via the stack. The function
main is no exception to this. It is sufficient to remember the
stack pointer.

The following example is from ciforth for GNU-Linux on Intel 386:

        MOV      LONG[USINI+(CW*(31))],ESP ;Remember ARGS.

ARGS is defined as a user variable with an offset of 31 cells
in the user area.

This is the dictionary entry:

ARGS    "arguments"        --- addr
Return the addr of ARGS, a user variable that
contains a system dependent pointer to any arguments that are
passed from the operating system to ciforth during startup.
In this ciforth it points to an area with the argument count,
followed by a null ended array of argument strings,
then by a null ended array of environment strings.

This leads to the following code.
The comments use the Stallman convention, see lecture 3 (forthcoming).

    \  Return the NUMBER of arguments passed by Linux

    \ Return the argument VECTOR passed by Linux

    \ Return the environment POINTER passed by Linux

An indispensable word to deal with c-strings is also

    \ For a CSTRING (pointer to zero ended chars) return a STRING.

For example if forth is started with

    lina lHELLO_WORLD

The code

    HELLO_WORLD OK

would print the second argument, i.e. the first argument passed to forth.

Looking up an environment string

C data structures are territory alien to Forth.
Looking up an environment string is not totally trivial.
Let's first define what we want:

GET-ENV    "get environment string"     sc1 -- sc2

So a string constant SC1 is passed in, and another one is passed out.
A string constant is an address length pair where you are not supposed
to reach through to change at the character level.
See forth lecture 13 (forthcoming).
For the possibility that an environment string is not found, the
following convention is used: the address of sc2 is zero.
This is called a NULL-string. Of course an environment string can
have zero characters. Then sc2 has a length of zero, but a non-zero address.

This convention is c-ish, and born from the impossibility to pass more
than one parameter back. In Forth you could define the stack diagram
as (sc1 -- sc2 false/true), but I don't like that. If you prefer that you
can always do

    : GET-ENV   GET-ENV OVER ;

The copied address then serves as the flag: it is zero exactly when
the string was not found.

In programming the word GET-ENV I learned something. If you test a
word, and it fails, it may be too complicated. If a word contains
more than say 7 words or it contains a nested control structure,
you may conclude it is too complicated from the very fact that
it fails a test. What did Jeff Fox say about Chuck Moore?
"He doesn't spend time debugging." The reason is that he makes
the words so simple that they work the first time. I may never
become as good a programmer as Chuck, but I can try to do the same
trick. As can you.
(And maybe Chuck doesn't get regular expressions right the first
time as often as I do.)

At first I tried to put GET-ENV in one word, but I found it
hard to debug it. Then I decided to split it up.

Back to looking up strings in the environment, we see that one
of three possibilities can occur in comparing with a particular
environment string. That environment string can be a
NULL-string, meaning we have reached the end of environment.
Otherwise it can compare equal, or unequal. This is
sufficiently complicated to warrant a new word for
it. Note that in addition we need a flag telling whether we
must go on searching.

For some reason I cannot recall, I have named this word (MENV).
Its implementation is rather straightforward now.

    \ For SC and ENVSTRING leave SC / CONTENT and GOON flag.
    : (MENV)   DUP 0= IF   DROP 2DROP 0. 0   ELSE

            IF   RDROP RDROP 1   ELSE   2DROP R> R> 0   THEN
        THEN ;

(&= is a denotation, see forth lecture 1 on denotations (forthcoming);
read CHAR = or [CHAR] = for it in the meantime.)
If I hadn't got that one right the first time, I would have factored out
the second line. That is the tricky part.

After $S ("string split") (see forth lecture 12, forthcoming) we have
three strings: the one to look up, the environment name and the
environment content. The environment content is put on the return
stack. Then we compare, keeping the string to look up. Depending on the
outcome the content or the original string is dropped.

GET-ENV itself is now easy and needs no further comment.

    ( Find a STRING in the environment, -its VALUE or NULL string)
    : GET-ENV ENV BEGIN $+ SWAP >R (MENV) WHILE R> REPEAT RDROP ;

And at last an example:

    "HOME" GET-ENV TYPE
    /home/albert OK

(" starts a denotation; it leaves a string constant.
See lecture 1, forthcoming.)

---------------------------------------------------------------------------------

Coming up next: coroutines
Don't go away.

Coroutines form a separate lecture, but it is inserted here because it is
needed for the regular expressions we need in scripting.

I hope to receive as much valuable feedback as from the previous
lecture post.

Albert
--
Albert van der Horst,Oranjestr 8,3511 RA UTRECHT,THE NETHERLANDS
To suffer is the prerogative of the strong. The weak -- perish.



Fri, 02 Apr 2004 01:58:29 GMT  
 