Yacc/Lex Description of Eiffel v3 
Author Message
 Yacc/Lex Description of Eiffel v3

Does anyone know where I can get a Yacc/Lex description
of the current v3 Eiffel language definition? I found
a v2 definition, but can't seem to locate a newer version.

        Bill
--
-------------------------------------------------------------------

 XMX Corporation                     |
 46 Manning Road                     |Telephone: (508)670-8444



Sun, 01 Mar 1998 03:00:00 GMT  
 Yacc/Lex Description of Eiffel v3

Quote:

> I tried to write one of these and came to the conclusion that
> Eiffel 3 is not LALR(1). The problem surfaces around assertions
> where:

>   assertion : ID ':' expression | expression

> gets confused with expressions in general. This would probably
> not happen with LL parsers which are top-down and would not get
> into this situation in the first place.

My YACC code (sorry, not available) didn't have any problem about
that one, but found two cases where END could either be the end of
the class or of a construct (one was DEBUG, was the other SELECT?),
depending on the next token or lack of same.

On much the same subject, has anyone had a problem parsing code
containing free operators? It seems that you can't tell what is a
free operator just by parsing it; you have to consider whether it
is declared within the scope of the expression (or even worse,
inherited!)  Consider "a|-f>b" (from ETL), which parses differently
depending on whether the free operator is "|-f" or as intended
"|-f>". The problem is that you can't treat special characters as
automatic delimiters anymore. (FWIW, Eiffel/S 1.0 failed to recognize
"p|-|q", also from ETL, as three tokens even when the free operator
is declared immediately prior to the routine containing the
expression.)



Sun, 08 Mar 1998 03:00:00 GMT  
 Yacc/Lex Description of Eiffel v3

Quote:


>: Does anyone know where I can get a Yacc/Lex description
>: of the current v3 Eiffel language definition? I found
>: a v2 definition, but can't seem to locate a newer version.

>I tried to write one of these and came to the conclusion that
>Eiffel 3 is not LALR(1). The problem surfaces around assertions
>where:

>  assertion : ID ':' expression | expression

>gets confused with expressions in general. This would probably
>not happen with LL parsers which are top-down and would not get
>into this situation in the first place.

When writing Eon/Eiffel, I used{*filter*}tail. This can cope with non LALR
situations by repairing it. It complains a lot but seems to come up with
the right answers.

There are other similar situations in Eiffel. One (if I remember correctly)
is the result of 'end' in the inheritance clause. All this is a consequence
of trying to do away with semi-colons.

--
Ian


EON Software                       Phone and Fax: +44 (0)1865 741452
                                   19 Stapleton Road
                                   Headington, Oxford, OX3 7LX, UK

     Authors of Eon/Eiffel, a shareware & commercial Eiffel compiler



Mon, 09 Mar 1998 03:00:00 GMT  
 Yacc/Lex Description of Eiffel v3
Instead of ordinary yacc, you could also use a backtracking variant, which
accepts a superset of standard yacc grammar specifications. (Since there's
usually only a small amount of backtracking necessary, until the ambiguity
in the input is resolved, performance shouldn't be much different from the
normal yacc if the grammar only has few simple problems). I don't remember
where I got it from, but you may be able to find out by mailing the author
who's email address is given in the README file which I'm appending below.

------------------------------------------------------------------------------

BTYACC -- backtracking yacc.

BTYACC was created by Chris Dodd using ideas from many places and lots
of code from the Berkeley Yacc distribution, which is a public domain
yacc clone put together by the good folks at Berkeley.  This code is
distributed with NO WARRANTEE and is public domain.  It is certain to
contain bugs, which you should report to:

See the README.BYACC, NEW_FEATURES, and ACKNOWLEDGEMENTS files for more
about Berkeley Yacc and other sources of info.

BTYACC is a modified version of yacc that supports automatic
backtracking and semantic disambiguation to parse ambiguous grammars, as
well as syntactic sugar for inherited attributes (which tend to
introduce conflicts).  Whenever a btyacc generated parser runs into a
shift-reduce or reduce-reduce error in the parse table, it remembers
the current parse point (yacc stack and input stream state), and goes
into trial parse mode.  It then continues parsing, ignoring most rule
actions.  If it runs into an error (either through the parse table or
through an action calling YYERROR), it backtracks to the most recent
conflict point and tries a different alternative.  If it finds a
successful parse (reaches the end of the input or an action calls
YYVALID), it backtracks to the point where it first entered trial parse
mode, and continues with a full parse (executing all actions),
following the path of the successful trial.

Actions in btyacc come in two flavors -- {}-actions, which are only
executed when not in trial mode, and []-actions which are executed
regardless of mode.  There are also inherited attributes, which look
like arguments (they're enclosed in `()') and act like []-actions.

What this buys you:

* No more lexer feedback hack.  In yacc grammars for C, a standard hack,
know as the `lexer feedback hack' is used to find typedef names.  The
lexer uses semantic information to decide if any given identifier is
a typedef-name or not and returns a special token.  With btyacc, you
no longer need to do this; the lexer should just always return an
identifier.  The btyacc grammar then needs a rule of the form:

typename: ID [ if (!IsTypeName(LookupId($1))) YYERROR; ]

While the hack works adequately well for parsing C, it becomes a
nightmare when you try to parse something like C++, where treating
an ID as a typedef becomes heavily dependent on context.

* Easy disambiguation via simple ordering.  Btyacc runs its trials via
the rule ``try shifting first, then try reducing by the order that the
conflicting rules appear in the input file''.  This means you can deal
with semantic a disambiguation rule like:
    [1] If it looks like a declaration it is, otherwise
    [2] If it looks like an expression it is, otherwise
    [3] it is a syntax error
            [Ellis & Stroustrup, Annotated C++ Reference Manual, p93]

To deal with this, you need only put all the rules for declarations
before the rules for expressions in the grammar file.

* No extra cost if don't use it.  Backtracking is only triggered when
the parse hits a shift/reduce or reduce/reduce conflict in the table.  If
you have no conflicts in your grammar, there's no extra cost, other than
some extra code which will never be invoked.

* C++ and ANSI C compatible parsers.  The parsers produced by btyacc
can be compiled with C++ correctly.  If you `#define' YYSTYPE to be
some C++ type with constructor and destructor, everything will work fine.
My favorite is `#define YYSTYPE SmartPointer', where SmartPointer is a
smart pointer type that does garbage collection on the pointed to
objects.

BTYACC was originally written to make it easy to write a C++ parser
(my goal was to be able to use the grammar out of the back of the
ARM with as few modifications as possible).  Anyone who has ever looked
at Jim Roskind's public domain C++ yacc grammar, or the yacc-based
grammar used in g++ knows how difficult this is.  BTYACC is very
useful for parsing any ambiguous grammar, particularly ones that
come from trying to merge two (or more) complete grammars.

Inherited attributes in btyacc:

Inherited attributes look a lot like function arguments to non-terminals,
which is what they end up being in a recursive descent parser, but NOT
how they're implemented in btyacc.  Basically they're just syntactic
sugar for embedded semantic actions and $0, $-1, ... in normal yacc.
btyacc gives you two big advantages besides just the syntax:
    1. it does type checking on the inherited attributes, so you don't
       have to specify $<type>0 and makes sure you give the correct`
       number of arguments (inherited attributes) to every use of a
       non-terminal.
    2. It `collapses' identical actions from that are produced from
       inherited attributes.  This eliminates many potential reduce-reduce
       conflicts arrising from the inherited attributes.

You use inherited attributes by declaring the types of the attributes in
the preamble with a type declaration and declaring names of the attributes
on the lhs of the yacc rule.  You can of course have more than one
rule with the same rhs, and you can even give them different names in each,
but the type and number must be the same.

Here's a small example:
%type <t1> lhs(<t1>, <t2>)    /* lhs takes two inherited attributes */
           stuff(<t1>, <t2>)
%%
lhs($i1, $i2) : { $$ = $i1 }
              | lhs($i1, $i2) stuff($1,$i2) { $$ = $2; }

This is roughly equivalent to the following yacc code:
lhs : { $$ = $<t1>-1; }
    | lhs [ $<t1>$ = $1; ] [ $<t2>$ = $<t2>0; ] stuff { $$ = $4; }
    ;

See the file `test/t2.y' for a longer and more complete example.
At the current time, the start symbol cannot have any arguments.


--


------------------------------------------------------------------------------
   *   wonder everyday   *   nothing in particular   *   all is special   *



Tue, 10 Mar 1998 03:00:00 GMT  
 Yacc/Lex Description of Eiffel v3

Quote:

> On much the same subject, has anyone had a problem parsing code
> containing free operators? It seems that you can't tell what is a
> free operator just by parsing it; you have to consider whether it
> is declared within the scope of the expression (or even worse,
> inherited!)  Consider "a|-f>b" (from ETL), which parses differently
> depending on whether the free operator is "|-f" or as intended
> "|-f>".

You have to use 'breaks' (e.g. blanks) to achieve the desired effect:

            a |-f> b

This follows from the definition of 'words' on page 414, ETL2 and
the syntax rule which says:

  It is permitted to write two adjacent tokens without an inter-
  vening break if and only if one is a word and the other is a
  symbol.

(from page 414, ETL2). An ordinary (non-free) operator is a symbol
but free operators are words. Since identifiers are also words, it
follows that you need a break between an identifier and a free
operator (likewise you need a break between identifiers and keywords
as in 'if x then'), etc.

Quote:
> (FWIW, Eiffel/S 1.0 failed to recognize "p|-|q", also from ETL,
> as three tokens even when the free operator is declared imme-
> diately prior to the routine containing the expression.)

Again, this is not the compiler's fault but the fault of the
person who forgot the intervening breaks. The above mentioned
syntax rule is there precisely to avoid ambiguities.

-- MS

SwisSoft, Michael Schweitzer      Geismar Landstr. 16  D-37083 Goettingen



Tue, 10 Mar 1998 03:00:00 GMT  
 Yacc/Lex Description of Eiffel v3


Quote:
(Michael Schweitzer) writes:

>> On much the same subject, has anyone had a problem parsing code
>> containing free operators? It seems that you can't tell what is a
>> free operator just by parsing it; you have to consider whether it
>> is declared within the scope of the expression (or even worse,
>> inherited!)  Consider "a|-f>b" (from ETL), which parses differently
>> depending on whether the free operator is "|-f" or as intended
>> "|-f>".

>You have to use 'breaks' (e.g. blanks) to achieve the desired effect:

>            a |-f> b

>This follows from the definition of 'words' on page 414, ETL2 and
>the syntax rule which says:

>  It is permitted to write two adjacent tokens without an inter-
>  vening break if and only if one is a word and the other is a
>  symbol.

>(from page 414, ETL2). An ordinary (non-free) operator is a symbol
>but free operators are words. Since identifiers are also words, it
>follows that you need a break between an identifier and a free
>operator (likewise you need a break between identifiers and keywords
>as in 'if x then'), etc.

I agree.  And I was essentially going to respond exactly the same way.
However, I think there is still a problem with the sentence which you
quoted. Since it is permitted to write two adjacent tokens without an
intervening break if (and only if) one is a word and the other is a
symbol, *and* a free operator is a "word", how can you correctly parse
this:
           a |-f>-b

in which the free operator is "|-f>" and the right-hand operand is "-b".
It isn't possible unless you follow the free operator with a break. I
think that the language specification should *clearly* state that
free operators *must* be surrounded by breaks (or at least followed by
a break), since all "symbols" are allowed as characters in a free operator
and without the break you can't possible correcly parse expressions
using free operators (ie: you can't tell where it ends).
---

Hainburg Germany (near Frankfurt, on the banks of the River Main)
"We have to believe in free will.  We've got no choice" -- Isaac B. Singer



Thu, 12 Mar 1998 03:00:00 GMT  
 Yacc/Lex Description of Eiffel v3

Quote:


> (Michael Schweitzer) writes:

> >> On much the same subject, has anyone had a problem parsing code
> >> containing free operators? It seems that you can't tell what is a
> >> free operator just by parsing it; you have to consider whether it
> >> is declared within the scope of the expression (or even worse,
> >> inherited!)  Consider "a|-f>b" (from ETL), which parses differently
> >> depending on whether the free operator is "|-f" or as intended
> >> "|-f>".

> >You have to use 'breaks' (e.g. blanks) to achieve the desired effect:

> >            a |-f> b

> >This follows from the definition of 'words' on page 414, ETL2 and
> >the syntax rule which says:

> >  It is permitted to write two adjacent tokens without an inter-
> >  vening break if and only if one is a word and the other is a
> >  symbol.

> >(from page 414, ETL2). An ordinary (non-free) operator is a symbol
> >but free operators are words. Since identifiers are also words, it
> >follows that you need a break between an identifier and a free
> >operator (likewise you need a break between identifiers and keywords
> >as in 'if x then'), etc.

> I agree.  And I was essentially going to respond exactly the same way.
> However, I think there is still a problem with the sentence which you
> quoted. Since it is permitted to write two adjacent tokens without an
> intervening break if (and only if) one is a word and the other is a
> symbol, *and* a free operator is a "word", how can you correctly parse
> this:
>       a |-f>-b

> in which the free operator is "|-f>" and the right-hand operand is "-b".
> It isn't possible unless you follow the free operator with a break. I
> think that the language specification should *clearly* state that
> free operators *must* be surrounded by breaks (or at least followed by
> a break), since all "symbols" are allowed as characters in a free operator
> and without the break you can't possible correcly parse expressions
> using free operators (ie: you can't tell where it ends).
> ---

> Hainburg Germany (near Frankfurt, on the banks of the River Main)
> "We have to believe in free will.  We've got no choice" -- Isaac B. Singer

Bravo to both postings. I had missed the lexical dictum quoted by
Michael Schweitzer - or rather its applicability to the problem.
As David Wasser points out, it still only solves half of the problem.
"|-f>-b" can be parsed legally in the following ways: "| -f>-b",
"|-f >-b", "|-f> -b" (Wasser's), "|-f>-b". Also note:

   1. If Wasser's parsing "|-f> -b" is syntactically valid, so is
      "| -f>-b".  Which one will compile depends on the type of the
      missing left-hand operand (and of course on which free operator
      is defined in the scope of the expression). In order to determine
      these things, the entire class and all its parents need to be
      compiled first.
   2. two out of three examples on page 68 of ETL are invalid.

      your hand. Me too.

Free-operators need to be delimited by breaks on both sides. Any

called out.



Fri, 13 Mar 1998 03:00:00 GMT  
 
 [ 7 post ] 

 Relevant Pages 

1. Looking for VHDL language description files for LEX and YACC

2. lex yacc fortran description

3. lex and yacc with Eiffel

4. lex, and yacc spec for eiffel ?

5. LEX-YACC interfacing with Eiffel

6. Request for ISE post of Eiffel lex/parser for Eiffel

7. Q: Lex&YACC smalltalk version ?

8. awk lex yacc (beginner question)

9. LEX and YACC in Smalltalk

10. lex/yacc

11. Lex & Yacc for Smalltalk

12. Harbour Project | Lex & Yacc Guide

 

 
Powered by phpBB® Forum Software