Regular expression string pattern matching: Embedding pop-11 procedures, and more 
Author Message
 Regular expression string pattern matching: Embedding pop-11 procedures, and more

Someone started a thread recently on this subject (which I unfortunately
failed to archive, which is why I'm not directly responding to it),
about enhancing poplog's regular expression string pattern matching.

Steve Knight in 1992 produced a library called lmatches (similar to
matches, but more efficient, and with additional functionality), one of
the main benefits of which was to allow matching to be constrained by a
procedural requirement, such that a pattern element could contain an
identifier and procedure, such that the pattern element would only match
if the procedure, when applied to the pattern element, would return a
true result:

E.g.,

 vars x, y;
  ( [1 2 3] lmatches [ %?x : isnumber, ??y% ] )=>
   x =>
  y =>

would yield:

  ** <true>
  ** 1
  ** [2 3]

This kind of thing would be useful in regexp pattern matching on

beginning and end of a Pop-11 procedural expression. It would also be
useful to do away with regexp's limitation of only having 1-9
subexpressions, and allow something more general (more akin to matches
or lmatches). The subexpressions could be identified by Pop-11
identifiers, or at least properties associated with the regular
expression procedure, rather than a numbered look-up table
(regexp_subexp).

(Steve also produced pmatches, here's a 1992 e-mail thread of ours on
the subject:
luc> There's a modification of matching that recently occurred to me. It
 luc> would involve doing something like matching a list against a
pattern,
 luc> except that the result would not be a boolean, (and there need not
be
 luc> side effects on identifiers, local or otherwise, though there
could be)
 luc> but the structure elements 'merged' into the pattern.

 Yes -- this seems like a nice variant on the pattern matching theme.
By a
 strange coincidence, -lmatches- can be easily modified to have pretty
much
 the effect you want.

 What I am thinking of is adding the idea of a transforming procedure to
a
 pattern variable.  In other words a pattern variable now looks like
this
     ? x : restriction # converter
 with the obvious optional bits. i.e.
     ? x
     ? x : r
     ? x # c
     ? x : r # c
 and the same for ?? variables.  The point of the converter procedure
would be
 to arrange for the pattern variables to be converted to different
results if
 the pattern match was successful.  Hmmmm not expressing myself well on
this.

 Here's an example of what I'm thinking about.

     instantiate( [1 2 3 4], [% ?? x # length, ?? y # length %] ) =>
     ** [0 4]

 Anyway, I think I'll experiment & let you know what I come up with.

)

I have a copy of the library, if anyone wants a copy (something for the
bham archive?)

I'll try to find some time to give this some thought and let the group
know if I come up with something interesting.

--
Luc Beaudoin
Abatis Systems Corporation
4190 Still Creek Drive, Suite 200
Burnaby BC V5C 6C6
E-mail address obfuscated for anti-span purpose:
e-mail username: lbeaudoin



Fri, 18 Jan 2002 03:00:00 GMT  
 Regular expression string pattern matching: Embedding pop-11 procedures, and more
Someone started a thread recently on this subject (which I unfortunately
failed to archive, which is why I'm not directly responding to it),
about enhancing poplog's regular expression string pattern matching.

Steve Knight in 1992 produced a Pop-11 library called lmatches (similar
to matches, but more efficient, and with additional functionality), one
of the main benefits of which was to allow matching to be constrained by
a procedural requirement, such that a pattern element could contain an
identifier and procedure, such that the pattern element would only match
if the procedure, when applied to the pattern element, would return a
true result:

E.g.,

 vars x, y;
  ( [1 2 3] lmatches [ %?x : isnumber, ??y% ] )=>
   x =>
  y =>

would yield:

  ** <true>
  ** 1
  ** [2 3]

This kind of thing would be useful in regexp pattern matching on

beginning and end of a Pop-11 procedural expression. It would also be
useful to do away with regexp's limitation of only having 1-9
subexpressions, and allow something more general (more akin to matches
or lmatches). The subexpressions could be identified by Pop-11
identifiers, or at least properties associated with the regular
expression procedure, rather than a numbered look-up table
(regexp_subexp).

sfk>(Steve also produced pmatches, here's a 1992 e-mail thread of ours
on the subject:
sfk>luc> There's a modification of matching that recently occurred to
me. It
sfk> luc> would involve doing something like matching a list against a
pattern,
sfk> luc> except that the result would not be a boolean, (and there need
not be
sfk> luc> side effects on identifiers, local or otherwise, though there
could be)
sfk> luc> but the structure elements 'merged' into the pattern.
sfk>
sfk> Yes -- this seems like a nice variant on the pattern matching
theme.  By a
sfk> strange coincidence, -lmatches- can be easily modified to have
pretty much
sfk> the effect you want.
sfk>
sfk> What I am thinking of is adding the idea of a transforming
procedure to a
sfk> pattern variable.  In other words a pattern variable now looks like
this
sfk>     ? x : restriction # converter
sfk> with the obvious optional bits. i.e.
sfk>     ? x
sfk>     ? x : r
sfk>     ? x # c
sfk>     ? x : r # c
sfk> and the same for ?? variables.  The point of the converter
procedure would be
sfk> to arrange for the pattern variables to be converted to different
results if
sfk> the pattern match was successful.  Hmmmm not expressing myself well
on this.
sfk>
sfk> Here's an example of what I'm thinking about.
sfk>
sfk>     instantiate( [1 2 3 4], [% ?? x # length, ?? y # length %] ) =>
sfk>     ** [0 4]
sfk>
sfk> Anyway, I think I'll experiment & let you know what I come up with.
sfk>

)

I have a copy of the library, e-mail me if you want it (something for
the bham archive?)

I'll try to find some time to give this some thought and let the group
know if I come up with something interesting.

--
Luc Beaudoin
Abatis Systems Corporation          http://www.abatis-sys.com
4190 Still Creek Drive, Suite 200
Burnaby BC V5C 6C6
E-mail address obfuscated for anti-span purpose:
e-mail username: lbeaudoin



Fri, 18 Jan 2002 03:00:00 GMT  
 Regular expression string pattern matching: Embedding pop-11 procedures, and more

Quote:

> Date: Mon, 02 Aug 1999 17:36:59 GMT
> Organization: Abatis-sys

> Someone started a thread recently on this subject (which I unfortunately
> failed to archive, which is why I'm not directly responding to it),

I don't remember this. Perhaps it was in another another news group?
(Actually not all news items get here: eg. over the weekend our news
feed was down because of a power failure on Friday night, and I know
some things did not reach us. I don't know if they will come later.)

Quote:
> about enhancing poplog's regular expression string pattern matching.

Are you talking about string matching or list matching?

A few years ago Pop-11 acquired for the first time a full regular
expression *string* matcher written by Jonathan Meyer. I can't
recall whether this was before or after you left us.

See
    TEACH REGEXP
    HELP  REGEXP
    REF   REGEXP

If you are talking about an extension to that, perhaps it would be
worth specifying what you think its limitations are.

Quote:
> Steve Knight in 1992 produced a library called lmatches (similar to
> matches, but more efficient, and with additional functionality), one of
> the main benefits of which was to allow matching to be constrained by a
> procedural requirement, such that a pattern element could contain an
> identifier and procedure, such that the pattern element would only match
> if the procedure, when applied to the pattern element, would return a
> true result:

This has always been a feature of the pop-11 list pattern matcher
since the mid 70s. See HELP MATCHES, which explains the use of
restriction procedures. However the restriction procedure is applied
not to the pattern element but to whatever in the data item matches
the pattern element (which could be a sub-list, in the case of
segment variables).

Quote:

> E.g.,

>  vars x, y;
>   ( [1 2 3] lmatches [ %?x : isnumber, ??y% ] )=>
>    x =>
>   y =>

It looks here as if you are talking about list matching, not string
matching.

The above would not compile unless you defined "?" and "??" as macros.

The standard pop11 matcher syntax for this is as follows:

    vars x, y;
    [1 2 3] matches [?x:isnumber ??y] =>
    ** <true>
    x, y =>
    ** 1 [2 3]

If you use my pattern prefix "!" then you can use lvars instead of
vars.

However, the pop11 list pattern matcher is not based on the full regexp
idea.

The idea of a "converter" procedure in a pattern is partially
catered for in the Pop-11 list pattern matcher. That's because of the
following fact documented in HELP MATCHES but not often noticed:
if the result of applying a restriction procedure to the value of a
pattern variable is neither false nor true, then that value is
assigned to the variable. (This was Steve Hardy's idea originally, I
think.)

This is used heavily in LIB GRAMMAR, which uses the pattern matcher
to define a parser compiler.

The basic idea is illustrated in two teach files:
    TEACH ISASENT
shows how to use the matcher to produce a sentence recogniser,
and
    TEACH PARSESENT
shows how to extend it to produce a parse tree, first of all by
separately recognising and parsing then by collapsing the two.

It includes this paragraph:

| The solution is to use yet another little known feature of MATCHES
| (sorry to keep introducing new features...). This feature will allow
| us to do away with the ISA procedures altogether and simply have the
| PARSE procedures. Moreoever, the PARSE procedures will be much
| shorter than those already described. This has to be good - less
| than half as much program to write and it works quicker too!

However, since I don't know what applications you have in mind, I
may have missed the point.

Incidentally, in Poplog version 15.52 John Gibson enhanced the
operator "=" to become a generalised structure matcher. See
    HELP EQUAL

which talks about the extension to "=", and the even more general
"equals" which can do back-tracking into non-deterministic matches
involving more than one segment variable, and therefore handles
things like this:

    ;;; matches fails
    vars x, y;
    [[1 2][2 1]]  matches  [[??x ??y][??y ??x]] =>
    ** <false>

    ;;; but equals, with the new pattern element syntax, succeeds
    lvars x, y;
    [[1 2][2 1]]  equals  [[=??x =??y][=??y =??x]] =>
    ** <true>
    x, y =>
    ** [1] [2]

Doing that sort of thing is impossible without either backtracking,
which means constructing a continuation after completing EVERY
potentially ambiguous segment match, or working in breadth-first
mode and constructing ALL POSSIBLE matches for every sub-list, then
returning lists of compatiable combinations.

This efficiency cost is why that capability is not built into "=".

But you now get it with "equals", if you want it.

The new pattern syntax also handles lvars, and allows vectors to be
used in patterns (though they can't contain segment variables), e.g.

    lvars x;
    [the cat {sat on the mat} this morning]
            equals [=** {sat on the =?x} =**], x  =>

    ** <true> mat

(That example would work with "=" instead of "equals").

In the new syntax

    old pattern element     new pattern element
            ?                   =?
            ??                  =??
            =                   =*
            ==                  =**

There are two restriction syntaxes, the old and a new one.

Later I may change my "!" pattern prefix so that it compiles the old
pattern syntax into the new form, which would then mean that the
examples in all my teach files will work if "matches" is redefined as a
synonym for equals, reducing the need to change dozens of teach files.

Aaron
===
--
Aaron Sloman, ( http://www.cs.bham.ac.uk/~axs/ )
School of Computer Science, The University of Birmingham, B15 2TT, UK
EMAIL A.Sloman AT cs.bham.ac.uk   (NB: Anti Spam address)
PAPERS: ftp://ftp.cs.bham.ac.uk/pub/groups/cog_affect/0-INDEX.html



Fri, 18 Jan 2002 03:00:00 GMT  
 
 [ 4 post ] 

 Relevant Pages 

1. Regular Expression for Match Pattern (string) Function

2. iss-matching - the free Regular Expression / Pattern Matching cluster

3. Bug in regular expression pattern matching?

4. Pattern-matching regular-expression algorithm?

5. Regular Expression Pattern Matching "State" Object

6. Regular expressions, pattern matching

7. pattern matching using regular expressions

8. Need code to match strings with regular expressions

9. problem with string match and regular expression

10. Pop-11 Pattern variables: the ANSWER??

11. Pop-11 Pattern variables: the ANSWER??

12. Reading POP-11 expressions without tears.

 

 
Powered by phpBB® Forum Software