Pattern-matching algorithm info wanted 
Author Message
 Pattern-matching algorithm info wanted

I am looking for information that will help me write a pattern-matching
algorithm.  The patterns that I am interested in are less powerful than
full-strength regular expressions.  They will be used to check input in
a data entry program, for example, to make sure that a zip code field
holds 5 digits or that a telephone number field has a dash in the right
place.  A pattern might also contain a "*" to match zero or more items.

Thanks,
Greg



Tue, 27 May 1997 23:57:31 GMT  
 Pattern-matching algorithm info wanted

Quote:
> I am looking for information that will help me write a pattern-matching
> algorithm.  The patterns that I am interested in are less powerful than

    One fairly common representation for this sort of pattern is a
    "mask" field, for example an account number might be...

        "99-9999-9999999-999"

    Matching the input data to this sort of mask should be fairly
    easy - each byte in the mask represents a type of character or a
    literal, so you compare each byte in the mask with the byte in the
    data field as follows...

        int     check(sp,mp) /*=====================================*/
        char    *sp,*mp;
        {
            while (*mp && *sp) {
                switch (*mp) {
                    case ('9'):
                        if (!isdigit(*sp)) return(0);
                        break;
                    case ('!'):
                        if (!isascii(*sp) || !isupper(*sp)) return(0);
                        break;
                    case ('X'):
                        if (!isascii(*sp)) return(0);
                        break;
                    default:
                        if (*sp != *mp) return(0);
                }
                mp++;
                sp++;
            }
            if (*sp || *mp) return(0);
            return(1);
        }

    Note that it may not be an error if the entire pattern is not
    matched. Sometimes the programmer will want to let the end of the
    mask be optional.

Quote:
> place.  A pattern might also contain a "*" to match zero or more items.

    This complicates things. Does * mean zero or more of anything, or
    zero or more of the next mask value? If it's of anything, then you
    can use recursion to apply the rest of the mask to the rest of the
    string.

    If its zero or more of the next mask value, then it's a simple
    matter of adding something like...

                    case ('*'):
                        mp++;
                        switch (*mp) {
                            case (0):           /* Invalid mask */
                                return(0);
                            case ('9'):
                                while (isdigit(*sp)) sp++;
                                break;
                            case ('!'):
                                while (isascii(*sp) && isupper(*sp)) sp++;
                                break;
                            ...

    Note that these kinds of masks can be much more useful if applied
    _when the user is entering the data_, rather than afterwards. A
    good user interface package will not only provide masks, it will
    apply them as the user types in data, and display the literals
    automatically, as in displaying...

            "  -    -       -   "

    for the account number mask above. It can be tricky to implement
    though, particularly in these days of edit fields that use
    proportional fonts.

Quote:
> Greg

--

***             Count Templar, ELITE, Cobra Mk III (FM-287)             ***


Wed, 28 May 1997 15:23:53 GMT  
 Pattern-matching algorithm info wanted


Quote:
> I am looking for information that will help me write a pattern-matching
> algorithm.  The patterns that I am interested in are less powerful than
> full-strength regular expressions.  They will be used to check input in
> a data entry program, for example, to make sure that a zip code field
> holds 5 digits or that a telephone number field has a dash in the right
> place.  A pattern might also contain a "*" to match zero or more items.

> Thanks,
> Greg


The use of "*" makes the problem more complicated because you need the
ability to backtrack.  The other cases are reasonably simple.  You can
write code for inspecting the fields as arrays of character or go to a
more general solution which uses a field descriptor string as a template
for the check.  These methods are often used in windows dialog boxes so
you may have the functions already in your program library.

Alternatively if you are using a unix environment there may be a function
called regex (from memory) which is just what you need since it handles
"*?" and more besides.  Another alternative is to use lex or flex which
are general purpose lexical analyser generators.  These are easy to use
and allow changes to be made very simply.

Finally if you are really keen and want to write your own you might look
up finite automata in:

"The Design and Analysis of Computer Algorithms". Aho, Hopcroft,
Ullman.  Addison Wesley.

"Introduction to Automata Theory, Languages and Computation". Hopcroft,
Ullman.  Addison Wesley.

"Compilers Principles, Techniques and Tools". Aho, Sethi, Ullman.  
Addison Wesley.

Regards

Bob Camp
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

+              Analogue and digital ic design.                         +
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++



Sat, 14 Jun 1997 23:58:59 GMT  
 Pattern-matching algorithm info wanted


Quote:
> I am looking for information that will help me write a pattern-matching
> algorithm.  The patterns that I am interested in are less powerful than
> full-strength regular expressions.  They will be used to check input in
> a data entry program, for example, to make sure that a zip code field
> holds 5 digits or that a telephone number field has a dash in the right
> place.  A pattern might also contain a "*" to match zero or more items.

> Thanks,
> Greg


Check out ftp.cdrom.com, /pub/algorithms/c/patmat.c.  It's part of the
"Algorithms" CD-ROM I'm compiling, and looks like it might do the trick
for you.

Regards,
  Neal Bridges



Mon, 16 Jun 1997 02:47:05 GMT  
 
 [ 4 post ] 

 Relevant Pages 

1. glob pattern matching algorithm wanted

2. Algorithm : pattern matching

3. pattern matching algorithm

4. Regular Expressions/Pattern Matching/Unordered pattern

5. Graph matching algorithms wanted

6. Bit-string matching algorithm wanted

7. WANTED: string wildcard matching algorithm

8. Pattern Matching in C -

9. Pattern Matching Tool

10. Pattern Matching in C

11. Help, Pattern used in Regex.Matches(...)

12. pattern matching and string replacement

 

 
Powered by phpBB® Forum Software