Pattern-matching algorithm info wanted
Author Message
Pattern-matching algorithm info wanted

I am looking for information that will help me write a pattern-matching
algorithm.  The patterns that I am interested in are less powerful than
full-strength regular expressions.  They will be used to check input in
a data entry program, for example, to make sure that a zip code field
holds 5 digits or that a telephone number field has a dash in the right
place.  A pattern might also contain a "*" to match zero or more items.

Thanks,
Greg

Tue, 27 May 1997 23:57:31 GMT
Pattern-matching algorithm info wanted

Quote:
> I am looking for information that will help me write a pattern-matching
> algorithm.  The patterns that I am interested in are less powerful than

One fairly common representation for this sort of pattern is a
"mask" field, for example an account number might be...

"99-9999-9999999-999"

Matching the input data to this sort of mask should be fairly
easy - each byte in the mask represents a type of character or a
literal, so you compare each byte in the mask with the byte in the
data field as follows...

int     check(sp,mp) /*=====================================*/
char    *sp,*mp;
{
while (*mp && *sp) {
switch (*mp) {
case ('9'):
if (!isdigit(*sp)) return(0);
break;
case ('!'):
if (!isascii(*sp) || !isupper(*sp)) return(0);
break;
case ('X'):
if (!isascii(*sp)) return(0);
break;
default:
if (*sp != *mp) return(0);
}
mp++;
sp++;
}
if (*sp || *mp) return(0);
return(1);
}

Note that it may not be an error if the entire pattern is not
matched. Sometimes the programmer will want to let the end of the

Quote:
> place.  A pattern might also contain a "*" to match zero or more items.

This complicates things. Does * mean zero or more of anything, or
zero or more of the next mask value? If it's of anything, then you
can use recursion to apply the rest of the mask to the rest of the
string.

If its zero or more of the next mask value, then it's a simple

case ('*'):
mp++;
switch (*mp) {
case (0):           /* Invalid mask */
return(0);
case ('9'):
while (isdigit(*sp)) sp++;
break;
case ('!'):
while (isascii(*sp) && isupper(*sp)) sp++;
break;
...

Note that these kinds of masks can be much more useful if applied
_when the user is entering the data_, rather than afterwards. A
good user interface package will not only provide masks, it will
apply them as the user types in data, and display the literals
automatically, as in displaying...

"  -    -       -   "

for the account number mask above. It can be tricky to implement
though, particularly in these days of edit fields that use
proportional fonts.

Quote:
> Greg

--

***             Count Templar, ELITE, Cobra Mk III (FM-287)             ***

Wed, 28 May 1997 15:23:53 GMT
Pattern-matching algorithm info wanted

Quote:
> I am looking for information that will help me write a pattern-matching
> algorithm.  The patterns that I am interested in are less powerful than
> full-strength regular expressions.  They will be used to check input in
> a data entry program, for example, to make sure that a zip code field
> holds 5 digits or that a telephone number field has a dash in the right
> place.  A pattern might also contain a "*" to match zero or more items.

> Thanks,
> Greg

The use of "*" makes the problem more complicated because you need the
ability to backtrack.  The other cases are reasonably simple.  You can
write code for inspecting the fields as arrays of character or go to a
more general solution which uses a field descriptor string as a template
for the check.  These methods are often used in windows dialog boxes so

Alternatively if you are using a unix environment there may be a function
called regex (from memory) which is just what you need since it handles
"*?" and more besides.  Another alternative is to use lex or flex which
are general purpose lexical analyser generators.  These are easy to use
and allow changes to be made very simply.

Finally if you are really keen and want to write your own you might look
up finite automata in:

"The Design and Analysis of Computer Algorithms". Aho, Hopcroft,

"Introduction to Automata Theory, Languages and Computation". Hopcroft,

"Compilers Principles, Techniques and Tools". Aho, Sethi, Ullman.

Regards

Bob Camp
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

+              Analogue and digital ic design.                         +
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Sat, 14 Jun 1997 23:58:59 GMT
Pattern-matching algorithm info wanted

Quote:
> I am looking for information that will help me write a pattern-matching
> algorithm.  The patterns that I am interested in are less powerful than
> full-strength regular expressions.  They will be used to check input in
> a data entry program, for example, to make sure that a zip code field
> holds 5 digits or that a telephone number field has a dash in the right
> place.  A pattern might also contain a "*" to match zero or more items.

> Thanks,
> Greg

Check out ftp.cdrom.com, /pub/algorithms/c/patmat.c.  It's part of the
"Algorithms" CD-ROM I'm compiling, and looks like it might do the trick
for you.

Regards,
Neal Bridges

Mon, 16 Jun 1997 02:47:05 GMT

 Page 1 of 1 [ 4 post ]

Relevant Pages