Help with REGEX and AWK 

Can anyone help me with this regular expression?  I can't figure out what
I'm doing wrong.  I've been consulting the pattern matching and AWK sections
of Unix in a Nutshell, but I still can't work out how to write the expression.

I have a record in a file that contains 8 fields (blank delimited).  I want to
verify that the 2nd field is either a single asterisk or a run of
alphanumeric characters no longer than 255 characters.

I tried this much, but I'm having problems with the second part.

echo "*" | awk '{ if ( $1 ~ /^[*]$/ ) print "asterisk"}'

then trying to expand on that

echo "abcxyz" | awk '{ if ( $1 ~ /^([*])$|^(.*)${1,7}/ ) print $1}'

Any help appreciated.
Thanks



Sat, 02 Aug 2003 09:20:06 GMT  

Quote:

> I tried this much, but I'm having problems with the second part.

> echo "*" | awk '{ if ( $1 ~ /^[*]$/ ) print "asterisk"}'

> then trying to expand on that

> echo "abcxyz" | awk '{ if ( $1 ~ /^([*])$|^(.*)${1,7}/ ) print $1}'

> Any help appreciated.

$ cat foo
*
abc
abcxyz
abcxyz123
$ awk '$1 ~ /^\*$/' foo
*
$ awk '$1 ~ /^[[:alnum:]]{1,7}$/' foo
abc
abcxyz
$ awk '$1 ~ /^(\*|[[:alnum:]]{1,255})$/' foo
*
abc
abcxyz
abcxyz123
$ awk '$1 ~ /^([*]|[A-Za-z0-9_]{1,9})$/' foo
*
abc
abcxyz
abcxyz123
$ mksinfo | head -1 | sed 's/Serial Number.*//'
MKS Toolkit for Windows NT and Windows 95/98 Release 6.2a
$

--
Jim Monty

Tempe, Arizona USA



Sat, 02 Aug 2003 10:44:34 GMT  

Quote:

> I tried this much, but I'm having problems with the second part.

> echo "*" | awk '{ if ( $1 ~ /^[*]$/ ) print "asterisk"}'

> then trying to expand on that

> echo "abcxyz" | awk '{ if ( $1 ~ /^([*])$|^(.*)${1,7}/ ) print $1}'

The gawk manual offers some help.  Read the last paragraph below carefully:
    `wh{3}y'
          matches `whhhy' but not `why' or `whhhhy'.

    `wh{3,5}y'
          matches `whhhy' or `whhhhy' or `whhhhhy', only.

    `wh{2,}y'
          matches `whhy' or `whhhy', and so on.

     Interval expressions were not traditionally available in `awk'.
     As part of the POSIX standard they were added, to make `awk' and
     `egrep' consistent with each other.

     However, since old programs may use `{' and `}' in regexp
     constants, by default `gawk' does _not_ match interval expressions
     in regexps.  If either `--posix' or `--re-interval' are specified
     (*note Command Line Options: Options.), then interval expressions
     are allowed in regexps.

You need the `--re-interval' command line flag:

$ echo "whhhy" | awk --re-interval '/wh{2,}y/'
whhhy
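
Since interval support differs between awk implementations (as far as I know, newer gawk releases enable it by default, while some other awks still treat the braces literally), it can help to probe the local awk before relying on `{n,m}'. The following is only a sketch: the function names supports_intervals and alnum_upto are made up for illustration, and the length() test is kept as a fallback so the result is the same either way.

```shell
#!/bin/sh
# Succeeds (exit status 0) if this awk treats {2,} as an interval expression.
# An awk without interval support takes the braces literally, so no match.
supports_intervals() {
    echo "whhhy" | awk '/^wh{2,}y$/ { ok = 1 } END { exit !ok }' 2>/dev/null
}

# Prints "ok" if $1 consists of 1 to $2 alphanumeric characters, else "bad".
alnum_upto() {
    if supports_intervals; then
        re="^[a-zA-Z0-9]{1,$2}\$"     # let the regexp enforce the length
    else
        re="^[a-zA-Z0-9]+\$"          # length is enforced by length() below
    fi
    printf '%s\n' "$1" | awk -v re="$re" -v max="$2" '
        {
            verdict = ($0 ~ re && length($0) <= max) ? "ok" : "bad"
            print verdict
        }'
}

alnum_upto abcxyz 7        # prints ok
alnum_upto abcxyz123 7     # prints bad (9 characters)
```

The length() comparison is redundant when intervals are available, but it costs nothing and keeps both branches equivalent.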



Sat, 02 Aug 2003 12:10:44 GMT  

Quote:

> Can anyone help me with this regular expression, I cannot seem to figure out
> what I'm doing wrong.  I've been referencing the Unix in a Nutshell book on
> pattern matching and AWK and can't understand how to write the expression.

> I have a record in a file that contains 8 fields (blank delimited).  I want to

example record:  "ok11    f12 f13 f14 f14 f15 f16 f17 f18"    

Quote:
> verify that the 2nd field matches either a single asterisk or any number of
> alphanumeric characters but not to exceed 255 characters.

i.e.:
     $2~/^[*]$/             # the single asterisk
 or  $2~/^[a-zA-Z0-9]+$/  &&  length($2)<=255  
     # all alnum characters,    no more than 255 characters

summed up:

awk '
{
    # field 2 must be a lone asterisk, or 1 to 255 alphanumeric characters
    if ( $2 ~ /^[*]$|^[a-zA-Z0-9]+$/ && length($2) <= 255 ) {
         print $1, $2
    }
}' <<EOF
    ok11    f12 f13 f14 f14 f15 f16 f17 f18
    ok21    *   f23 f24 f24 f25 f26 f27 f28
not_ok31    ^   f33 f34 f34 f35 f36 f37 f38
EOF
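
If the goal is to verify a file rather than filter it, the same test can label every record instead of printing only the good ones. A minimal sketch, assuming the labels "ok" and "bad" and the function name check_field2 (both invented here, not part of the original question):

```shell
#!/bin/sh
# Label each record by whether its second field passes the check.
# Reads the files named as arguments, or standard input if there are none.
check_field2() {
    awk '
    {
        # a lone asterisk, or 1 to 255 alphanumeric characters
        valid = ($2 == "*") || ($2 ~ /^[a-zA-Z0-9]+$/ && length($2) <= 255)
        label = valid ? "ok" : "bad"
        print label, NR, $2
    }' "$@"
}

printf 'ok11 f12 f13\nok21 * f23\nnot_ok31 ^ f33\n' | check_field2
# prints:
#   ok 1 f12
#   ok 2 *
#   bad 3 ^
```

This version avoids interval expressions entirely, so it should not depend on which awk is installed.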

Quote:

> I tried this much, but I'm having problems with the second part.

> echo "*" | awk '{ if ( $1 ~ /^[*]$/ ) print "asterisk"}'

> then trying to expand on that

> echo "abcxyz" | awk '{ if ( $1 ~ /^([*])$|^(.*)${1,7}/ ) print $1}'

                                            ^^^^^^^^^^^    

^(.*)${1,7}         # not a valid regexp

^      "anchor" to start of string
.*     any character, zero or more times (zero is probably not what you want)
$      "anchor" to end of string
{1,7}  repetition count for the preceding item, e.g. X{1,7} means 1 to 7 X's;
       ${1,7} is meaningless, as a string has only one end ($).

see   man regexp(5)
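
To see why `.*' is too permissive here, the breakdown above can be tried directly. This is just an illustrative sketch (the function name classify is invented): it compares the loose pattern with a strict alphanumeric one on the same input.

```shell
#!/bin/sh
# Report whether the argument matches the loose pattern ^.*$ and the
# strict pattern ^[a-zA-Z0-9]+$ (one or more alphanumeric characters).
classify() {
    printf '%s\n' "$1" | awk '{
        loose  = ($0 ~ /^.*$/)           ? "yes" : "no"
        strict = ($0 ~ /^[a-zA-Z0-9]+$/) ? "yes" : "no"
        print "loose=" loose " strict=" strict
    }'
}

classify ''        # prints loose=yes strict=no   (empty string slips through)
classify 'a b!'    # prints loose=yes strict=no   (blank and "!" slip through)
classify abc123    # prints loose=yes strict=yes
```

The loose pattern accepts anything at all, including nothing, which is why the character class with `+' is the better starting point.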

Best regards, Torfinn



Sat, 02 Aug 2003 17:27:18 GMT  
 