How is this done in awk/nawk/gawk? 

Hi,

I have this very simple shell script which is working fine:

***********
#!/bin/ksh

for i in 1 5 13 44 59 65 101 200 330 449 660
do
        grep "^test" $i > $i.out
done
***********

        Some of the lines in the files 1 5 13 44 59 65 101 200 330 449 660
start with the word "test".

I'm just curious as to how to do this with awk/nawk/gawk?

        I have tried using the script shown below but I can't make it
work.  Maybe the "$i" is not being read, or maybe the "for" statement is
used differently in awk/nawk/gawk.  If this is the case, how is this
done in awk/nawk/gawk?

***********
#!/bin/ksh

for i in 1 5 13 44 59 65 101 200 330 449 660
do
        nawk '$1 ~ /test/ {print $0}' $i > $i.out
done
***********

Thanks,
Albert



Fri, 04 May 2001 03:00:00 GMT  

Quote:

> Hi,

> I have this very simple shell script which is working fine:

> ***********
> #!/bin/ksh

> for i in 1 5 13 44 59 65 101 200 330 449 660
> do
>         grep "^test" $i > $i.out
> done
> ***********

>         The files 1 5 13 44 59 65 101 200 330 449 660 starts with the
> word "test" in some of the lines in these files.

> I'm just curious as to how to do this with awk/nawk/gawk?

>         I have tried using the script shown below but I can't make it
> work.  May be the "$i" is not being read or maybe the "for" statement is
> differently used in awk/nawk/gawk.  If this is the case, how is this
> done in awk/nawk/gawk?

> ***********
> #!/bin/ksh

> for i in 1 5 13 44 59 65 101 200 330 449 660
> do
>         nawk '$1 ~ /test/ {print $0}' $i > $i.out
> done
> ***********

> Thanks,
> Albert

First of all, your regular expression is different in the nawk version:
it lacks the "^" anchor (although at worst, you would only get more lines
than you bargained for). I just made that small change in the regexp.
Apart from that, it should work. I tested on our Sun using nawk and ksh,
and it gave proper results.

#!/bin/ksh

for i in 1 5 13 44 59 65 101 200 330 449 660
do
        nawk '$1 ~ /^test/ {print $0}' $i > $i.out
done

Cesar
--
Please remove the uppercase characters from my e-mail address for the
real thing



Fri, 04 May 2001 03:00:00 GMT  

Quote:

>> I have this very simple shell script which is working fine:
>> ***********
>> #!/bin/ksh
>> for i in 1 5 13 44 59 65 101 200 330 449 660
>> do
>>         grep "^test" $i > $i.out
>> done
>> ***********
>>         The files 1 5 13 44 59 65 101 200 330 449 660 starts with the
>> word "test" in some of the lines in these files.

>> I'm just curious as to how to do this with awk/nawk/gawk?

>> ***********
>> #!/bin/ksh
>> for i in 1 5 13 44 59 65 101 200 330 449 660
>> do
>>         nawk '$1 ~ /test/ {print $0}' $i > $i.out
>> done
>> ***********
>> Thanks,
>> Albert

To be pedantic, you don't need the {print $0} construct:
awk (by default) prints any record that satisfies the pattern.

i.e.       nawk '$1 ~ /^test/' $i > $i.out

is sufficient.
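
A quick way to convince yourself, as a rough sketch only: the three sample lines below are made up for illustration and are not from the original files.

```shell
# Hypothetical sample input; any POSIX awk (awk, nawk, gawk) should agree.
sample='test alpha
other beta
test gamma'

# Pattern with an explicit action.
with_action=$(printf '%s\n' "$sample" | awk '$1 ~ /^test/ { print $0 }')

# Pattern only: when no action is given, awk prints the matching record.
pattern_only=$(printf '%s\n' "$sample" | awk '$1 ~ /^test/')

[ "$with_action" = "$pattern_only" ] && echo "same output"
```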
Mark
--
Mark Katz
ISPC, London - Innovation in data-delivery tools
Tel: (44) 181-455 4665, Fax (44) 181-458 9554
** See our website at http://www.efiche.com **



Fri, 04 May 2001 03:00:00 GMT  

Quote:


>>> I have this very simple shell script which is working fine:
>>> ***********
>>> #!/bin/ksh
>>> for i in 1 5 13 44 59 65 101 200 330 449 660
>>> do
>>>         grep "^test" $i > $i.out
>>> done
>>> ***********
>>>         The files 1 5 13 44 59 65 101 200 330 449 660 starts with the
>>> word "test" in some of the lines in these files.

>>> I'm just curious as to how to do this with awk/nawk/gawk?

>>> ***********
>>> #!/bin/ksh
>>> for i in 1 5 13 44 59 65 101 200 330 449 660
>>> do
>>>         nawk '$1 ~ /test/ {print $0}' $i > $i.out
>>> done
>>> ***********
>>> Thanks,
>>> Albert

>To be pedantic, you dont need the {print $0} construct...
>awk (by default) writes any record that satisfies the 'test'

>ie.       nawk ' $1 ~ /^test/' $i > $i.out

>Is sufficient
>Mark

If we're going to get pedantic, then in this problem, it is not
necessary to specify the first field, but rather to go with the
default and check the line for "test" at the beginning:

i.e.       awk '/^test/' $i > $i.out

Is sufficient  :-)

But, I think the question may have been how to do the entire
script in awk/gawk, and to do that I'd suggest this:

gawk -f awkfile.awk 1 5 13 44 59 65 101 200 330 449 660

where awkfile.awk is:

{fileout=FILENAME ".out"}
/^test/ {print > fileout}

The above could also be put on the command line, something like:

gawk '{fileout=FILENAME ".out"}
      /^test/ {print > fileout}' 1 5 13 44 59 65 101 200 330 449 660
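
For anyone who wants to try the single-process approach safely first, here is a small sketch. The scratch directory and the contents of the three files are assumptions made up for the demo; plain awk is used, but gawk or nawk should behave the same.

```shell
# Hypothetical demo data: three small files in a scratch directory.
dir=$(mktemp -d)
printf 'test a\nskip b\n'         > "$dir/1"
printf 'skip c\ntest d\ntest e\n' > "$dir/5"
printf 'nothing here\n'           > "$dir/13"

# One awk process reads every file; FILENAME names the file being read.
awk '{ fileout = FILENAME ".out" }
     /^test/ { print > fileout }' "$dir/1" "$dir/5" "$dir/13"

cat "$dir/1.out"     # test a
cat "$dir/5.out"     # test d, then test e
ls "$dir"            # no 13.out: awk creates a file only on the first print
```

One difference from the shell loop is worth knowing: the loop's redirection creates an (empty) .out file even when grep finds nothing, while this version leaves no output file for an input with no matching lines.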

Chuck Demas
Needham, Mass.

--
  Eat Healthy    |   _ _   | Nothing would be done at all,

  Die Anyway     |    v    | That no one could find fault with it.



Fri, 04 May 2001 03:00:00 GMT  

Quote:
> I have tried using the script shown below but I can't make it
> work. May be the "$i" is not being read or maybe the "for" statement
> is differently used in awk/nawk/gawk. If this is the case, how is this
> done in awk/nawk/gawk?

If the Bourne/Korn shell loop is working with the grep command, it
will work with the awk command.

This grep command

     grep '^test' $i >$i.out

translates into this awk command

     awk '$0 ~ /^test/ { print $0 }' $i >$i.out

which reduces to this

     awk '/^test/ { print $0 }' $i >$i.out

then this

     awk '/^test/ { print }' $i >$i.out

then this

     awk '/^test/' $i >$i.out

Note that this expression

     $1 ~ /test/

is NOT the same as this

     $0 ~ /^test/

and will, in general, not match the same text!
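
As an illustration of the difference (the four sample lines are invented for this sketch, not taken from the original poster's files):

```shell
# Hypothetical sample input covering the interesting cases.
sample='test one
  test two
contest three
a test'

# Unanchored match against the first field.
loose=$(printf '%s\n' "$sample" | awk '$1 ~ /test/')

# Anchored match against the whole record, like grep "^test".
strict=$(printf '%s\n' "$sample" | awk '$0 ~ /^test/')

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
```

The loose form also picks up the indented line and "contest three"; the strict form keeps only "test one".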

For this specific (trivial) case, grep is the better choice of tool.

--
Jim Monty

http://www.primenet.com/~monty/
Tempe, Arizona USA



Fri, 04 May 2001 03:00:00 GMT  

Quote:




>>>> I have this very simple shell script which is working fine:
>>>> ***********
>>>> #!/bin/ksh
>>>> for i in 1 5 13 44 59 65 101 200 330 449 660
>>>> do
>>>>         grep "^test" $i > $i.out
>>>> done
>>>> ***********
>>>>         The files 1 5 13 44 59 65 101 200 330 449 660 starts with the
>>>> word "test" in some of the lines in these files.

>>>> I'm just curious as to how to do this with awk/nawk/gawk?

>>>> ***********
>>>> #!/bin/ksh
>>>> for i in 1 5 13 44 59 65 101 200 330 449 660
>>>> do
>>>>         nawk '$1 ~ /test/ {print $0}' $i > $i.out
>>>> done
>>>> ***********
>>>> Thanks,
>>>> Albert

>>To be pedantic, you dont need the {print $0} construct...
>>awk (by default) writes any record that satisfies the 'test'

>>ie.       nawk ' $1 ~ /^test/' $i > $i.out

>>Is sufficient
>>Mark

> If we're going to get pedantic, then in this problem, it is not
> necessary to specify the first field, but rather to go with the
> default and check the line for "test" at the beginning:

> i.e.       awk '/^test/' $i > $i.out

> Is sufficient  :-)

> But, I think the question may have been how to do the entire
> script in awk/gawk, and to do that I'd suggest this:

> gawk -f awkfile.awk 1 5 13 44 59 65 101 200 330 449 660

> where awkfile.awk is:

> {fileout=FILENAME ".out"}
> /^test/ {print > fileout}

> The above could also be put on the command line, something like:

> gawk '{fileout=FILENAME ".out"}
>       /^test/ {print > fileout}' 1 5 13 44 59 65 101 200 330 449 660

> Chuck Demas
> Needham, Mass.

Continuing the pedantry:

gawk '1==FNR {
   if (fileout) close(fileout) # don't exhaust file descriptors
   fileout=FILENAME ".out"
   }
/^test/ {print > fileout}' 1 5 13 44 59 65 101 200 330 449 660

would be more efficient, since fileout is set once per file instead of
once per record.  And rather than keeping a separate awk script file to
pass with -f, make the script itself a standalone executable:

#!/path/to/gawk -f
1==FNR {
   if (fileout) close(fileout) # don't exhaust file descriptors
   fileout=FILENAME ".out"
   }
/^test/ {print > fileout}

then, once the file is marked executable (chmod +x awkfile), you would
simply run:

awkfile 1 5 13 44 59 65 101 200 330 449 660
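
The 1==FNR rule works because FNR restarts at 1 for each input file, while NR keeps counting across all of them. A tiny sketch, using two throwaway files whose names and contents are assumptions for the demo:

```shell
# Hypothetical demo files: "a" has two lines, "b" has one.
dir=$(mktemp -d)
printf 'x\ny\n' > "$dir/a"
printf 'z\n'    > "$dir/b"

# Count how often FNR == 1 fires, and report the total record count.
starts=$(awk 'FNR == 1 { n++ } END { print n, NR }' "$dir/a" "$dir/b")
echo "$starts"   # 2 3  (two file starts, three records overall)
```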



Opinions expressed herein are my own and may not represent those of my employer.


Fri, 04 May 2001 03:00:00 GMT  

Quote:


> > To be pedantic, you dont need the {print $0} construct...
> > awk (by default) writes any record that satisfies the 'test'

> > ie.       nawk ' $1 ~ /^test/' $i > $i.out

> > Is sufficient

> If we're going to get pedantic, then in this problem, it is not
> necessary to specify the first field, but rather to go with the
> default and check the line for "test" at the beginning:

> i.e.       awk '/^test/' $i > $i.out

> Is sufficient  :-)

The difference between

     $1 ~ /^test/

and

     $0 ~ /^test/

is not a pedantic one and, as I stated in my earlier post, these
two operations will not, in general, match the same text. The two
expressions above are no more the same than these two similar
expressions:

     $1 == "test"

     $0 == "test"

For the specific case where a record contains exactly the four
characters "t", "e", "s", and "t", in that order, it happens that
both of these expressions will have the same value: 1 (or true).
But this coincidence does not imply that the two tests are
equivalent.

The same principle applies to the regular expression pattern
matching operations above. The value of both expressions will be
1 only if, by chance, the first four characters of the record are
"t", "e", "s", and "t". (Some pedantry about the value of FS goes
here. ;-)
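
A short sketch of the string-comparison case, with three made-up input lines (assumptions for illustration only):

```shell
# Hypothetical input: "test", "test more", and "  test" (indented).
counts=$(printf 'test\ntest more\n  test\n' |
  awk '$1 == "test" { first++ }
       $0 == "test" { whole++ }
       END { print first, whole }')
echo "$counts"   # 3 1
```

With the default FS, all three records have "test" as their first field, but only one record is exactly "test".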

--
Jim Monty

http://www.primenet.com/~monty/
Tempe, Arizona USA



Sat, 05 May 2001 03:00:00 GMT  


Quote:


>> > To be pedantic, you dont need the {print $0} construct...
>> > awk (by default) writes any record that satisfies the 'test'

>> > ie.       nawk ' $1 ~ /^test/' $i > $i.out

>> > Is sufficient

>> If we're going to get pedantic, then in this problem, it is not
>> necessary to specify the first field, but rather to go with the
>> default and check the line for "test" at the beginning:

>> i.e.       awk '/^test/' $i > $i.out

>> Is sufficient  :-)

>The difference between

>     $1 ~ /^test/

>and

>     $0 ~ /^test/

>is not a pedantic one and, as I stated in my earlier post, these
>two operations will not, in general, match the same text. The two
>expressions above are no more the same than these two similar
>expressions:

My point is that a line that starts with "test" (original poster's
problem constraint) will be selected by any of the following:

$1 ~ /^test/
$0 ~ /^test/
/^test/

but using $1 will cause the line to be parsed by awk/gawk.
That's added overhead.  And that overhead should be considered
in choosing which way to "skin the cat."

Alright, I didn't spell it all out.  I type slowly, and I'm lazy.  :-)

Yes, a line that starts as "    test" will be selected by $1 ~ /test/
and not by the others, so they aren't the same, there is a difference;
but you're being far TOO PEDANTIC about this.  :-)

Try to remember how the original problem was stated.

Chuck Demas
Needham, Mass.


Quote:

>     $1 == "test"

>     $0 == "test"

>For the specific case where a record contains exactly the four
>characters "t", "e", "s", and "t", in that order, it happens that
>both of these expressions will have the same value: 1 (or true).
>But this coincidence does not imply that the two tests are
>equivalent.

>The same principle applies to the regular expression pattern
>matching operations above. The value of both expressions will be
>1 only if, by chance, the first four characters of the record are
>"t", "e", "s", and "t". (Some pedantry about the value of FS goes
>here. ;-)

>--
>Jim Monty

>http://www.primenet.com/~monty/
>Tempe, Arizona USA

--
  Eat Healthy    |   _ _   | Nothing would be done at all,

  Die Anyway     |    v    | That no one could find fault with it.



Sat, 05 May 2001 03:00:00 GMT  

Quote:





>>> > To be pedantic, you dont need the {print $0} construct...
>>> > awk (by default) writes any record that satisfies the 'test'

>>> > ie.       nawk ' $1 ~ /^test/' $i > $i.out

>>> > Is sufficient

>>> If we're going to get pedantic, then in this problem, it is not
>>> necessary to specify the first field, but rather to go with the
>>> default and check the line for "test" at the beginning:

>>> i.e.       awk '/^test/' $i > $i.out

>>> Is sufficient  :-)

>>The difference between

>>     $1 ~ /^test/

>>and

>>     $0 ~ /^test/

>>is not a pedantic one and, as I stated in my earlier post, these
>>two operations will not, in general, match the same text. The two
>>expressions above are no more the same than these two similar
>>expressions:

> My point is, that a line that starts with "test" (original poster's
> problem constraint) will be selected by any of the following:

> $1 ~ /^test/
> $0 ~ /^test/
> /^test/

Of these,  only "$0 ~ /^test/" and "/^test/" match the original
grep pattern of "^test".  "$1 ~ /^test/"  will also match lines
that begin with whitespace (blanks and tabs) before the word test:

$ printf '     \t\t\ttest\n' | awk '$1 ~ /^test/ { print $1 }'
test

This is a quirk of the default field separator, which skips leading
whitespace.  If you set FS to some other single character, each leading
FS character produces an empty field instead.

In tests I ran,  I could detect no improvement in execution time
between "$0 ~ /^test/" and  "/^test/" either with awk (nawk) or
gawk.
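
To make the FS point concrete, here is a sketch using ":" as an arbitrary example separator (my choice for the demo, nothing from the original problem):

```shell
# Default FS: leading blanks are skipped, so "test" is field 1.
default_fs=$(printf '   test\n' | awk '{ print NF ":" $1 }')
echo "$default_fs"   # 1:test

# FS set to ":": each leading separator yields an empty field,
# so "test" ends up in field 4 and $1 is empty.
colon_fs=$(printf ':::test\n' | awk -F: '{ print NF ":" $1 ":" $4 }')
echo "$colon_fs"     # 4::test

# Consequently $1 ~ /^test/ no longer selects the line.
matched=$(printf ':::test\n' | awk -F: '$1 ~ /^test/')
[ -z "$matched" ] && echo "no match with FS=:"
```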


Quote:

> but using $1 will cause the line to be parsed by awk/gawk.
> That's added overhead.  And that overhead should be considered
> in choosing which way to, "skin the cat."

> Alright, I didn't spell it all out.  I type slowly, and I'm lazy.  :-)

> Yes, a line that starts as "    test" will be selected by $1 ~ /test/
> and not by the others, so they aren't the same, there is a difference;
> but you're being far TOO PEDANTIC about this.  :-)

> Try to remember how the original problem was stated.

> Chuck Demas
> Needham, Mass.

>>     $1 == "test"

>>     $0 == "test"

>>For the specific case where a record contains exactly the four
>>characters "t", "e", "s", and "t", in that order, it happens that
>>both of these expressions will have the same value: 1 (or true).
>>But this coincidence does not imply that the two tests are
>>equivalent.

>>The same principle applies to the regular expression pattern
>>matching operations above. The value of both expressions will be
>>1 only if, by chance, the first four characters of the record are
>>"t", "e", "s", and "t". (Some pedantry about the value of FS goes
>>here. ;-)

>>--
>>Jim Monty

>>http://www.primenet.com/~monty/
>>Tempe, Arizona USA

> --
>   Eat Healthy    |   _ _   | Nothing would be done at all,

>   Die Anyway     |    v    | That no one could find fault with it.


Opinions expressed herein are my own and may not represent those of my employer.


Sat, 05 May 2001 03:00:00 GMT  
 