AWK vs PERL speed question 

I have an AWK script that I can run that basically searches through a
file looking for a match using this script:

"BEGIN { FOUND=0; } /^[0]*1547/ { FOUND=1; print; next; } /^\/\// { if
(FOUND) { print; next } else { next } } { if (FOUND) { exit } else {
next } } " filename

Using AWK this process takes about .85 seconds to execute.
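
For anyone following along, the one-line program above can be laid out one rule per line. This is only a sketch: the sample file and its contents are made up for illustration, and the lower-case variable name is a cosmetic change.

```shell
# Hypothetical sample data: an unrelated entry, the wanted entry with its
# "//" continuation lines, then the start of the next entry.
sample=$(mktemp)
printf '%s\n' '00001546 other message' '// text' \
    '00001547 wanted message' '// detail one' '// detail two' \
    '00001548 next message' '// unrelated' > "$sample"

out=$(awk '
BEGIN        { found = 0 }
/^[0]*1547/  { found = 1; print; next }   # header line of the wanted entry
/^\/\//      { if (found) print; next }   # continuation lines, printed only after the header
             { if (found) exit }          # any other line after the header ends the entry
' "$sample")

printf '%s\n' "$out"
rm -f "$sample"
```

On this sample it should print the 00001547 line and the two "//" lines that follow it, then stop at the 00001548 line.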

I have (based on very limited PERL knowledge) translated the above
script into the following PERL script:

$found = 0;
if (! open (MSGFILE, "<$errfile"))
{
  print "ERROR - Can't open error message file $errfile...exiting\n";
  exit;
}

while (<MSGFILE>)
{
  if ($found == 0)
  {
    if ($_ =~ /^[0]*1547/)
    {
      $found=1;
      print "$_";
    }
  }
  else
  {
    if ($_ =~ /^\/\//)
    {
      print "$_";
    }
    else
    {
      last;
    }
  }
}

close MSGFILE;

Using PERL it now takes somewhere around 3 seconds to execute. I assume
that this is due to PERL reading the file sequentially line-by-line. I
don't know how AWK processes the file, but it is obviously much faster.

Does anyone have an alternative way to write this script so that it will
run faster?

Email preferred but not required!

--
Thanks,
================================================================
Mike Campbell                  Phone:    1-800-603-5500 x5472

Intergraph Corp.               Fax:      (205)730-1110
Huntsville, AL 35894-0001      Mailstop: CR071



Sat, 15 Aug 1998 03:00:00 GMT  
: I have an AWK script that I can run that basically searches through a
: file looking for match using this script:

: "BEGIN { FOUND=0; } /^[0]*1547/ { FOUND=1; print; next; } /^\/\// { if
: (FOUND) { print; next } else { next } } { if (FOUND) { exit } else {
: next } } " filename

: Using AWK this process takes about .85 seconds to execute.

: I have (based on very limited PERL knowledge) translated the above
: script into the following PERL script:

[script deleted]

: Does anyone have an alternative way to write this script so that it will
: run faster?

Depending on your CPU and I/O speed, part of the difference is
Perl's load time -- you can't get around that.  But this should
cut your number of ops per line down, especially on that critical
first loop:

        open(MSGFILE, "$errfile") ||
          die "ERROR - Can't open error message file $errfile...exiting\n";

        while (<MSGFILE>) {
            (print, last) if /^[0]*1547/;
        }

        while (<MSGFILE>) {
            (print, next) if m#^//#;
            last;
        }

        close MSGFILE;

Of course my awk (and system!) may not be as fast as yours, but on a
38,400 line file, with the match at line 38,000, this was awk:

        bill:/tmp% date; awk -f test.awk < test.words ; date
        Tue Feb 27 15:18:11 EST 1996
        00001547
        //
        //
        //
        Tue Feb 27 15:19:06 EST 1996
        bill:/tmp%

This was Perl, with my program above:

        bill:/tmp% date ; perl test.pl < test.words ; date
        Tue Feb 27 15:22:19 EST 1996
        00001547
        //
        //
        //
        Tue Feb 27 15:22:23 EST 1996
        bill:/tmp%

4 seconds for Perl vice 47 seconds for awk.  I don't know if
it is the awk on my Linux, but....

--
Regards,
Mike Heins     [mailed and posted]  http://www.iac.net/~mikeh ___       ___
                                    Internet Robotics        |_ _|____ |_ _|
Few blame themselves until they     131 Willow Lane, Floor 2  | ||  _ \ | |
have exhausted all other            Oxford, OH  45056         | || |_) || |
possibilities.                                               |___|  _ <|___|



Sat, 15 Aug 1998 03:00:00 GMT  
[comp.lang.perl deleted from newsgroups line. c.l.p. left us a long
while ago now]


Quote:


>: I have an AWK script that I can run that basically searches through a
>: file looking for match using this script:
>: "BEGIN { FOUND=0; } /^[0]*1547/ { FOUND=1; print; next; } /^\/\// { if
>: (FOUND) { print; next } else { next } } { if (FOUND) { exit } else {
>: next } } " filename

I ran this using "time" (38002 lines of random numbers, with your number
at line 38001):

5.129s real  2.780s user  0.520s system  64% awk  /tmp/t

I've not tried your posted way, but here are two variants:
% time perl -ne 'print if /^0*1547/..1' /tmp/t
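This uses the ".." bistable (range) operator - which works a little like a sed
address-pair. Once the first operand matches, the range stays true until the
second operand becomes true. A constant second operand is compared with the
current line number ($.), so "1" can only end the range on line 1 and the
output runs on to the end of the file.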
5.596s real  2.380s user  0.510s system  51% perl

The user time is a little lower than awk's (2.380s against 2.780s), but not by much.

If you want it to go more quickly, at the expense of some memory, read
the whole file in one go by undefining $/ first:

perl -e 'undef $/; $_=<>; s/.*^0*1547//ms; print' /tmp/t

3.074s real  0.390s user  0.560s system  30% perl

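Note the use of both the "m" and "s" flags at once on the regular
expression. Despite the "multi-line" and "single-line" mnemonics traditionally
associated with them, the combination does make sense: "m" lets ^ match at the
start of every line, and "s" lets "." match newlines.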

Quote:
>4 seconds for Perl vice 47 seconds for awk.  I don't know if
>it is the awk on my Linux, but....

There's something wrong there. Most of the GNU utilities are pretty
brisk, and my awk (Gnu Awk (gawk) 2.15, patchlevel 5) seems to keep pace
quite well.

But do use the "time" command - you get more meaningful results.

BTW, I used 0* rather than [0]* - those brackets are certainly not
required in Perl, and I'd be surprised if they had any effect in awk,
either, apart from to slow things down a little.
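
A quick way to confirm that the brackets change nothing about which lines are selected, using a few made-up input lines:

```shell
# Hypothetical input: two lines that should match and two that should not.
lines='00001547 a
1547 b
01546 c
x1547 d'

with_brackets=$(printf '%s\n' "$lines" | awk '/^[0]*1547/')
without_brackets=$(printf '%s\n' "$lines" | awk '/^0*1547/')

printf 'with brackets:\n%s\n\nwithout brackets:\n%s\n' "$with_brackets" "$without_brackets"
```

Both patterns should select the same two lines. Whether one form is measurably slower depends on the awk in use.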

Ian
--
    I am confident this explanation will dispell any feelings
    of certainty that may have been troubling you.



Sat, 15 Aug 1998 03:00:00 GMT  
: [comp.lang.perl deleted from newsgroups line. c.l.p. left us a long
: while ago now]




: >: I have an AWK script that I can run that basically searches through a
: >: file looking for match using this script:

: >: "BEGIN { FOUND=0; } /^[0]*1547/ { FOUND=1; print; next; } /^\/\// { if
: >: (FOUND) { print; next } else { next } } { if (FOUND) { exit } else {
: >: next } } " filename
[snip]
: >4 seconds for Perl vice 47 seconds for awk.  I don't know if
: >it is the awk on my Linux, but....

: There's something wrong there. Most of the GNU utilities are pretty
: brisk, and my awk (Gnu Awk (gawk) 2.15, patchlevel 5) seems to keep pace
: quite well.

: But do use the "time" command - you get more meaningful results.

Here it is, with the posted awk program (minus the quotes and the filename):

    bill:/tmp% time awk -f test.awk test.words
    00001547
    //
    //
    //
    6.910u 38.380s 0:45.74 99.0% 0+0k 0+0io 76pf+0w
    bill:/tmp% time perl test.pl test.words
    00001547
    //
    //
    //
    2.980u 0.280s 0:03.29 99.0% 0+0k 0+0io 18pf+0w
    bill:/tmp%  gawk --version
    Gnu Awk (gawk) 2.15, patchlevel 6
    bill:/tmp%

: BTW, I used 0* rather than [0]* - those brackets are certainly not
: required in Perl, and I'd be surprised if they had any effect in awk,
: either, apart from to slow things down a little.

There was no meaningful difference in either the Perl or the awk
with that change.  Perhaps Perl handles my selection of /usr/dict/words
better than awk (short lines, first char never matches).

--
Regards,
Mike Heins     [mailed and posted]  http://www.iac.net/~mikeh ___       ___
                                    Internet Robotics        |_ _|____ |_ _|
Few blame themselves until they     131 Willow Lane, Floor 2  | ||  _ \ | |
have exhausted all other            Oxford, OH  45056         | || |_) || |
possibilities.                                               |___|  _ <|___|



Sat, 15 Aug 1998 03:00:00 GMT  
Get a life


Mon, 17 Aug 1998 03:00:00 GMT  
 