CONTEST: Range Searching 
Author Message
 CONTEST: Range Searching

We're lately been stricken with questions about what comes down to
variations on the same essential theme: how to search a stream for matches
and then produce as output some number of lines before or after the match
as well as the matched line.  This has to work efficiently for an input
stream of unlimited length.

Perl's two range operators lend themselves to simple solutions for some
of these kinds of things, like

    # print lines between PAT1 and PAT2
    while (<>) {
        print if /PAT1/ .. /PAT2/;
    }

But that's not good for the general case of catching things before or
counting things afterwards

So here's the challenge: devise idiomatic solutions to the
following problems.  

    1) Write a "patfore" program that prints out up to N lines
       before the match as well as the match itself.  Here's
       the usage message:

            patfore [-B N] pattern [files ...]

    2) Write a "pataft" program that prints out up to N lines
       after the match as well as the match itself.  Here's
       the usage message:

            pataft [-A N] pattern [files ...]

    3) Write a "patba" program that prints out up to
       X lines before the match and Y lines after the match.

            patba [-A X] [-B T] pattern [files ...]

       or both N before and after:

            patba [-C N] pattern [files ...]

For Extra Credit:
    Provide alternate solutions that also coalesce with overlapping
    ranges.  For example, if you ask for 2 lines before and after, and
    lines 2, 4, and 5 all contain matches, the output should comprise
    lines [1-7] rather three separate output blocks showing lines.
    [1-4], [2-6], and [3-7].

Solutions will be judged upon these overall criteria:

    Objective:

      * correctness: does it actually do the right thing?  if not,
        nothing else matters. :-)
      * space efficiency: don't use more space than minimally needed
      * time efficiency: is your solution fast?
      * test coverage: do you include test data to check all border cases?

    Subjective:

      * conciseness:  keep it short.  no frills.  just a few lines.
      * clarity: is this understandable to a native perl speaker?
      * idiom: does this look like natural perl?  does it use cool
        perl features absent in other languages?
      * creativity: is your solution cleverly different from those
        of others?

To see what I mean by "idiomatic", here's an example of something I just
provided that is of a similar nature to these problems:

    # Print all line pairs where the first one ends
    # in YIN and the next starts with YANG.  
    while (<DATA>) {
        $n = /YIN$/ ... $n or next;
        $pair .= $_;
        1 - $n || next;
        /^YANG/ && print $pair;
        $pair = '';
    }

If you wanted to be more obvious, you could certainly write that as:

    while (<DATA>) {
        $n = /YIN$/ ... $n or next;
        $pair .= $_;
        next if $n == 1;
        print $pair if /^YANG/;
        $pair = '';
    }

The point of that demo is that these problems shouldn't take zillions of
lines to code up, and also that I consider range operators plenty obvious
to native perl speakers and potentially useful to this problem domain.
You don't *have* to use them, and some solutions don't so clearly call
for them as did this one, but do please feel free to employ them if you're

The prize is that winners will have their solutions published in the FAQ,
and if it's cool enough, perhaps even featured in the Perl Journal.

Please post all solutions and discussion to Usenet.

Good luck, and thanks for helping save the world!

--tom
--
"Without knowing what I am and why I am here, life is impossible."
                                - Leo Tolstoy  



Mon, 04 Mar 2002 03:00:00 GMT  
 CONTEST: Range Searching

Quote:

> [...]
>     # Print all line pairs where the first one ends
>     # in YIN and the next starts with YANG.  
>     while (<DATA>) {
>        $n = /YIN$/ ... $n or next;
>        $pair .= $_;
>        1 - $n || next;
>        /^YANG/ && print $pair;
>        $pair = '';
>     }

Perhaps it was deliberate, but the above code doesn't snag the second
pair if it overlaps with the first, as in

YIN
YANGYIN
YANG

Cheers, Lew



Mon, 04 Mar 2002 03:00:00 GMT  
 CONTEST: Range Searching


Quote:
>So here's the challenge: devise idiomatic solutions to the
>following problems.  

>    1) Write a "patfore" program that prints out up to N lines
>       before the match as well as the match itself.  Here's
>       the usage message:

>        patfore [-B N] pattern [files ...]

#!/usr/bin/perl -w
use strict;
# Find a pattern and print lines before the pattern.  By Kragen Sitaker,
# 1999-09-16.

# Number of lines to output.
my $n = 4;
my $pattern;


        $ARGV[0] eq '-B' and do {
                shift;  
                $n = 1 + shift;
                next arg;
        };
        do {
                $pattern = shift;
                last arg;
        };

Quote:
}

# This is probably a lousy way to determine whether we ran out of args.
die "Usage: patfore [-B N] pattern [files ...]\n" if not defined $pattern;


while (<>) {


        if (/$pattern/o) {

                # If the next line contains the pattern again, we don't
                # want to output this line again.

        }

Quote:
}
>    2) Write a "pataft" program that prints out up to N lines
>       after the match as well as the match itself.  Here's
>       the usage message:

>        pataft [-A N] pattern [files ...]

#!/usr/bin/perl -w
use strict;
# Print out lines matching a pattern and lines following them.
# Kragen Sitaker, 1999-09-16.

# Number of lines to print.
my $after = 4;
my $pattern;


        $ARGV[0] eq '-A' and do {
                shift;  
                $after = 1 + shift;
                next arg;
        };
        do {
                $pattern = shift;
                last arg;
        };

Quote:
}

# This is probably still a lousy way to determine whether we ran out of args.
die "Usage: pataft [-A N] pattern [files ...]\n" if not defined $pattern;

my $linestoprint = 0;
while (<>) {
        $linestoprint = $after if /$pattern/o;
        print if $linestoprint;
        --$linestoprint if $linestoprint;

Quote:
}
>    3) Write a "patba" program that prints out up to
>       X lines before the match and Y lines after the match.

>        patba [-A X] [-B T] pattern [files ...]

>       or both N before and after:

>        patba [-C N] pattern [files ...]

This program is *much* worse than the above programs.  I am much less
confident in its correctness.

#!/usr/bin/perl -w
use strict;
# Print out lines matching a pattern, a certain number of lines preceding
# the match lines, and a certain number of lines following them.
# Kragen Sitaker, 1999-09-16.

# Number of lines to print starting at the match and continuing on.
my $after = 4;
# Number of lines to print up to and including the match.
my $before = 4;

my $pattern;
# write sensible output?  Default is no.
# sensible output inserts a line saying ... in every place lines
# from the input file was left out, and prepends each output line copied
# from the input file with a space.  It's not the default because it wasn't
# in tchrist's spec.
# It doesn't affect the choice of which lines to output, just how to output
# them.
my $sensible = undef;


        $ARGV[0] eq '-A' and do {
                shift;  
                $after = 1 + shift;
                next arg;
        };
        $ARGV[0] eq '-B' and do {
                shift;  
                $before = 1 + shift;
                next arg;
        };
        $ARGV[0] eq '-C' and do {
                shift;
                $after = $before = 1 + shift;
                next arg;
        };
        $ARGV[0] eq '-s' and do {
                shift;
                $sensible = 'yes!';
                next arg;
        };
        do {
                $pattern = shift;
                last arg;
        };

Quote:
}

# This is probably even now a lousy way to determine whether we ran out
# of args.
die "Usage: patba [-s] [-A X] [-B T] [-C N] pattern [files ...]\n"
        if not defined $pattern;

# Every line goes exactly one place: either printed out or pushed on the
# preceding lines list.

my $linestoprint = 0;

# to determine if what we're printing out is contiguous with the last
# output text.

while (<>) {
        if ($linestoprint) {
                print $sensible ? (" ", $_) : $_;
                --$linestoprint;
        } else {


        }
        if (/$pattern/o) {

                # not full.
                print "...\n" if $sensible and $preceding_lines[0];

                # If the next line contains the pattern again, we don't
                # want to output this line again.

                $linestoprint = $after - 1;
        }

Quote:
}

# Did we read any lines we didn't print?
print "...\n" if $sensible and $preceding_lines[$#preceding_lines];

Quote:
>For Extra Credit:
>    Provide alternate solutions that also coalesce with overlapping
>    ranges.

Oops, sorry -- didn't see that.  I wouldn't want to try to use these
programs without that.

Quote:
>      * correctness: does it actually do the right thing?  if not,
>    nothing else matters. :-)

Well, I think that what I was hoping to do is the right thing.  I'm
certain my patfore and pataft do what I was hoping to do.  My patba
appears to do the right thing, but I have to go home now, so I am
posting this now :)

Quote:
>      * space efficiency: don't use more space than minimally needed

patba and pataft are probably perfect here.  patba uses more space than
minimally needed, but not a huge amount -- it keeps around an extra
line of lookbehind when, with just a few more lines of code, it could
just keep a boolean.  And it prepends a space to every line in
'sensible' mode.

Quote:
>      * time efficiency: is your solution fast?

Dunno.  :)

Quote:
>      * test coverage: do you include test data to check all border cases?

Here's patba.test, which includes only input, not expected output.  And
I don't think it's really complete, although it's not obvious to me
what I'm missing.  (Well, I know I'm missing the non-sensible cases,
but I'm confident those are at least as correct as the sensible ones.)

#!/bin/sh
../patba
read foo
../patba -C 3
read foo
../patba -s -C 0 pattern patba | less
../patba -s -C 1 pattern patba | less
../patba -s -C 2 pattern patba | less

patfore and pataft I tested by hand.

Quote:
>      * conciseness:  keep it short.  no frills.  just a few lines.

True of patfore and pataft.   Probably not true of patba.  You'll
probably think $sensible is a frill, but I think it's a necessity to
understand the output.

Quote:
>      * clarity: is this understandable to a native perl speaker?

I've never met one.  And don't you mean Perl, not perl?

Quote:
>      * idiom: does this look like natural perl?  does it use cool
>    perl features absent in other languages?

I don't know.  :)

Quote:
>      * creativity: is your solution cleverly different from those
>    of others?

Probably not.
--

Thu Sep 16 1999
53 days until the Internet stock bubble bursts on Monday, 1999-11-08.
<URL:http://www.pobox.com/~kragen/bubble.html>


Mon, 04 Mar 2002 03:00:00 GMT  
 CONTEST: Range Searching
 [courtesy cc of this posting mailed to cited author]


:Perhaps it was deliberate, but the above code doesn't snag the second
:pair if it overlaps with the first, as in
:
:YIN
:YANGYIN
:YANG

No, it wasn't really deliberate.  My test cases weren't good enough
to catch that.  

--tom
--
"Politics is a pendulum whose swings between anarchy and tyranny are fueled by
 perennially rejuvenated illusions."
                                - Albert Einstein



Mon, 04 Mar 2002 03:00:00 GMT  
 CONTEST: Range Searching

Quote:

> For Extra Credit:
>     Provide alternate solutions that also coalesce with overlapping
>     ranges.  For example, if you ask for 2 lines before and after, and
>     lines 2, 4, and 5 all contain matches, the output should comprise
>     lines [1-7] rather three separate output blocks showing lines.
>     [1-4], [2-6], and [3-7].

The extra credit problem is IMHO the only fun one so that's what
I'll try.  Also it's the most sensible, the first three problems
produce unexpected (to a human) output.

Quote:
> Solutions will be judged upon these overall criteria:

>     Objective:

>       * correctness: does it actually do the right thing?  if not,
>         nothing else matters. :-)
>       * space efficiency: don't use more space than minimally needed
>       * time efficiency: is your solution fast?
>       * test coverage: do you include test data to check all border cases?

I notice readability is not on the list, heh.  I believe this
solution is correct but I haven't put together truly rigorous
test cases so I can't be sure.  It's pretty space-efficient,
printing and forgetting unneeded data, and I suspect time-
efficiency is OK (though being I/O-bound is likely of course).

Quote:
>     Subjective:

>       * conciseness:  keep it short.  no frills.  just a few lines.
>       * clarity: is this understandable to a native perl speaker?
>       * idiom: does this look like natural perl?  does it use cool
>         perl features absent in other languages?
>       * creativity: is your solution cleverly different from those
>         of others?

The core code is just a few lines, please pardon the command
line arguments and whatnot.  Opinions may differ as to how
"native" my coding style is:  if I look too much like a
reformed Pascal programmer I can scuzz it up a bit!  :-)

#!/usr/bin/perl -w

# Parse the command-line switches.

$after = $before = 0; # Defaults.

   SWITCH: {
      $1 eq 'A' and $after  = $ARGV[1];
      $1 eq 'B' and $before = $ARGV[1];
      $1 eq 'C' and $before = $after = $ARGV[1];
   }
   shift, shift;

Quote:
}

# Parse the remaining command-line argument(s).



# Set constants and globals.

$prelude  = ">>>\n"; # Or the empty string;

$match = $skimmed = 0;

# The main loop.  The key is to track the last-matched
# line number in $match and of course the current line
# number in $. -- the difference between them tells us
# whether we're done matching and thus need to dump the
# context of the match(es) to STDOUT.

while (defined($line = <>)) {

   $match = $. if $line =~ /$regex/o;
   if ($match) {
      if ($. - $match > $before+$after) {
         &dump;
      } elsif ($. - $match > $before) {
         # One of the design goals is to run in the smallest
         # space possible.  That means immediately printing
         # any lines we know for a fact can be printed, and
         # shifting them off the buffer to free up their RAM.
         # The beauty is that we always know how near to the
         # _end_ of the buffer to dump, so we're free to skim
         # values off its _front_ if appropriate.
         print $prelude if !$skimmed++;

      }
   }
   # If we're not matching and we're saving too much context,
   # we can shift off one line of that context.

Quote:
}

# After we run off the end of the last file, if we are still
# matching, it can be a little tricky to properly print context
# around the last lines (consider the case where the last match
# was less than $before+$after but more than $after lines ago).
# The simplest way to handle it is to pretend we have already
# "read" lines containing the empty string, up to the point
# where we would normally dump the buffer.  That way there is
# no special case.

if ($match) {

   &dump;

Quote:
}

# Dump the contents of the circular buffer to screen.

sub dump {
   print
      $skimmed ? "" : $prelude,

      $postlude;

   $match = $skimmed = 0;

Quote:
}

__END__

Sample runs:

$ ./patba.pl -C 0 '^Aa|^Zu|perl' /usr/dict/words
Aarhus
Aaron
<<<
improperly
<<<
properly
<<<
superlative
superlatively
superlatives
<<<
Zulu
Zulus
Zurich
<<<

$ cat /usr/dict/words | ./patba.pl -C 2 '^Aa|^Zu|perl'
Aarhus
Aaron
Ababa
aback
<<<
impromptu
improper
improperly
impropriety
improve
<<<
propensity
proper
properly
properness
propertied
<<<
superiority
superiors
superlative
superlatively
superlatives
supermarket
supermarkets
<<<
Zoroaster
Zoroastrian
Zulu
Zulus
Zurich
<<<

--
 Jamie McCarthy



Mon, 04 Mar 2002 03:00:00 GMT  
 CONTEST: Range Searching

Quote:
>So here's the challenge: devise idiomatic solutions to the
>following problems.  

Can we abbreviate the solutions a bit --- omit the command-line
parsing as being boring?  While essential to an actual utility
the command-line parsing tends to clutter newsgroup postings
and obscure the stark beauty of any truly elegant solution.
(I am not making the claim that my code below qualifies as
"truly elegant", though I think my solution to (1) comes close.)

Quote:
>    1) Write a "patfore" program that prints out up to N lines
>       before the match as well as the match itself.  Here's
>       the usage message:

>        patfore [-B N] pattern [files ...]




Quote:
>    2) Write a "pataft" program that prints out up to N lines
>       after the match as well as the match itself.  Here's
>       the usage message:

>        pataft [-A N] pattern [files ...]

  /pattern/ and $end = $. + $N;
  print if $. <= $end;

Quote:
>    3) Write a "patba" program that prints out up to
>       X lines before the match and Y lines after the match.

>        patba [-A X] [-B T] pattern [files ...]

>       or both N before and after:

>        patba [-C N] pattern [files ...]

Assume that the command-line parsing of -C N sets $X=$T=$N .


  if (($matched = /pattern/) || $. <= $end) {


    $end = $matched && $. + $X;
  } else {

  }

Quote:
>For Extra Credit:
>    Provide alternate solutions that also coalesce with overlapping
>    ranges.  For example, if you ask for 2 lines before and after, and
>    lines 2, 4, and 5 all contain matches, the output should comprise
>    lines [1-7] rather three separate output blocks showing lines.
>    [1-4], [2-6], and [3-7].

The above solutions all coalesce; they become more cumbersome if
they are altered to avoid coalescing.

                --Ken Pizzini



Mon, 04 Mar 2002 03:00:00 GMT  
 CONTEST: Range Searching

Quote:

> We're lately been stricken with questions about what comes down to
> variations on the same essential theme: how to search a stream for matches
> and then produce as output some number of lines before or after the match
> as well as the matched line.  This has to work efficiently for an input
> stream of unlimited length.
<snip>
> So here's the challenge: devise idiomatic solutions to the
> following problems.  

>     1) Write a "patfore" program that prints out up to N lines
>        before the match as well as the match itself.
>     2) Write a "pataft" program that prints out up to N lines
>        after the match as well as the match itself.

>     3) Write a "patba" program that prints out up to
>        X lines before the match and Y lines after the match.

It seems to me that 3) does everything 1) and 2) do.

Therefore, here is my single solution:

#!perl -w

use Getopt::Std;

foreach ($opt_A, $opt_B, $opt_m) {$_ ||= 0}
$pat =~ s!^/(.*)/$!$1!;


  open IN or die "Can't open $IN: $!";



    if (/$pat/o) {
      print "---\n" if $noncontiguous;

      seek(IN, $lindex[0], 0);

        my $y = <IN>; last if !defined $y;
        print(($opt_m ? ($i == $#lindex ? '+ ' : '  ') : '') . $y);
      }

      $noncontiguous = 0;
    } else {$noncontiguous = 1}
  }

Quote:
}

sub usage {print <<"END" and exit}
$0 [-m] [-B lines] [-A lines] /pattern/ [files...]

  -m    Print indication of matched line
  -B n  Print n lines before the match
  -A n  Print n lines after the match

  Slashes on the pattern are optional.

END

__END__

Quote:
> Solutions will be judged upon these overall criteria:
<snip>
>     Subjective:
>       * conciseness:  keep it short.  no frills.  just a few lines.

Nope - plenty of frills here.

--
 Kevin Reid: |    Macintosh:      
  "I'm me."  | Think different.



Mon, 04 Mar 2002 03:00:00 GMT  
 CONTEST: Range Searching
     [courtesy cc of this posting mailed to cited author]

In comp.lang.perl.misc,



:}      
:$last_n++;

:       my $open_name = $file_name;
:       if ($file_name eq '-') {
:               $open_name = "<&STDIN";
:               $file_name = '';

Is there some reason why normal <ARGV> isn't what you want?
You seem to be emulating what it already does.


    while ($ARGV = shift) {
        open(ARGV, $ARGV) || do {
            warn "cannot open $ARGV: $!\n";
            next;
        };
        while (<ARGV>) {
        ...         # code for each line
        }
    }

--tom
--
If you want capitalism, go to Russia!



Mon, 04 Mar 2002 03:00:00 GMT  
 CONTEST: Range Searching


Quote:
> So here's the challenge: devise idiomatic solutions to the
> following problems.

>     1) Write a "patfore" program that prints out up to N lines
>        before the match as well as the match itself.  Here's
>        the usage message:

system("/gnu/egrep -B linesbefore whatever");

Quote:
>     2) Write a "pataft" program that prints out up to N lines
>        after the match as well as the match itself.  Here's
>        the usage message:

system("/gnu/egrep -A linesafter whatever");

Quote:
>     3) Write a "patba" program that prints out up to
>        X lines before the match and Y lines after the match.

system("/gnu/egrep -B linesbefore -A linesafter whatever");

Quote:
>     Provide alternate solutions that also coalesce with overlapping
>     ranges.  For example, if you ask for 2 lines before and after, and
>     lines 2, 4, and 5 all contain matches, the output should comprise
>     lines [1-7] rather three separate output blocks showing lines.
>     [1-4], [2-6], and [3-7].

Not quite sure what GNU egrep does there.  OK this was a silly post
but semi seriously if you don't mind the toolbox approach...

Sent via Deja.com http://www.deja.com/
Share what you know. Learn what you don't.



Mon, 04 Mar 2002 03:00:00 GMT  
 CONTEST: Range Searching


Quote:
>So here's the challenge: devise idiomatic solutions to the
>following problems.  

>    1) Write a "patfore" program that prints out up to N lines
>       before the match as well as the match itself.  Here's
>       the usage message:

>        patfore [-B N] pattern [files ...]

Sorry, only have time to solve one of them. Not as elegant
as some of the others, but it works and is conservative
with resources. It also collates ranges and processes
STDIN as an option.

#!/usr/bin/perl -w

use strict;
use Getopt::Long;

my $opts = {};
GetOptions($opts,'B=i');
my $last_n = defined ($opts->{'B'}) ? $opts->{'B'} : 0;

my $pattern   = shift;
if (not defined $pattern) {
        die <<"USAGE";
patfor [-B N] pattern [files ...]
USAGE

Quote:
}




Quote:
}      

$last_n++;

        my $open_name = $file_name;
        if ($file_name eq '-') {
                $open_name = "<&STDIN";
                $file_name = '';
        } else {
                $file_name .= ':';
        }      
        if (not open (THE_FILE,$open_name)) {
                warn("WARNING: Unable to open '$open_name' ($!) - skipping\n");
                next;
        }

        my $buffer_pointer  = -1;
        my $buffer_n        = 0;
        my $last_line       = 0;
        while (<THE_FILE>) {
                chomp;
                $buffer_pointer++;
                $buffer_pointer %= $last_n;
                $circular_buffer[$buffer_pointer] = "$file_name\[$.] $_\n";
                $buffer_n++ if ($buffer_n != $last_n);
                if (m/$pattern/o) {
                        my $buffer_offset = 1 - $buffer_n;
                        if (($. + $buffer_offset) <= $last_line) {
                                $buffer_offset = $last_line  - $. + 1;
                        }
                        my $start_at = $buffer_pointer + $buffer_offset;

                        $last_line = $.;
                }
        }
        close(THE_FILE) || die ("Failed to close handle for file '$open_name': $!\n");

Quote:
}

--
Benjamin Franz


Tue, 05 Mar 2002 03:00:00 GMT  
 CONTEST: Range Searching

Quote:

>     2) Write a "pataft" program that prints out up to N lines
>        after the match as well as the match itself.  Here's
>        the usage message:

>             pataft [-A N] pattern [files ...]

Okay, I'll take the easy one.

    perl -ne 'print if ($c=0, /pat/) .. $c++ == 3' files

If one really wanted to save this then something like this would do.

#!/usr/local/bin/perl -s

$A &&= shift;
$pat = shift;

pataft [-A N] pattern [files ...]
USAGE

while(<>){
    print if ($c = 0, /$pat/o) .. $c++ == $A;

Quote:
}

__END__

Of course we probably shouldn't encourage skipping -w (though I see no
need for 'use strict' for something so short).

Adding -w means changing

    $A &&= shift;
to
    $A = $A ? shift : 0;

Quote:
> For Extra Credit:
>     Provide alternate solutions that also coalesce with overlapping
>     ranges.  For example, if you ask for 2 lines before and after, and
>     lines 2, 4, and 5 all contain matches, the output should comprise
>     lines [1-7] rather three separate output blocks showing lines.
>     [1-4], [2-6], and [3-7].

A slight rearrangement after USAGE give this alternate solution:

$c = $A + 1;
while(<>){
    $c = 0 if /$pat/o;
    print  if $c++ <= $A;

Quote:
}

, which is pretty much like the other posters' solutions.  

But I don't like it because it doesn't make a nice one-liner as above.
I guess I still have grep.

--
Rick Delaney



Tue, 05 Mar 2002 03:00:00 GMT  
 CONTEST: Range Searching

Quote:
>    1) Write a "patfore" program that prints out up to N lines
>       before the match as well as the match itself.  Here's
>       the usage message:

>        patfore [-B N] pattern [files ...]

Two versions.  The first will output each match.  The second will coalesce
overlaps.  Both actually store more (maybe 100) lines than necessary, trading
a bit of space for a bit of time (less shifting of arrays).  They are still
space efficient though in that they won't slurp whole files.

#!/usr/bin/perl
# patfore [-B N] pattern [files ...]
# Each match by itself.


    ? (shift, shift)[1]
    : 3;
$max = $b > 100 ? $b : 100;

$re = shift;
defined $re or die "usage: patfore [-B N] pattern [files ...]\n";

while (<>) {



Quote:
}

__END__

#!/usr/bin/perl
# patfore [-B N] pattern [files ...]
# Will coalesce overlaps.


    ? (shift, shift)[1]
    : 3;
$max = $b > 100 ? $b : 100;

$re = shift;
defined $re or die "usage: patfore [-B N] pattern [files ...]\n";

while (<>) {

    unless (/$re/o) {

        next;
    }

    COAL: for (1..$b) {
        last if eof;

        if ($lines[-1] =~ /$re/o) {

            goto COAL;
        }
    }
    print "]\n\n";

Quote:
}

__END__

Quote:
>    2) Write a "pataft" program that prints out up to N lines
>       after the match as well as the match itself.  Here's
>       the usage message:

>        pataft [-A N] pattern [files ...]

Again two versions.  The first uses a seek() without checking for errors, so
it's not so robust.  Also, if there's any overlap, it ends up reading the
same lines twice.  It would have been better to keep an array of lines
around, but I'm sufficiently lazy right now not to care.

#!/usr/bin/perl
# pataft [-A N] pattern [files ...]
# Each match by itself.


    ? (shift, shift)[1]
    : 3;

$re = shift;
defined $re or die "usage: pataft [-A N] pattern [files ...]\n";

while (<>) {
    /$re/o or next;
    $tell = tell ARGV;
    print "Match #", ++$i, " [\n$_";
    for (1..$a) {
        last if eof;
        print scalar <>;
    }
    print "]\n\n";
    seek ARGV, $tell, 0;

Quote:
}

__END__

#!/usr/bin/perl
# pataft [-A N] pattern [files ...]
# Will coalesce overlaps.


    ? (shift, shift)[1]
    : 3;

$re = shift;
defined $re or die "usage: pataft [-A N] pattern [files ...]\n";

while (<>) {
    /$re/o or next;
    print "Group #", ++$i, " [\n$_";
    COAL: for (1..$a) {
        last if eof;
        $line = <>;
        print $line;
        goto COAL if $line =~ /$re/o;
    }
    print "]\n\n";

Quote:
}

__END__

Quote:
>    3) Write a "patba" program that prints out up to
>       X lines before the match and Y lines after the match.

>        patba [-A X] [-B T] pattern [files ...]

>       or both N before and after:

>        patba [-C N] pattern [files ...]

#!/usr/bin/perl
# patba [ [ [-A X] [-B Y] ] | [-C N] ] pattern [files ...]
# Each match by itself.


    /A/ ? ($a) :
    /B/ ? ($b) :
    ($a, $b) = (shift) x 2 for shift;

Quote:
}

defined or $_ = 3 for $a, $b;
$b++;
$max = $b > 100 ? $b : 100;

$re = shift;
die "usage: patba [ [ [-A X] [-B Y] ] | [-C N] ] pattern [files ...]\n"
    unless defined $re;

while (<>) {


    /$re/o or next;
    $tell = tell ARGV;

    for (1..$a) {
        last if eof;
        print scalar <>;
    }
    print "]\n\n";
    seek ARGV, $tell, 0;

Quote:
}

__END__

#!/usr/bin/perl
# patba [ [ [-A X] [-B Y] ] | [-C N] ] pattern [files ...]
# Will coalesce overlaps.


    /A/ ? ($a) :
    /B/ ? ($b) :
    ($a, $b) = (shift) x 2 for shift;

Quote:
}

defined or $_ = 3 for $a, $b;
$max = $b > 100 ? $b : 100;

$re = shift;
die "usage: patba [ [ [-A X] [-B Y] ] | [-C N] ] pattern [files ...]\n"
    unless defined $re;

while (<>) {

    unless (/$re/o) {

        next;
    }

    COAL: for (1..$a) {
        last if eof;
        $line = <>;
        print $line;
        goto COAL if $line =~ /$re/o;
    }
    for (1..$b) {
        last if eof;

        if ($lines[-1] =~ /$re/o) {

            goto COAL;
        }
    }
    print "]\n\n";

Quote:
}

__END__

Quote:
>Solutions will be judged upon these overall criteria:

>    Objective:

>      * correctness: does it actually do the right thing?  if not,
>    nothing else matters. :-)
>      * space efficiency: don't use more space than minimally needed
>      * time efficiency: is your solution fast?
>      * test coverage: do you include test data to check all border cases?

They are correct unless they aren't.  When printing the previous N lines,
more space is used than absolutely necessary, but it means I don't have to
shift a perfect array for every line.  I used a modified version of Don
Blaheta's test data (it adds a 'foo' at the very start and very end):

        foo
        a
        b
        foo
        foo
        c
        foo
        d
        e
        f
        g
        foo
        h
        i
        j
        k
        l
        m
        n
        foo
        o
        p
        q
        r
        s
        t
        u
        v
        w
        x
        y
        z
        foo

Quote:
>    Subjective:

>      * conciseness:  keep it short.  no frills.  just a few lines.
>      * clarity: is this understandable to a native perl speaker?
>      * idiom: does this look like natural perl?  does it use cool
>    perl features absent in other languages?
>      * creativity: is your solution cleverly different from those
>    of others?

Some of it is more concise than others, but none measured to just a few
lines.  Those who've walked the desert should have no trouble understanding
the code, though a few may spit on me for using goto().  It's Perl written in
Perl so perhaps it's idiomatic.  I doubt it's very creative or clever though.

--



Tue, 05 Mar 2002 03:00:00 GMT  
 CONTEST: Range Searching

Quote:

>             pataft [-A N] pattern [files ...]

To which I said,

  perl -ne 'print if ($c=0, /pat/) .. $c++ == 3' files

and then I complained that I didn't have a nice extra credit one-liner.
(I'm not following-up to that post since I don't see it).

I should point out that there are still one-liners for this, even if not
as nice.

  perl -ne 'print if /pat/ .. (/pat/ ? $c = 0 : ++$c) == 3' files

I didn't really like this since /pat/ appears in the code twice and is
actually tested twice for some lines.

  perl -ne 'print if ($c=0, /pat/) ... (/pat/ ? $c = 0 : ++$c) == 3' \
  files

/pat/ is only tested once for each line here but now it's just getting
ugly.

--
Rick Delaney



Tue, 05 Mar 2002 03:00:00 GMT  
 
 [ 50 post ]  Go to page: [1] [2] [3] [4]

 Relevant Pages 

1. CONTEST: Range Searching

2. DBLookupComboBox: EDBEngineError 'No Current record'

3. Obfuscated Perl Contest: The Perl Journal contest #2

4. Trees with range searches

5. searching a file in ranges

6. searching for a # within a range

7. Calling CRT windows as child windows of main window

8. Q: How to insert Blob field by using TQuery

9. Btrieve questions...

10. Dbase on WinNT network

11. Help me please

12. using parallel port with pascal

 

 
Powered by phpBB® Forum Software