newscan 2.0 - a Perl Network News Scanner (Part 1 of 4) 


Archive-name: newscan/part01

#!/bin/sh
# This is newscan, a shell archive (produced by shar 3.52.3)
# To extract the files from this archive, save it to a file, remove
# everything above the "!/bin/sh" line above, and type "sh file_name".
#

# Source directory /home/jfm/NEWSCAN
#
# existing files will NOT be overwritten unless -c is specified
#
# This is part 1 of a multipart archive                                    
# do not concatenate these parts, unpack them in order with /bin/sh        
#
# This shar contains:
# length  mode       name
# ------ ---------- ------------------------------------------
#   3250 -rw-r--r-- README
#   6193 -rw-r--r-- INSTALL
#    409 -rw-r--r-- Makefile
#  89451 -rwxr-xr-x newscan
#  14378 -rwxr-xr-x xnewscan
#  14706 -rw-r--r-- newscan.man
#  14568 -rw-r--r-- newscan.man.ascii
#   1422 -rw-r--r-- pex.cfg
#   1481 -rw-r--r-- cdrom.cfg
#   1597 -rw-r--r-- extest.cfg
#   1691 -rw-r--r-- windows.cfg
#    436 -rw-r--r-- ppc.cfg
#    800 -rw-r--r-- require.cfg
#   1655 -rw-r--r-- home.cfg
#   1582 -rw-r--r-- vanity.cfg
#   2052 -rw-r--r-- distrib.txt
#  17243 -rw-r--r-- alert.uu
#
touch -am 1231235999 $$.touch >/dev/null 2>&1
if test ! -f 1231235999 && test -f $$.touch; then
  shar_touch=touch
else
  shar_touch=:
  echo 'WARNING: not restoring timestamps'
fi
rm -f 1231235999 $$.touch
#
if test -r _sharseq.tmp; then
  echo 'Must unpack archives in sequence!'
  echo Please unpack part `cat _sharseq.tmp` next
  exit 1
fi
# ============= README ==============
if test -f 'README' && test X"$1" != X"-c"; then
  echo 'x - skipping README (File already exists)'
  rm -f _sharnew.tmp
else
  > _sharnew.tmp
  echo 'x - extracting README (Text)'
  sed 's/^X//' << 'SHAR_EOF' > 'README' &&
X               newscan - a Perl Network News scanner
X                         by John F. McGowan, Ph.D.

X
***********************************************************************
X
COPYRIGHT NOTICE:
X
X       This note is Copyright (C) 1993, 1994, 1995 by John F. McGowan.
Permission to reproduce and distribute is granted however.  
Additional information may be appended after the END OF ORIGINAL README
line.
X
X
Description:
X
X       newscan is an attempt to solve the information overload
problem in the Network News groups by scanning news groups for
articles that contain matches to regular expressions.  newscan can
also exclude articles in news groups that contain matches to regular
expressions.
X
X       newscan is written in Larry Wall's Perl (Practical Extraction and
Report Language).  Perl is described in Programming Perl by Larry Wall and
Randal L. Schwartz.  It is available for virtually all Unix systems.  Perl
is free but copyrighted by Larry Wall.
X
X       newscan contains Perl comments, a short help message generated by
entering % newscan -h, and an embedded manpage following the convention
described in Programming Perl.
X
X       The newscan distribution also contains xnewscan.  xnewscan is
a Tcl/Tk script that provides a graphical user interface for running
newscan.  xnewscan should run on any machine supporting the wish
windowing shell.
X
AUTHOR:
-------
X
X       newscan's author is John McGowan who can be reached at either

bug reports to the author.
X
X
SYSTEMS:
---------
Should work on Unix systems.
X
X
DEPENDENCIES:
--------------
Needs Perl to run.
xnewscan needs Tcl/Tk and wish (Windowing Shell) to run.
X
PACKING LIST:
---------------
README                   - this README file
INSTALL                  - Installation instructions for newscan
newscan                  - Perl shell script
Makefile                 - Makefile for building Perl script
xnewscan                 - Tcl/Tk script for graphical interface
newscan.man              - standalone nroff newscan manpage (for Linux)
newscan.man.ascii        - ascii output from running nroff on newscan.man
X                           (for Linux)
cdrom.cfg                - a sample configuration file
pex.cfg                  - a sample configuration file
extest.cfg               - a sample configuration file
X                           with wildcard * expansion
windows.cfg              - a sample configuration file
ppc.cfg                  - a sample configuration file
require.cfg              - a sample configuration file
X                           illustrates use of REQUIRE and WORD keywords
home.cfg                 - sample configuration file
X                           illustrates use of leading ~ in filespec
vanity.cfg               - sample configuration file
X                           illustrates use of leading ~ in filespec
distrib.txt              - how and where to post newscan to Net
alert.uu                 - uuencoded sound file used by xnewscan
X
STANDARD DISCLAIMER:
X
X       newscan is distributed AS IS.  There is NO WARRANTY express or
implied that it will work correctly, do what you want, or anything else.
Use at your own risk.
X
X
------------------------>END OF ORIGINAL README<-----------------------
SHAR_EOF
  $shar_touch -am 0312143195 'README' &&
  chmod 0644 'README' ||
  echo 'restore of README failed'
  shar_count="`wc -c < 'README'`"
  test 3250 -eq "$shar_count" ||
    echo "README: original size 3250, current size $shar_count"
  rm -f _sharnew.tmp
fi
# ============= INSTALL ==============
if test -f 'INSTALL' && test X"$1" != X"-c"; then
  echo 'x - skipping INSTALL (File already exists)'
  rm -f _sharnew.tmp
else
  > _sharnew.tmp
  echo 'x - extracting INSTALL (Text)'
  sed 's/^X//' << 'SHAR_EOF' > 'INSTALL' &&
X                Installation Instructions for NEWSCAN
X                   A Network News Article Scanner
X                     by John F. McGowan, Ph.D.
X
----------------------------------------------------------------------------
X
Install Topics
X
1.  Make sure you have Perl / Correct version of Perl.
1.1 How to get Perl 4.036
2.  Finding the Perl interpreter on your system.
3.  Installing newscan manpage (on-line help)
3.1 Installing newscan manpage on Linux.
4.  Unix OS flavor dependency problems (SOCKET.PH).
5.  Determining the NNTP server for your system.
6.  Installing and using xnewscan
7.  Where to get Tcl/Tk
X
X
1.  You will need Larry Wall's Perl on your system.  newscan works with
X    Perl 4.036.  newscan is known to fail with Perl 4.019.  Perl 4.019
X    will report a syntax error in the newscan perl script.  This appears
X    to be a bug in the perl 4.019 interpreter.  perl 4.036 has no problem
X    parsing the newscan perl script.  You can
X    find out your version of Perl by typing
X
X       % perl -v
X      
X       The 36 in 4.036 is the patch level reported by perl -v.
X
X
X    NOTE on Perl 5.0
X    -------------------
X    Perl 5.0 represents the recent major release of Perl after 4.036.
X    Perl 5.0 is a major rewrite of the Perl code.  newscan has not been
X    tested with Perl 5.0.
X
X
1.1 How to get Perl 4.036
X
X       Perl is available by anonymous ftp from jpl-devvax.jpl.nasa.gov.

source distribution for Perl in files perl.kitnn.Z where nn = 1..44  Patches
for upgrading your version of perl can be found in pub/perl.4.0/patches
subdirectory.  Perl 5.000 is also available at this site.
X
X       Perl is available at many other sites as well.
X
2.  By default, the first line of the newscan script:
X
X       #!/usr/bin/perl
X
X       assumes perl is located in the /usr/bin directory.
X
X       This is true on the author's system, but may not be true on your
system.  You may need to edit the path to perl.  For example,
X
X       #!/usr/local/bin/perl
X
X       is a common path for perl.
X
X       To find perl on your system, use the Unix which command:
X
X       % which perl
X       /usr/local/bin/perl
X
X       which returns the path to perl.
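
As a rough sketch of the same check in Bourne shell (the helper name
shebang_for is made up for this note and is not part of newscan), the
following prints the "#!" line that matches wherever an interpreter
is found in your PATH:

```shell
# shebang_for NAME: print a "#!" line for the interpreter NAME, using
# the first match in PATH.  Hypothetical helper, not shipped with newscan.
shebang_for() {
    # command -v fails (non-zero status) when NAME is not in PATH
    interp_path=$(command -v "$1") || return 1
    printf '#!%s\n' "$interp_path"
}
```

Running "shebang_for perl" and comparing the result with the output of
"head -1 newscan" should show whether the first line needs editing.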
X
X
3.      Man Page  (Unix On Line Help)
X
X       The newscan Perl script doubles as a manpage.  The script is
X       written in such a way that the Perl source acts as an nroff
X       comment.  Likewise, perl ignores the nroff manpage embedded in
X       the script.  To install the manpage:
X
X       % cp newscan newscan.1
X       % mv newscan.1 /usr/man/man1    # for example
X
X       This procedure works with the nroff supplied on AIX and Sun
X       systems to my certain knowledge.  It does not work with the
X       groff emulation of nroff on Linux.  Please see the next section
X       if you are using Linux.
X
3.1     Installing Man Page on Linux (Unix On Line Help)
X
X       The embedded manpage described in section 3. above DOES NOT WORK
X        on LINUX systems.  For some reason the groff emulation of nroff
X        provided with most Linux systems cannot handle the nroff directives
X        embedded in newscan.
X
X       For this reason, the newscan distribution now includes a
X       file newscan.man which contains the nroff source for the
X        newscan manpage.
X
X       % cp newscan.man newscan.1
X       % mv newscan.1 /usr/man/man1  # for example
X
X       The distribution also includes newscan.man.ascii which is the
X       straight ASCII output of running groff emulating nroff on
X       newscan.man.
X
4.      Machine or Unix Flavor Dependencies  (SOCKET.PH)
X
X       newscan needs the sys/socket.ph Perl header file, found in the sys
X       subdirectory of the Perl library directory, to smooth over
X       differences in socket calls between the BSD (Berkeley) and SVR4
X       (System V, Release 4) flavors of Unix.  If Perl is installed
X       correctly at your site, this is not a problem.  newscan will
X       complain if it cannot find sys/socket.ph; in that case, complain
X       to your Perl caretaker.
X
X       socket.ph can be created from the socket.h C header file using
X        the h2ph utility which comes with perl.
X
X       Quick Fix (if socket.ph file is missing)
X
X               Comment out line
X
X                       require "sys/socket.ph";
X
X               and change lines
X
X               $AF_INET = &AF_INET;
X               $SOCK_STREAM = &SOCK_STREAM;
X
X               to
X
X               $AF_INET = 2;
X
X               and
X
X               for BSD Systems (SunOS, SGI Irix 4, also AIX)
X
X               $SOCK_STREAM = 1;
X
X               for Unix SVR4 Systems (Solaris, SGI Irix 5, Unixware,...)
X
X               $SOCK_STREAM = 2;
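
If you are unsure which value applies, the choice above can be sketched
as a small shell function.  This is only an illustration: the function
name and the uname-style strings it matches are assumptions made for
this note, and only the BSD = 1 / SVR4 = 2 split comes from the quick
fix itself.

```shell
# sock_stream_for SYSTEM RELEASE: print the SOCK_STREAM value suggested
# by the quick fix.  The system/release strings matched here are assumed
# to look like the output of "uname -s" and "uname -r".
sock_stream_for() {
    case "$1 $2" in
        "SunOS 4."*|"IRIX 4."*|"AIX "*) echo 1 ;;  # BSD-style sockets
        "SunOS 5."*|"IRIX 5."*)         echo 2 ;;  # SVR4 (Solaris is SunOS 5.x)
        *) echo "unknown system: $1 $2" >&2; return 1 ;;
    esac
}
```

A typical call would be: sock_stream_for "`uname -s`" "`uname -r`".
Systems not listed fall through to the error branch on purpose, so
check your socket.h by hand in that case.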
X
X
5. Determining your NNTP (News) Server
X
X       (a) cat /usr/local/lib/rn/server
X            This file contains NNTP server address for rn and
X           trn news readers.
X
X       (b) Some systems define environment variable NNTPSERVER to contain
X            the nntp server address.  newscan will check this variable
X            and use it *IF* the NNTP line is omitted from
X            the newscan configuration
X            file.  The NNTP line explicitly tells newscan which NNTP (News)
X            Server to attach to.
X
X       (c) Ask someone.
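
Steps (a) and (b) can be tried together with a few lines of shell.
This is a sketch under assumptions: the function name is invented for
this note, and checking NNTPSERVER before the rn/trn server file is a
choice made here, not something newscan itself is documented to do.

```shell
# guess_nntp_server [FILE]: print a likely NNTP server name.
# FILE defaults to the rn/trn server file mentioned in (a).
guess_nntp_server() {
    server_file=${1:-/usr/local/lib/rn/server}
    if [ -n "${NNTPSERVER:-}" ]; then
        # (b) environment variable set by some systems
        printf '%s\n' "$NNTPSERVER"
    elif [ -r "$server_file" ]; then
        # (a) assume the server name is on the first line of the file
        head -n 1 "$server_file"
    else
        # (c) nothing found: ask someone
        echo 'no NNTP server found; ask your news administrator' >&2
        return 1
    fi
}
```

The name printed can then be placed on the NNTP line of a newscan
configuration file.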
X
X
6. Installing and using xnewscan
X
X       xnewscan is a Tcl/Tk script that provides a graphical user interface
for newscan.  Tcl (pronounced "tickle") is a scripting language developed by
Dr. John Ousterhout and his students at Berkeley.  Tk is an extension to
Tcl.  Tk is a toolkit for the X Window System.  Tk adds commands to Tcl
to build Motif-like user interfaces.
X
X       xnewscan requires the wish windowing shell for running Tcl/Tk.
By default, the first line of xnewscan is
X
X       #!/usr/bin/wish -f
X
X       If wish is located somewhere other than /usr/bin, you will need
to edit this line.  For example,
X
X       #!/usr/local/bin/wish -f
X
X       Once xnewscan is installed, simply type
X
X       % xnewscan
X
X       This will pop up the graphical interface.  xnewscan expects
the newscan script (using the name newscan) to be installed somewhere in
your PATH.  xnewscan uses the term QUERY for newscan configuration files.
X
X       Use xnewscan to select queries (configuration files), run the
searches, view results (the articles found by search), and view statistics
on the search.
X
X
7. Where to get Tcl/Tk
X
X       The home ftp site for the Tcl source code is
X
X               ftp.cs.berkeley.edu
X
X       The home ftp site for the Tk source code is
X
X               ftp.cs.berkeley.edu
X
X
X       For Linux users, Tcl/Tk and wish are frequently bundled with Linux
X       distributions.
X
X       There is also a comp.lang.tcl USENET newsgroup and a Tcl Frequently
Asked Questions (FAQ).
X
---------------------END OF FILE------------------------------------------
SHAR_EOF
  $shar_touch -am 0312143195 'INSTALL' &&
  chmod 0644 'INSTALL' ||
  echo 'restore of INSTALL failed'
  shar_count="`wc -c < 'INSTALL'`"
  test 6193 -eq "$shar_count" ||
    echo "INSTALL: original size 6193, current size $shar_count"
  rm -f _sharnew.tmp
fi
# ============= Makefile ==============
if test -f 'Makefile' && test X"$1" != X"-c"; then
  echo 'x - skipping Makefile (File already exists)'
  rm -f _sharnew.tmp
else
  > _sharnew.tmp
  echo 'x - extracting Makefile (Text)'
  sed 's/^X//' << 'SHAR_EOF' > 'Makefile' &&
install: newscan.au
X
all: newscan newscan.au newscan.ps
X
submit: newscan.shar
X
newscan.shar: newscan newscan.list

-o newscan.shar -L50 -S < newscan.list
X
newscan: header nscan.y
X       byacc -P nscan.y


X
newscan.au: alert.uu
X       uudecode alert.uu
X
newscan.ps: newscan.dvi

X
newscan.dvi: newscan.tex
X      {*filter*}$<
SHAR_EOF
  $shar_touch -am 0311123295 'Makefile' &&
  chmod 0644 'Makefile' ||
  echo 'restore of Makefile failed'
  shar_count="`wc -c < 'Makefile'`"
  test 409 -eq "$shar_count" ||
    echo "Makefile: original size 409, current size $shar_count"
  rm -f _sharnew.tmp
fi
# ============= newscan ==============
if test -f 'newscan' && test X"$1" != X"-c"; then
  echo 'x - skipping newscan (File already exists)'
  rm -f _sharnew.tmp
else
  > _sharnew.tmp
  echo 'x - extracting newscan (Text)'
  sed 's/^X//' << 'SHAR_EOF' > 'newscan' &&
#!/usr/bin/perl
'di';
'ig00';

#define YYBYACC 1
$NNTP=257;
$SELECT=258;
$MBOX=259;
$COLLECT=260;
$STATISTICS=261;
$IN=262;
$ON=263;
$SEARCH=264;
$UNLESS=265;
$PATTERNS=266;
$PAIRS=267;
$REGEXP=268;
$IPADR=269;
$FQDN=270;
$PORT=271;
$NEWSGROUP=272;
$FILESPEC=273;
$LIST=274;
$VETO=275;
$DESELECT=276;
$WHERE=277;
$REQUIRE=278;
$RANGE=279;
$WORD=280;
$WORDSTEM=281;
$WORDEXP=282;
$PHRASE=283;
$OUTLINE=284;
$MBOXFORMAT=285;
$FORMAT=286;

X    0,    0,    1,    1,    1,    1,    1,    1,    1,    1,
X    1,    1,    1,    1,    1,    1,    1,    1,    1,    1,
X    1,    1,    1,    4,    4,    4,    4,    4,    5,    5,
X    2,    3,    3,

X    0,    2,    1,    3,    3,    4,    4,    3,    3,    3,
X    3,    3,    3,    3,    5,    5,    6,    6,    6,    4,
X    4,    3,    2,    2,    2,    2,    2,    1,    2,    1,
X    1,    1,    2,

X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,   31,    2,    3,    0,    0,   32,    0,    0,
X    0,   28,    0,    0,    0,    0,    0,    0,   23,    0,
X    0,    0,    0,    0,    4,    0,    5,   33,    8,   13,
X    0,    0,   24,   25,    0,   26,   27,   12,    0,   22,
X    9,   10,   11,   14,    6,    7,    0,    0,    0,   20,
X    0,    0,   29,   21,   15,    0,    0,   16,    0,   17,
X   18,   19,

X   14,   15,   19,   27,   46,

X  -10, -260, -265, -262, -247, -263,  -39,   12, -265, -263,
X -263, -255,    0,    0,    0,   -8,   -2,    0,   -7,   12,
X -224,    0, -240, -239, -236, -236,   12,   -9,    0,   -7,
X   12,   12,   12,   12,    0,   12,    0,    0,    0,    0,
X -238, -231,    0,    0, -236,    0,    0,    0,   12,    0,
X    0,    0,    0,    0,    0,    0,   12, -219, -218,    0,
X   12, -217,    0,    0,    0,   12,   12,    0,   12,    0,
X    0,    0,

X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,   40,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,

X    0,   -4,   42,   30,  -20,
);

X   13,   13,   13,   29,   22,   47,   18,   13,   16,   17,
X   20,   35,   37,   21,   39,   40,   23,   24,   28,   25,
X   26,   13,   48,   50,   63,   51,   52,   53,   54,   55,
X   33,   56,   58,   59,   57,   60,   61,   41,   42,   31,
X   32,   43,   44,   62,   64,   45,   66,   67,   69,   30,
X   30,    0,   65,    0,    0,    0,   68,    0,    0,    0,
X    0,   70,   71,    0,   72,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
X    0,    0,    0,    0,    0,    0,    2,    3,    4,    5,
X    0,    0,    0,    0,    6,    0,    0,    0,    0,    0,
X    0,    7,   34,    8,   38,    9,   10,   11,   36,   49,
X    0,    0,    0,    0,   12,

X   10,   10,   10,    8,  268,   26,  272,   10,  269,  270,
X  273,   16,   17,  261,   19,   20,  280,  281,   58,  283,
X  284,   10,   27,   28,   45,   30,   31,   32,   33,   34,
X  286,   36,  264,  265,  273,  267,  268,  262,  263,   10,
X   11,  282,  282,  275,   49,  282,  266,  266,  266,   10,
X    9,   -1,   57,   -1,   -1,   -1,   61,   -1,   -1,   -1,
X   -1,   66,   67,   -1,   69,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,   -1,
X   -1,   -1,   -1,   -1,   -1,   -1,  257,  258,  259,  260,
X   -1,   -1,   -1,   -1,  265,   -1,   -1,   -1,   -1,   -1,
X   -1,  272,  271,  274,  272,  276,  277,  278,  271,  279,
X   -1,   -1,   -1,   -1,  285,
);
$YYFINAL=1;
#ifndef YYDEBUG
#define YYDEBUG 0
#endif
$YYMAXTOKEN=286;

"end-of-file",'','','','','','','','','',"'\\n'",'','','','','','','','',
'','','','','','','','','','','','',
'','','','','','','','','','','','','','','','','','','','','','','','',
'','','',"':'",'','','','','','','','','','',
'','','','','','','','','','','','','','','','','','','','','','','','',
'','','','','','','','','','','','','','','','',
'','','','','','','','','','','','','','','','','','','','','','','','',
'','','','','','','','','','','','','','','','',
'','','','','','','','','','','','','','','','','','','','','','','','',
'','','','','','','','','','','','','','','','',
'','','','','','','','','','','','','','','','','','','','','','','','',
'','','','','','','','','','','','','','','','',
'','','','','','','','','','','','','','','','','','','','','','','','',
'','','','',"NNTP","SELECT","MBOX",
"COLLECT","STATISTICS","IN","ON","SEARCH","UNLESS",
"PATTERNS","PAIRS","REGEXP",
"IPADR","FQDN","PORT","NEWSGROUP","FILESPEC",
"LIST","VETO","DESELECT","WHERE",
"REQUIRE","RANGE","WORD","WORDSTEM","WORDEXP",
"PHRASE","OUTLINE","MBOXFORMAT",
"FORMAT",

"\$accept : stmt_list",
"stmt_list :",
"stmt_list : stmt_list statement",
"statement : terminator",
"statement : NNTP IPADR terminator",
"statement : NNTP FQDN terminator",
"statement : NNTP IPADR PORT terminator",
"statement : NNTP FQDN PORT terminator",
"statement : SELECT newsgroup_list terminator",
"statement : DESELECT newsgroup_list terminator",
"statement : WHERE search_expr terminator",
"statement : REQUIRE search_expr terminator",
"statement : UNLESS search_expr terminator",
"statement : MBOX FILESPEC terminator",
"statement : MBOXFORMAT FORMAT terminator",
"statement : COLLECT STATISTICS IN FILESPEC terminator",
"statement : COLLECT STATISTICS ON REGEXP terminator",
"statement : COLLECT STATISTICS ON SEARCH PATTERNS terminator",
"statement : COLLECT STATISTICS ON UNLESS PATTERNS terminator",
"statement : COLLECT STATISTICS ON VETO PATTERNS terminator",
"statement : COLLECT STATISTICS ON PAIRS",
"statement : NEWSGROUP ':' RANGE terminator",
"statement : NEWSGROUP ':' terminator",
"statement : LIST terminator",
"search_expr : WORD WORDEXP",
"search_expr : WORDSTEM WORDEXP",
"search_expr : PHRASE word_list",
"search_expr : OUTLINE word_list",
"search_expr : REGEXP",
"word_list : WORDEXP word_list",
"word_list : WORDEXP",
"terminator : '\\n'",
"newsgroup_list : NEWSGROUP",
"newsgroup_list : newsgroup_list NEWSGROUP",
);
#endif
sub yyclearin { $yychar = -1; }
sub yyerrok { $yyerrflag = 0; }
$YYSTACKSIZE = $YYSTACKSIZE || $YYMAXDEPTH || 500;
$YYMAXDEPTH = $YYMAXDEPTH || $YYSTACKSIZE || 500;
$yyss[$YYSTACKSIZE] = 0;
$yyvs[$YYSTACKSIZE] = 0;
sub YYERROR { ++$yynerrs; &yy_err_recover; }
sub yy_err_recover
{
X  if ($yyerrflag < 3)
X  {
X    $yyerrflag = 3;
X    while (1)
X    {
X      if (($yyn = $yysindex[$yyss[$yyssp]]) &&
X          ($yyn += $YYERRCODE) >= 0 &&
X          $yycheck[$yyn] == $YYERRCODE)
X      {
#if YYDEBUG
X       print "yydebug: state $yyss[$yyssp], error recovery shifting",
X             " to state $yytable[$yyn]\n" if $yydebug;
#endif
X        $yyss[++$yyssp] = $yystate = $yytable[$yyn];
X        $yyvs[++$yyvsp] = $yylval;
X        next yyloop;
X      }
X      else
X      {
#if YYDEBUG
X        print "yydebug: error recovery discarding state ",
X              $yyss[$yyssp], "\n"  if $yydebug;
#endif
X        return(1) if $yyssp <= 0;
X        --$yyssp;
X        --$yyvsp;
X      }
X    }
X  }
X  else
X  {
X    return (1) if $yychar == 0;
#if YYDEBUG
X    if ($yydebug)
X    {
X      $yys = '';
X      if ($yychar <= $YYMAXTOKEN) { $yys = $yyname[$yychar]; }
X      if (!$yys) { $yys = 'illegal-symbol'; }
X      print "yydebug: state $yystate, error recovery discards ",
X            "token $yychar ($yys)\n";
X    }
#endif
X    $yychar = -1;
X    next yyloop;
X  }
0;

} # yy_err_recover

X
sub yyparse
{
#ifdef YYDEBUG
X  if ($yys = $ENV{'YYDEBUG'})
X  {
X    $yydebug = int($1) if $yys =~ /^(\d)/;
X  }
#endif
X
X  $yynerrs = 0;
X  $yyerrflag = 0;
X  $yychar = (-1);
X
X  $yyssp = 0;
X  $yyvsp = 0;
X  $yyss[$yyssp] = $yystate = 0;
X
yyloop: while(1)
X  {
X    yyreduce: {
X      last yyreduce if ($yyn = $yydefred[$yystate]);
X      if ($yychar < 0)
X      {
X        if (($yychar = &yylex) < 0) { $yychar = 0; }
#if YYDEBUG
X        if ($yydebug)
X        {
X          $yys = '';
X          if ($yychar <= $#yyname) { $yys = $yyname[$yychar]; }
X          if (!$yys) { $yys = 'illegal-symbol'; };
X          print "yydebug: state $yystate, reading $yychar ($yys)\n";
X        }
#endif
X      }
X      if (($yyn = $yysindex[$yystate]) && ($yyn += $yychar) >= 0 &&
X              $yycheck[$yyn] == $yychar)
X      {
#if YYDEBUG
X        print "yydebug: state $yystate, shifting to state ",
X              $yytable[$yyn], "\n"  if $yydebug;
#endif
X        $yyss[++$yyssp] = $yystate = $yytable[$yyn];
X        $yyvs[++$yyvsp] = $yylval;
X        $yychar = (-1);
X        --$yyerrflag if $yyerrflag > 0;
X        next yyloop;
X      }
X      if (($yyn = $yyrindex[$yystate]) && ($yyn += $yychar) >= 0 &&
X            $yycheck[$yyn] == $yychar)
X      {
X        $yyn = $yytable[$yyn];
X        last yyreduce;
X      }
X      if (! $yyerrflag) {
X        &yyerror('syntax error');
X        ++$yynerrs;
X      }
X      return(1) if &yy_err_recover;
X    } # yyreduce
#if YYDEBUG
X    print "yydebug: state $yystate, reducing by rule ",
X          "$yyn ($yyrule[$yyn])\n"  if $yydebug;
#endif
X    $yym = $yylen[$yyn];
X    $yyval = $yyvs[$yyvsp+1-$yym];
X    switch:
X    {
if ($yyn == 4) {
#line 22 "nscan.y"
{ $label = NNTP;
X         $them = $yyvs[$yyvsp-1];
last switch;
} }

if ($yyn == 5) {
#line 25 "nscan.y"
{ $label = NNTP;
X         $them = $yyvs[$yyvsp-1];
last switch;
} }

if ($yyn == 6) {
#line 28 "nscan.y"
{ $label = NNTP;
X         $them = $yyvs[$yyvsp-2];
X         $port = $yyvs[$yyvsp-1];
last switch;
} }

if ($yyn == 7) {
#line 32 "nscan.y"
{ $label = NNTP;
X         $them = $yyvs[$yyvsp-2];
X         $port = $yyvs[$yyvsp-1];
last switch;
} }

if ($yyn == 8) {
#line 36 "nscan.y"
{


X    {
X       $range{$group} =
X         '' unless $range{$group}; # blank range
X    }

X
last switch;
} }

if ($yyn == 9) {
#line 46 "nscan.y"
{


X
last switch;
} }

if ($yyn == 10) {
#line 51 "nscan.y"
{

# $2 is value of search expression


X    {                          #
# build Perl regular expression for case insensitive word search
X      
X       if($pattern{$Group})
X       {                       #
X           $pattern{$Group} =
X             join("\034", $pattern{$Group}, $yyvs[$yyvsp-1]);
X       }                       #
X       else
X       {
X           $pattern{$Group} = $yyvs[$yyvsp-1];
X       }
X    } # close loop over selected groups
X
X
last switch;
} }

if ($yyn == 11) {
#line 72 "nscan.y"
{


X    {
X       if($required{$Group})
X       {
X           $required{$Group} =
X             join("\034", $required{$Group}, $yyvs[$yyvsp-1]);
X       }
X       else
X       {
X           $required{$Group} = $yyvs[$yyvsp-1];
X       }
X    }

X
last switch;
} }

if ($yyn == 12) {
#line 89 "nscan.y"
{


X    {
X       if($veto{$Group})
X       {
X           $veto{$Group} =
X             join("\034", $veto{$Group}, $yyvs[$yyvsp-1]);
X       }
X       else
X       {
X           $veto{$Group} = $yyvs[$yyvsp-1];
X       }
X    }

X
X
last switch;
} }

if ($yyn == 13) {
#line 107 "nscan.y"
{
X    $mbox = $yyvs[$yyvsp-1];
X    $mbox =~ s/^~/$ENV{'HOME'}/;
# translate leading tilde to HOME directory
X    $outFile = $yyvs[$yyvsp-1];
X
last switch;
} }

if ($yyn == 14) {
#line 114 "nscan.y"
{
#
# this concept to add support for MMDF courtesy of
# Tim O'Malley
#
X    $MboxFormat = $yyvs[$yyvsp-1];
X    die "$0: Invalid MBOXFORMAT $MboxFormat in configuration file."
X      unless $MboxFormat =~ /unix|elm|mmdf/io;
X
last switch;
} }

if ($yyn == 15) {
#line 124 "nscan.y"
{
X               $doCollect = $TRUE;
X               $statFile = $yyvs[$yyvsp-1];
X               $statFile =~ s/^\~/$ENV{'HOME'}/;
X      
last switch;
} }

if ($yyn == 16) {
#line 130 "nscan.y"
{
X               $doCollect = $TRUE;
X               $statistics{$yyvs[$yyvsp-1]} = 0; # initialize pattern histogram
X      
last switch;
} }

if ($yyn == 17) {
#line 135 "nscan.y"
{
X               $doCollect = $TRUE;
X               $doSearchPatterns = $TRUE;
X      
last switch;
} }

if ($yyn == 18) {
#line 140 "nscan.y"
{
X               $doCollect = $TRUE;
X               $doVetoPatterns = $TRUE;
X      
last switch;
} }

if ($yyn == 19) {
#line 145 "nscan.y"
{
X           $doCollect = $TRUE;
X           $doVetoPatterns = $TRUE;
X      
last switch;
} }

if ($yyn == 20) {
#line 150 "nscan.y"
{
X               $doCollect = $TRUE;
X               $doPairs = $TRUE;
X               %pairs = ();
X      
last switch;
} }

if ($yyn == 21) {
#line 156 "nscan.y"
{
X               $range{$yyvs[$yyvsp-3]} = $yyvs[$yyvsp-1];
X      
last switch;
} }

if ($yyn == 22) {
#line 160 "nscan.y"
{
X               $range{$yyvs[$yyvsp-2]} = '';
X      
last switch;
} }

if ($yyn == 23) {
#line 164 "nscan.y"
{
X               print "NNTP Server is $them \n";
X               print "Port is $port \n";
X                print "Statistics file is $statFile \n";
X               print "doCollect is $readable{$doCollect} \n";
X               print "doSearchPatterns is $readable{$doSearchPatterns} \n";
X               print "doVetoPatterns is $readable{$doVetoPatterns} \n";
X               print "doPairs is $readable{$doPairs} \n";





X               print "Patterns are ", %pattern, " \n";
X               print "Required patterns are ", %required, " \n";
X               print "Vetoed patterns are ", %veto, " \n";
X               print %range, "\n";
X      
last switch;
} }

if ($yyn == 24) {
#line 186 "nscan.y"
{
X    $newpat = '/\b' . $yyvs[$yyvsp-0] . '\b/i';
# translate word to Perl regular expr
X    $yyval = $newpat;
X
last switch;
} }

if ($yyn == 25) {
#line 192 "nscan.y"
{
X    $newpat = '/\b' . $yyvs[$yyvsp-0] . '\w*/i';
# translate word stem
# to Perl regular expr
X    $yyval = $newpat;
X
last switch;
} }

if ($yyn == 26) {
#line 199 "nscan.y"
{
# build search expression using Perl regular expressions


X  $yyval = $newpat;
X
last switch;
} }

if ($yyn == 27) {
#line 206 "nscan.y"
{
# an outline is a sequence of words that may contain intervening
# words.  For example, the outline "the red flower" would match
# "the red flower", "the beautiful red flower",
# "the red hibiscus flower" etc.
#


X  $yyval = $newpat;
X
last switch;
} }

if ($yyn == 28) {
#line 217 "nscan.y"
{
X        $yyval = $yyvs[$yyvsp-0];              # straight Perl regular expression
X    
last switch;
} }

if ($yyn == 29) {
#line 223 "nscan.y"
{ $yyval = join("\034", $yyvs[$yyvsp-1], $yyvs[$yyvsp-0]);
last switch;
} }

if ($yyn == 30) {
#line 225 "nscan.y"
{
X    $yyval = $yyvs[$yyvsp-0];
X
last switch;
} }

if ($yyn == 32) {
#line 233 "nscan.y"

last switch;
} }

if ($yyn == 33) {
#line 235 "nscan.y"

last switch;
} }

#line 617 "y.tab.pl"
X    } # switch
X    $yyssp -= $yym;
X    $yystate = $yyss[$yyssp];
X    $yyvsp -= $yym;
X    $yym = $yylhs[$yyn];
X    if ($yystate == 0 && $yym == 0)
X    {
#if YYDEBUG
X      print "yydebug: after reduction, shifting from state 0 ",
X            "to state $YYFINAL\n" if $yydebug;
#endif
X      $yystate = $YYFINAL;
X      $yyss[++$yyssp] = $YYFINAL;
X      $yyvs[++$yyvsp] = $yyval;
X      if ($yychar < 0)
X      {
X        if (($yychar = &yylex) < 0) { $yychar = 0; }
#if YYDEBUG
X        if ($yydebug)
X        {
X          $yys = '';
X          if ($yychar <= $#yyname) { $yys = $yyname[$yychar]; }
X          if (!$yys) { $yys = 'illegal-symbol'; }
X          print "yydebug: state $YYFINAL, reading $yychar ($yys)\n";
X        }
#endif
X      }
X      return(0) if $yychar == 0;
X      next yyloop;
X    }
X    if (($yyn = $yygindex[$yym]) && ($yyn += $yystate) >= 0 &&
X        $yyn <= $#yycheck && $yycheck[$yyn] == $yystate)
X    {
X        $yystate = $yytable[$yyn];
X    } else {
X        $yystate = $yydgoto[$yym];
X    }
#if YYDEBUG
X    print "yydebug: after reduction, shifting from state ",
X        "$yyss[$yyssp] to state $yystate\n" if $yydebug;
#endif
X    $yyss[++$yyssp] = $yystate;
X    $yyvs[++$yyvsp] = $yyval;
X  } # yyloop
} # yyparse

#line 239 "nscan.y"
X
sub yylex
{
X
lexloop:
X    {
X       # get a line of input
X       if($line eq '')
X       {
X       #    &prompt;
X           $line = $config[$yyi];
X           $yyi++;             # increment pointer to line in config file
X           if(!$line || $line eq '')
X           {
X             return(0);
X           }
X       }
X
X       $line =~ s/^[ \t\f\r]*(.|\n)//
X           || next lexloop;
X
X       local($char) = $1;
X
X       if($char eq '#')
X       {
X           $line = "\n";
X           &yylex;
X       }
X       elsif( $yylex_mode == $WORD_MODE && $char =~/^[\w\'\-]/
&& $line =~ s/^([\w\'\-]*)// )
X         {
X           $yylval = $char.$1;
X           $WORDEXP;
X         }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^N/
&& $line =~ s/^NTP// )
X       {
X           $yylex_mode = $NNTP_MODE;
X           $NNTP;
X       }
X       elsif( $yylex_mode == $NNTP_MODE && $char =~ /^[\w\+\-]/
&& $line =~ s/^([\w\+\-]*\.?([\w\+\-\*]+\.?)+)// )
X       {                       # looking for Fully Qualified Domain Name
X           $yylval = $char.$1;
X           $FQDN;
X       }
X       elsif( $yylex_mode == $NNTP_MODE && $char =~ /^\d/
&& $line =~ s/^(\d*\.\d+\.\d+\.\d+)// )
X       {                       # looking for Internet Protocol address
X           $yylval = $char.$1;
X           $IPADR;
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^S/
&& $line =~ s/^ELECT//)
X       {
X           $SELECT;            # select newsgroups to search
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^D/
&& $line =~ s/^ESELECT// )
X       {
X           $DESELECT;          # deselect newsgroups to search
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^[\w\-\+\*\?]/
&& $line =~ s/^([\w\+\-\*\?]*\.([\w\+\-\*\?]+\.?)+)// )
X       {                       # read newsgroup name token
X           $yylval = $char.$1; # can overlap file names or FQDN
X           $NEWSGROUP;         # hence use of ROOT, FILE, and NNTP modes
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^\d/
&& $line =~ s/^(\d*(\-\d+)?(,(\d+(\-\d+)?))*)// )
X       {                       # excluded range token processed here
X           $temp = $char.$line;        #
#           ($yylval,$rest) = split(/ /, $temp, 2);
X           $yylval = $char.$1; #
X           $RANGE;             #
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^W/
&& $line =~ s/^HERE// )
X       {
X           $WHERE;
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^W/
&& $line =~ s/^ORD\b// )
X       {
X           $yylex_mode = $WORD_MODE; # a WORD token can overlap a file name  
X           $WORD;              # hence use of WORD and FILE mode
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^W/
&& $line =~ s/^ORDSTEM\b// )
X       {
X           $yylex_mode = $WORD_MODE; # see comment above
X           $WORDSTEM;
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^P/
&& $line =~ s/^HRASE// )
X       {
X           $yylex_mode = $WORD_MODE; # see comment above
X           $PHRASE;
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^O/
&& $line =~ s/^UTLINE// )
X       {
X           $yylex_mode = $WORD_MODE;
X           $OUTLINE;
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^U/
&& $line =~ s/^NLESS// )
X       {
X           $UNLESS;
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^V/
&& $line =~ s/^ETO// )
X       {
X           $VETO;
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^R/
&& $line =~ s/^EQUIRE// )
X       {
X           $REQUIRE;
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^S/
&& $line =~ s/^EARCH// )
X       {
X           $SEARCH;
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^M/
&& $line =~ s/^BOXFORMAT// )
X       {
X           $yylex_mode = $FMT_MODE;
X           $MBOXFORMAT;
X       }
X       elsif( $yylex_mode == $FMT_MODE && $char =~/^\w/
&& $line =~ s/(\w*)// )
X       {
X           $yylval = $char.$1;
X           $FORMAT;
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^M/
&& $line =~ s/^BOX// )
X       {
X           $yylex_mode = $FILE_MODE;
X           $MBOX;
X       }
X       elsif( $yylex_mode == $FILE_MODE && $char =~ /^[\w\+\/\-\~]/
&& $line =~ s/^([\w\/\+\-\.]*)// )
X       {
X           $yylval = $char.$1;
X           $FILESPEC;
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^C/
&& $line =~ s/^OLLECT// )
X       {
X           $COLLECT;
X       }
X       elsif( $yylex_mode == $ROOT_MODE
&& $char =~ /^I/ && $line =~ s/^N// )
X       {
X           $yylex_mode = $FILE_MODE;
X           $IN;
X       }
X       elsif( $yylex_mode == $ROOT_MODE
&& $char =~ /^O/ && $line =~ s/^N// )
X       {
X           $ON;
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^P/
&& $line =~ s/^ATTERNS// )
X       {
X           $PATTERNS;
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^P/
&& $line =~ s/^AIRS// )
X       {
X           $PAIRS;
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^S/
&& $line =~ s/^TATISTICS// )
X       {
X           $STATISTICS;
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^\//
&& $line =~ s/^(.*\/[igo]?)// )
X       {
X           $yylval = $char.$1;
X           $REGEXP;
X       }
X       elsif( $yylex_mode == $ROOT_MODE && $char =~ /^L/
&& $line =~ s/^IST// )
X       {
X           $LIST;
X       }
X       elsif( $char eq "\n")
# terminator - don't care about mode of scanner
X       {
X           $yylex_mode = $ROOT_MODE; # reset mode of lexical scanner
X           $yylval = $char;
X           ord($char);
X       }
X       else
X       {
X           $yylval = $char;
X           ord($char);
X       }
X    }                          # close lexloop
}                               # end of sub yylex

X
sub yyerror
{
X    $cfgvalid = $FALSE;                # problem in configuration file
X    if ( !$opt_X )             # not running in X Window mode
X    {

X    }
}

#
#       Name: newscan
#       Date: $Date: 1995/03/12 22:34:19 $
#       Version: $Revision: 2.0 $
#       Author: John F. McGowan, Ph.D. ($Author: jfm $)

#
#       Description:
#
#               newscan is a utility to scan netnews for articles that
#       contain matches to regular expressions.  
#       newscan can also exclude articles that contain
#       matches to selected regular expressions.  newscan is written
#       in Perl (Practical Extraction and Report Language).  Perl is
#       available for essentially all Unix systems, as well as some
#       non-Unix systems.  A good source of information on Perl
#       is "Programming Perl" by Larry Wall and Randal L. Schwartz.
#      
#       Articles that contain matches are stored in a file
#       in Unix mailbox format.
#       This file may be read using mail readers such as mail, elm, etc.
#
#               Search is controlled by a resource file .newscanrc (by
#       default) in the user's home directory.  The default file may be
#       overridden through the environment variable
#       NEWSCAN, e.g. setenv NEWSCAN /me/.myrc.
#               newscan is intended to be run as a
#       background job, e.g. newscan &
#       since it takes a while to scan selected newsgroups for articles of
#       interest.
#
X
X
require 'sys/socket.ph';        # handle machine dependencies for sockets
require "getopts.pl";
X
# newscan initialization code
X
#
# newscan can
# play an audio file in Sun/NeXT format to indicate completion of a
# search.  The audio file must be in a directory on the user's PATH.


{
X    $audio_file = $dir . "\/newscan.au";
X    if(-e $audio_file)
X    {
X       last;           # terminate the loop
X    }
}                               # loop over directories in path ends

X
$TRUE = 1;                      # human readable Boolean true
$FALSE = 0;                     # human readable Boolean false
X
$NAME{$TRUE} = 'TRUE';
$NAME{$FALSE} = 'FALSE';
X
$ROOT_MODE = 0;                 # root mode of lexical scanner
$NNTP_MODE = 1;      # NNTP mode (reading an address for NNTP server )
$WORD_MODE = 2;                 # WORD mode (reading a word )
$FILE_MODE = 3;                 # reading a filename
$FMT_MODE = 4;                  # reading a mailbox file format
X
$line = '';                     # buffer used by yylex subroutine
X
$oldTime = time();              # a timer to fire reporting in X server mode
$total_articles_scanned = 0;    # just what it says
$total_articles_found = 0;      # means what it says
X
X
$MboxFormat = "unix";       # default to standard Unix mail mailbox format
X                             # options include ELM and MMDF
X
$yylex_mode = $ROOT_MODE;     # initial mode for the newscan lexical scanner
X
X
&Getopts('c:r:H:X:hesa');
X                       # -c <file> takes argument configuration file specification
X                       # -e indicates edit configuration file before search
X                       # -r <file> invokes mail reader
X                       # -H <hostname> specifies Internet domain name of
X                       #    local host explicitly from command line
X                       # -X <arg> invokes newscan in mode where it is running
X                       #    under an X Window System shell such as xnewscan
X                       # -h is help
X                       # -s is silent mode
X                       # -a audio mode (play audio messages)
if( $opt_H )
{
X    $hostname = $opt_H;
}

X
if( ! $opt_s && ! $opt_X )      # if not silent give status message
{
# extract the newscan version number for vanity message
X    $the_version = '$Revision: 2.0 $';  # use single quotes to avoid
X                                          # evaluation of $Revision as a
X                                          # perl variable
X    ($head, $ver, $rest) = split(/ /, $the_version, 3);
X print "Running newscan newsreader version $ver by John F. McGowan, Ph.D. ";
X # initial message
X    print "\n"; # new line
}



X
if($opt_r)
{
X       if($ENV{'READER'})
X       {
X               $myReader = $ENV{'READER'};
X               $foundReader = 1;  # indicate that reader has been found
X       }
X       else
X       {
# select a reasonable default mail reader - prefers elm if exists
X               $foundReader = 0;

X               {
X                       if(!$foundReader)
X                       {

X                       {
X                         if(-e "$path/$command" && -x "$path/$command")
X                           {
X                             $myReader = $path . "/" . $command;
X                             $foundReader = 1;
X                           }
X                       } # close loop over paths
X                       } # close if not foundReader
X               }
#               $myReader = 'elm';  # use elm mail reader for now
X       }
X
X       if(!$foundReader)
X       {
X          print "newscan problem!  Could not find a mail reader from ",

"Please set the READER environment variable to a reader on your system!\n";
X               die;
X       }
X
#       print "Checking if file $opt_r exists \n";
X
X       if(-e $opt_r)           # file exists
X       {
# some newsgroups may contain binary executables so don't do this for now
#           if(-T $opt_r)       # file is a text file
#           {
#                print "Please wait!  newscan using $myReader mail reader to
# read $opt_r file containing results of a search. \n";  
# let the user know this may take a while
X               exec("$myReader -f $opt_r");
#           }
#           else
#           {
#               die "newscan error: Folder file $opt_r is not a text file!";
#           }
X       }
X       else                    # let user know folder doesn't exist
X       {
X           die "newscan error: Folder of articles $opt_r does not exist!  
Cannot read!";
X       }
X
}

X
if($opt_h)
{
X       print <<"EndOfHelp";
X
newscan -- a network news scanner  ( Version $Revision: 2.0 $ )
X
X       newscan searches selected Internet network news groups
X       for articles that match perl regular expressions.  newscan
X       implements a boolean query key-pattern full-text information
X       retrieval system.  In plain English, this means newscan scans
X       the complete text of each article for matches to various
X       combinations of perl regular expressions.
X
X       Command Line Options:
X
X               -c <file-specification>  
X
X                       This flag selects the configuration file that
X               tells newscan which newsgroups to search and what
X               patterns to search for.  If this option flag is not
X               specified, newscan uses the default configuration file
X               .newscanrc in the user's home directory or specified by
X               the environment variable NEWSCAN.
X
X               -e
X
X                       This flag indicates that the configuration file
X               should be edited before doing the search.  newscan will
X               pop the user into the editor specified by the EDITOR
X               environment variable.  If no configuration file exists,
X               newscan provides a template configuration file that the
X               user should edit.
X
X               -r <file-specification>
X
X                       This flag invokes a mail reader on the mailbox
X               format file specified by the file-specification.  This
X               should be the file containing the articles found by newscan in
X               a search.  The user should set the environment variable
X               READER to his or her favorite mail reader.  Otherwise,
X               newscan will select a mail reader, trying to use elm
X               first if it exists.
X
X                -H <name-of-local-host>
X
X                        This flag explicitly sets the name of the local
X                host.  For example, -H 440.rahul.net
X
X                -s
X
X                        Silent mode.  Suppress progress messages issued by
X                newscan during search.
X
X               -h
X
X                       Output this help message.
X
EndOfHelp
X       exit;
}


'Sep','Oct','Nov','Dec');
X
%DaysInMonth = ( 'Jan', '31',
X                'Feb', '28',  # 29 days in leap years (e.g. 1992)
X                'Mar', '31',
X                'Apr', '30',
X                'May', '31',
X                'Jun', '30',
X                'Jul', '31',
X                'Aug', '31',
X                'Sep', '30',
X                'Oct', '31',
X                'Nov', '30',
X                'Dec', '31',
);

X
%NewsArticle = ();
X
X
# read the configuration file (default to .newscanrc)
if($opt_c)
{
# configuration file is specified at command line.
X       $configFile = $opt_c;
X       $configFile =~ s/^\~/$ENV{'HOME'}/;
}

else
{
X       if($ENV{'NEWSCAN'})
X       {
X               $configFile = $ENV{'NEWSCAN'};
X       }
X       else # default configuration file in user's home directory
X       {
X               $configFile = $ENV{'HOME'} . '/.newscanrc';
X       }
}

# edit the configuration file before start if -e

X
if($ENV{'EDITOR'})
{
X       $myeditor = $ENV{'EDITOR'};
}

else
{
X       $foundEditor = 0;

X       {

X               {
X                       if(-e "$path/$command" && -x "$path/$command")
X                       {
X                               $myeditor = $path . "/" . $command;
X                               $foundEditor = 1;
X                       }
X               }
X               last if $foundEditor;
X       }
X       die "Could not find an editor!\n" unless $foundEditor;
#       $myeditor = 'emacs';  # I am an emacs snob
}

X
if($opt_e)
{
X       if(-e $configFile)
X       {
# do nothing if configuration file already exists
X       }
X       else # file does not exist - provide a form
X       {
X               open(CONFIG,">$configFile");
X               print CONFIG <<"EndOfConfig";
# Template Configuration file for newscan Internet news scanner
#
# Note:  Lines beginning with # are comments.  Remove leading # and
#        edit line if you wish to activate line.
#
# Line following: specify the Internet address of the NNTP
# server for the system.
# The NNTP server is the machine where the Internet news is stored.
# OR this line may be omitted if you use the NNTPSERVER environment variable
# This line takes precedence over NNTPSERVER variable.
NNTP <nntphost>
# specify the mailbox format file for the found articles
MBOX <file-for-found-articles>
# specify the Internet newsgroups to be searched (space delimited list)
SELECT <list of newsgroups>
# retrieve a news article if it contains a match to regular expression
#WHERE /perl regular expression/[i]
# do not retrieve a news article if it contains a match to UNLESS
# regular expression
#UNLESS /perl regular expression/[i]
# a news article must contain a match to a REQUIRE regular expression to
#  be retrieved.
#REQUIRE /perl regular expression/[i]
# collect statistics on search in a file -- to aid in refining
# search criteria
#COLLECT STATISTICS IN <file-specification>
# collect statistics on frequency of articles that match a pattern
#COLLECT STATISTICS ON /perl regular expression/i
# collect statistics on search patterns specified by WHERE and REQUIRE lines
#COLLECT STATISTICS ON SEARCH PATTERNS
# collect statistics on unless patterns specified by UNLESS
#COLLECT STATISTICS ON UNLESS PATTERNS
# collect statistics on frequency of articles that match pairs of patterns
#COLLECT STATISTICS ON PAIRS
#<newsgroup>:<excluded range>
#
EndOfConfig
X               close(CONFIG);
X       }
X       system("$myeditor $configFile");  # edit the configuration file
X       print "Do you wish to do this search (y/n):";
X       $ans = <STDIN>;
X       exit 0 if $ans =~ /[Nn]/;
}

X
X

close(CONFIG);
#
$newgroup = 'NULL';

X
# parse the configuration file
$mbox = "newscanBox";  # default mailbox file for found articles
$port = 119;           # default to 119 as NNTP server port
X
$cfgvalid = $TRUE;              # assume configuration file valid to start
X
&parse(*config);            # parse subroutine - interface to yyparse
X
X
if( $opt_X )
{
#
#  1xx - informative message
#  2xx - Command ok
#  3xx - Command ok so far
#  4xx - Command was correct, but couldn't be performed for some reason
#  5xx - Command unimplemented, or incorrect, or a serious program error
#   occurred
#
X    if($opt_X eq "VERIFY")   # just check that config file is valid
X    {
X       print "100 $NAME{$cfgvalid}\n"; # report validity of
X                                        # configuration file (query)
X       exit;
X    }
X    elsif ($opt_X eq "MBOX" )
X    {
X       print "101 $mbox\n";  # report mailbox format file
X       exit;
X    }
X    elsif ($opt_X eq "STAT")
X    {
X       print "102 $statFile\n"; # report statistics file
X       exit;
X    }
X    elsif ($opt_X eq "QUERY")
X    {
X       print "100 $NAME{$cfgvalid}\n";
X       print "101 $mbox\n";
X       print "102 $statFile\n";
X       exit;
X    }
X    elsif ($opt_X eq "RUN")
X    {
#
# this mode runs newscan as a server sending information to an X windows
# client via a regular pipe
#
X       $| = 1;   # need unbuffered output for
X                  # pipe InterProcess Communication
X       print "103 Running $0\n";
X    }
X    else
X    {
X       print STDERR "501 Unrecognized command\n";
X       exit;
X    }
}

X
X
X
&FixRange(*range);
X
X
if($doCollect)
{
X       $statistics{'ALL'} = 0;  # count of number of articles scanned
X       $statistics{'FOUND'} = 0; # count of number of articles found
}

X
# add search patterns to patterns to collect statistics for
# if requested
if($doSearchPatterns)
{
# append search patterns to statistics

{
X       $statistics{$pattern} = 0;
}
}

X
if($doVetoPatterns)
{
# append veto patterns to statistics

{
X       $statistics{$pattern} = 0;
}
}

X
# open mailbox file to store found newsarticles
SHAR_EOF
  : || echo 'restore of newscan failed'
fi
echo 'End of newscan part 1'
echo 'File newscan is continued in part 2'
echo 2 > _sharseq.tmp
exit 0
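
The OUTLINE comment in the parser action above says an outline is a
sequence of words that may contain intervening words.  A minimal sketch
of that idea in modern Perl follows.  outline_to_regex is a hypothetical
helper written for illustration only; it is not newscan's own
pattern-building code, and the exact pattern newscan generates may differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of newscan): turn an outline such as
# "the red flower" into a regex that tolerates intervening words.
sub outline_to_regex {
    my ($outline) = @_;

    # Escape each outline word so it is matched literally.
    my @words = map { quotemeta } split ' ', $outline;

    # Between two outline words allow whitespace plus zero or more
    # other whitespace-delimited words.
    my $gap  = '\s+(?:\S+\s+)*?';
    my $body = join($gap, @words);

    return qr/\b$body\b/i;
}

my $re = outline_to_regex('the red flower');
for my $text ('the red flower',
              'the beautiful red flower',
              'the red hibiscus flower',
              'red the flower') {
    printf "%-28s %s\n", $text, ($text =~ $re ? 'match' : 'no match');
}
```

The outline words must appear in order; only the gaps between them are
flexible, which is why the last sample string is not expected to match.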
--
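
With -X, newscan reports its state as lines that start with a
three-digit code: 100 for configuration validity, 101 for the mailbox
file, 102 for the statistics file, 103 when running, and 501 for an
unrecognized command.  The sketch below shows one way a client might
split such lines.  parse_status and %CLASS are hypothetical names used
for illustration; how xnewscan actually reads these lines is not shown
in this part of the archive.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Meaning of the first digit, taken from the comment block in newscan.
my %CLASS = (
    1 => 'informative message',
    2 => 'command ok',
    3 => 'command ok so far',
    4 => 'command correct but could not be performed',
    5 => 'command unimplemented, incorrect, or serious program error',
);

# Hypothetical helper: split one status line into code, class and text.
# Returns a hash reference, or undef if the line is not a status line.
sub parse_status {
    my ($line) = @_;
    chomp $line;
    return unless $line =~ /^([1-5])(\d\d)(?:\s(.*))?$/;
    return {
        code  => "$1$2",
        class => $CLASS{$1},
        text  => defined $3 ? $3 : '',
    };
}

# Sample lines in the shape printed by the -X QUERY branch above.
for my $line ("100 TRUE\n", "101 newscanBox\n", "102 \n",
              "501 Unrecognized command\n") {
    my $status = parse_status($line) or next;
    print "$status->{code} [$status->{class}] '$status->{text}'\n";
}
```

Keeping the digit-class table separate from the parser means a client
can react to a whole class of replies (for example, any 5xx line)
without listing every individual code.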



Fri, 29 Aug 1997 07:12:15 GMT  
 