gawk problem retrieving ARGV (Sun vs Linux) 

I recently had to mirror a web site from a Solaris system to a Red Hat
Linux system. The site has a "map" web page which contains invocations of
an awk CGI script like this:

<!--#exec cmd="somelocalpath/lister.cgi Help"-->

lister.cgi is a gawk script which should scan a subdirectory
(subdirectory Help in the example shown) and for each html file found
format a particular HTML string (essentially it constructs on the fly
the list of all pages present).

In a nutshell the original lister.cgi is a script like this :

#!/opt/bin/gawk -f
BEGIN { dir=ARGV[1]"/*.html" }
END     {out="" ; aut=""
         while("ls /somelocalpath/" dir | getline test)
         { if (test !~ "index.html")
           ... do some reformatting here
         }
        }

This script can also be invoked from the shell prompt as e.g.
lister.cgi Help. It works on a Sun Solaris system with gawk 3.0.6 and
also on a Digital Unix system with gawk 2.14.

However, on a Red Hat Linux system with gawk 3.1.10 it fails with an error
of the form: "Fatal : <Help> is a directory".

To make it work on Linux I had to modify the gawk script as follows :

#!/opt/bin/gawk -f
BEGIN { dir=argpass"/*.html" }
END     {out="" ; aut=""
         while("ls /somelocalpath/" dir | getline test)
         { if (test !~ "index.html")
           ... do some reformatting here
         }
        }

I also had to change its invocation (both in the exec cmd directive and on
the shell command line) to e.g.

lister.cgi -v argpass=Help </dev/null

Essentially, Linux gawk fails to handle ARGV[1], so I have to pass my
directory name in a variable. I then have to specify </dev/null
because the script no longer has a (file) argument.

The new form is backward compatible but looks inelegant to me. What is
the matter with gawk 3.1.10 in this respect ?
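
For readers following along, here is a minimal sketch of why the redirect matters in the workaround above. A program with an END block reads its input before END runs, and with no file operands that input is standard input. The awk program is inlined here purely for illustration; only argpass and the Help directory name come from the post.

```shell
# Sketch only: the awk program is inlined instead of living in lister.cgi.
pattern=$(awk -v argpass=Help '
    BEGIN { dir = argpass "/*.html" }   # -v assignments are visible in BEGIN
    END   { print dir }                 # END runs only after input is exhausted
' </dev/null)                           # no file operands, so input is stdin

echo "$pattern"
```

Without the redirect, the same command run from a terminal would sit waiting for end of file on standard input.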

--
----------------------------------------------------------------------

avoid unwanted spam. Any mail returning to this address will be rejected.
Users can disclose their e-mail address in the article if they wish so.



Sun, 01 May 2005 23:20:07 GMT  


Quote:
> I'd recently had to mirror a web site from a Solaris to Linux RedHat
> system. The site has a "map" web page which contains invocations of an
> awk cgi script like this :

> <!--#exec cmd="somelocalpath/lister.cgi Help"-->

> lister.cgi is a gawk script which should scan a subdirectory
> (subdirectory Help in the example shown) and for each html file found
> format a particular HTML string (essentially it constructs on the fly
> the list of all pages present).

> In a nutshell the original lister.cgi is a script like this :

> #!/opt/bin/gawk -f
> BEGIN { dir=ARGV[1]"/*.html" }
> END     {out="" ; aut=""
>          while("ls /somelocalpath/" dir | getline test)
>          { if (test !~ "index.html")
>            ... do some reformatting here
>          }
>         }

> This script can also be invoked from the shell prompt as e.g.
> lister.cgi Help. It works on a Sun Solaris system with gawk 3.0.6 and
> also on a Digital Unix system with gawk 2.14.

> However on a Linux RedHat system with gawk 3.1.10 it fails with an error
> of the form :  "Fatal : <Help> is a directory.

That's because the original script is poorly written by someone with
a poor knowledge of awk.  On most Unix systems you can open(2) and
read(2) a directory; not so on Linux, where read(2) on a directory
fails with EISDIR and you must go through opendir(3) and readdir(3).
Because the script has an END block, gawk reads its input first, so
having no main rules doesn't mean it won't open ARGV[1].  In fact
it opens and closes it on the other systems, where that does not
generate an error.  You could fudge around with

      ARGV[1] = "/dev/null"
      ARGC = 2

but why bother.  If you are not going to have a body,  you don't need an
END block.  Just put everything in the BEGIN block and explicitly exit:

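   #!/opt/bin/gawk -f
   BEGIN {
      dir=ARGV[1]"/*.html"
      out=""
      aut=""
      # compare with 0 so a getline error (-1) cannot loop forever
      while((("ls /somelocalpath/" dir) | getline test) > 0)
         {
         if (test !~ /\/index\.html$/)
           ... do some reformatting here
         }
      exit
   }

BTW,  in
         if (test !~ "index.html")
the dot matches any character,  so it also matches "indexahtml",
"indexbhtml" etc.  A plain
         if (test != "index.html")
won't do either,  because ls prints the whole path here.  If you are
using ~ you should use a regex constant,
         if (test !~ /pattern/)
with the dot escaped,  as in the script above.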
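
A self-contained way to try the BEGIN-only approach is sketched below. It is not the original lister.cgi: list_pages and the temporary directory are made-up names, the reformatting step is replaced by a plain print, and the path is assumed to contain no spaces.

```shell
# Sketch only: list_pages is an illustrative name, not from the thread.
list_pages() {
    awk 'BEGIN {
        cmd = "ls " ARGV[1] "/*.html"
        # Compare against 0 so a getline error (-1) cannot loop forever.
        while ((cmd | getline page) > 0)
            if (page !~ /\/index\.html$/)   # escaped dot, anchored at the end
                print page
        close(cmd)
        exit                                # end the run without reading any operand
    }' "$1"
}

site=$(mktemp -d)
touch "$site/a.html" "$site/b.html" "$site/index.html"
list_pages "$site"      # lists a.html and b.html with their full paths
rm -rf "$site"
```

Because the program never gets past BEGIN, the directory operand is only ever used as a string, so it should not matter whether the platform lets awk read a directory.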

--
Dan Mercer

If responding by email, include the phrase 'from usenet'
in the subject line to avoid spam filtering.
Opinions expressed herein are my own and may not represent those of my employer.


Sun, 01 May 2005 23:42:58 GMT  

Quote:

> > This script can also be invoked from the shell prompt as e.g.
> > lister.cgi Help. It works on a Sun Solaris system with gawk 3.0.6 and
> > also on a Digital Unix system with gawk 2.14.

> > However on a Linux RedHat system with gawk 3.1.10 it fails with an error
> > of the form :  "Fatal : <Help> is a directory.

> That's because the original script is poorly written by someone with
> a poor knowledge of awk.  [...]

  That's me ! :-(

  However, I'd put it down to not knowing about an unexpected
  difference in behaviour between the Unixes I have worked on more or
  less extensively (SunOS, Ultrix, Solaris, Digital Unix and HP-UX) and
  Linux (to which I do not have direct access) !

Quote:
> a poor knowledge of awk.  On most Unix systems,  you can open(2) and
> read(2) directories - not so on Linux where you must opendir(2).  When

  In fact I just asked a colleague to run my script on a Linux SuSE
  system, where he has gawk 3.0.6 (the same we have on Sun) and it
  gives the same error !
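
The SuSE result suggests that the platform, rather than the gawk release, decides whether a directory operand is fatal. If the END-block layout has to stay, the ARGV replacement suggested earlier in the thread can be sketched roughly as follows. capture_dir is a made-up name, and the program is inlined here rather than taken from lister.cgi.

```shell
# Sketch only: capture_dir is an illustrative name, not part of lister.cgi.
capture_dir() {
    awk '
        BEGIN {
            dir = ARGV[1] "/*.html"    # save the directory name first
            ARGV[1] = "/dev/null"      # then give awk an empty file to read
            ARGC = 2
        }
        END { print dir }              # runs after /dev/null hits end of file
    ' "$1"
}

capture_dir Help
```

The order matters: ARGV[1] has to be copied before it is overwritten, and both steps have to happen in BEGIN, before awk starts opening its operands.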

Quote:
> but why bother.  If you are not going to have a body,  you don't need an
> END block.  Just put evrything in the Begin block and explicitly exit:

  Thanks, I'm just going to try this suggestion !




Mon, 02 May 2005 17:33:11 GMT  
 