Creating an awk script to extract other scripts from a file 

I am trying to write an awk extraction script that will extract other
scripts from a file called awkfile2. There are 4 scripts in awkfile2,
written like this:

******************************************************
{ if ( disksize < $5 )
{
disksize = $5
computer = $0
}
}
*****************************************************
386 { for (field = 1; field <= NF; field += 2)
printf("%s\t", $field)
print ""
}
*****************************************************
286 { field = 1
while ( field < = NF)
{
printf ("%s\t",$field )
field += 2
}
print ""
}
*****************************************************
BEGIN { field =1 }
$1 == "386" { do {
printf("%s\t", $field)
field +=2
} while( field <= NF )
}
******************************************************

I want awk to extract each of the four above scripts and write each of them
to their own separate file, starting at awk6 all the way up to awk9.
So far this is what I have for my awk extraction script (called
extract.awk):

awk 'BEGIN { RS = "*" && RS > 30; counter = 6 }
/*/ {counter = counter + 1}
{ print $0 > "awk"counter }' awkfile2

This extracts the files, sort of, but it doesn't put each of them into its
own separate file (i.e. awk6, awk7, awk8, awk9). Instead it extracts part of
one script, puts it into a file, then extracts part of the next script,
puts it into a file, etc. Perhaps something is wrong with the line "RS =
"*" && RS > 30"? I want it to notice the asterisk lines that separate
each script in awkfile2 and copy each script between the asterisk lines to
its own separate file.

Can someone help?

J. Spaceman
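
On the question about the RS line, here is a minimal sketch, assuming a POSIX-style awk, of what that BEGIN assignment evaluates to. Since `>` binds tighter than `&&` and `=` binds loosest, awk reads it as RS = ("*" && (RS > 30)). The right-hand side is a truth value (0 or 1), not a pattern, so RS ends up as the single character 0. The input is then split wherever a literal 0 appears (for example in $0), which may explain the partial scripts. The variable name `rs_value` is only for illustration.

```shell
# Print the value RS actually receives from the original BEGIN block.
# No input file is needed: awk exits after BEGIN when there are no other rules.
rs_value=$(awk 'BEGIN { RS = "*" && RS > 30; print RS }')
echo "RS is now: [$rs_value]"
```

The asterisk never reaches RS at all, so no record boundary lines up with the separator lines.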



Sat, 07 May 2005 02:12:53 GMT  

[*-separated file]

Quote:
> I want awk to extract each of the four above scripts and write each of them
> to their own separate file, starting at awk6 all the way up to awk9.
> So far this is what I have for my awk extraction script (called
> extract.awk):

> awk 'BEGIN { RS = "*" && RS > 30; counter = 6 }
> /*/ {counter = counter + 1}
> { print $0 > "awk"counter }' awkfile2

> Can someone help?

Instead of matching with RS, try line-by-line. Something like this:

gawk '/^\*+$/ { f++;next } {print > "out." f}' < infile



Sat, 07 May 2005 03:24:41 GMT  

Quote:

> Instead of matching with RS, try line-by-line. Something like this:

> gawk '/^\*+$/ { f++;next } {print > "out." f}' < infile

This works, almost.  Except that it outputs all four of the above scripts to
a single file.  Each of them must be output to their own respective file
(ie. awk6, awk7, awk8, awk9).  

J. Spaceman
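
One possible cause of the single-file symptom (an assumption, since the post does not say which awk was used): the target of `>` in that one-liner is an unparenthesized concatenation, "out." f. gawk reads it as ("out." f), but the construct is ambiguous and other awks may parse it differently or reject it. Below is a hedged sketch of the same one-liner with the file name parenthesized and the asterisk written as a bracket expression; `sample.txt`, `workdir` and the `out.` prefix are placeholder names.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A small stand-in for the real input: three separator lines, two bodies.
printf '%s\n' '*****' 'first body' '*****' 'second body' '*****' > sample.txt

# Parentheses around the concatenation make the file name unambiguous.
awk '/^[*]+$/ { f++; next } { print > ("out." f) }' sample.txt

ls out.*
```

With the parentheses in place, each body lands in its own numbered file regardless of how the awk in use resolves the ambiguity.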



Sat, 07 May 2005 04:45:36 GMT  

Quote:


>> Instead of matching with RS, try line-by-line. Something like this:

>> gawk '/^\*+$/ { f++;next } {print > "out." f}' < infile

> This works, almost.  Except that it outputs all four of the above scripts to
> a single file.  Each of them must be output to their own respective file
> (ie. awk6, awk7, awk8, awk9).

> J. Spaceman

gawk 'BEGIN{RS="[*]"}$0>""{print > "out."NR}' infile

Michael Heiming
--
Remove +SIGNS, if you expect an answer



Sat, 07 May 2005 05:35:19 GMT  

Quote:
> I am trying to write an awk extraction script that will extract other
> scripts from a file called awkfile2. There are 4 scripts in awkfile2,
> written like this:

> ******************************************************
> { if ( disksize < $5 )
> {
> disksize = $5
> computer = $0
> }
> }
> *****************************************************
> 386 { for (field = 1; field <= NF; field += 2)
> printf("%s\t", $field)
> print ""
> }
> *****************************************************
> 286 { field = 1
> while ( field < = NF)
> {
> printf ("%s\t",$field )
> field += 2
> }
> print ""
> }
> *****************************************************
> BEGIN { field =1 }
> $1 == "386" { do {
> printf("%s\t", $field)
> field +=2
> } while( field <= NF )
> }
> ******************************************************

> I want awk to extract each of the four above scripts and write each of them
> to their own separate file, starting at awk6 all the way up to awk9.
> So far this is what I have for my awk extraction script (called
> extract.awk):

> awk 'BEGIN { RS = "*" && RS > 30; counter = 6 }
> /*/ {counter = counter + 1}
> { print $0 > "awk"counter }' awkfile2

> This extracts the files, sort of, but it doesn't put each of them into its
> own separate file (i.e. awk6, awk7, awk8, awk9). Instead it extracts part of
> one script, puts it into a file, then extracts part of the next script,
> puts it into a file, etc. Perhaps something is wrong with the line "RS =
> "*" && RS > 30"? I want it to notice the asterisk lines that separate
> each script in awkfile2 and copy each script between the asterisk lines to
> its own separate file.

> Can someone help?

> J. Spaceman

$ cat awkfile2
******************************************************
{ if ( disksize < $5 )
{
disksize = $5
computer = $0
}
}
*****************************************************
386 { for (field = 1; field <= NF; field += 2)
printf("%s\t", $field)
print ""
}
*****************************************************
286 { field = 1
while ( field < = NF)
{
printf ("%s\t",$field )
field += 2
}
print ""
}
*****************************************************
BEGIN { field =1 }
$1 == "386" { do {
printf("%s\t", $field)
field +=2
} while( field <= NF )
}
******************************************************
$ awk 'BEGIN { counter = 5 }
> /^*+$/ {close("awk"counter); counter ++; next}
> { print $0 > "awk"counter }' awkfile2

$ ls -l awk*
-rw-r--r--    1 petert   unknown        55 Nov 18 21:52 awk10
-rw-r--r--    1 petert   unknown        57 Nov 18 21:55 awk6
-rw-r--r--    1 petert   unknown        81 Nov 18 21:55 awk7
-rw-r--r--    1 petert   unknown        88 Nov 18 21:55 awk8
-rw-r--r--    1 petert   unknown        96 Nov 18 21:55 awk9
-rw-r--r--    1 petert   unknown       593 Nov 18 21:51 awkfile2

HTH
--
Peter S Tillier
"Who needs perl when you can write dc and sokoban in sed?"



Sat, 07 May 2005 06:03:06 GMT  
OK, this is what I have now:

awk 'BEGIN {RS = "*"; counter = 6 } $0 { print$0 > "awk"counter;next }
END { if (counter > 9) exit;
        else { counter++ }' awkfile2

But I get the following error when I try to run it:

awk: cmd. line:4: (END OF FILE)
awk: cmd. line:4: parse error

Basically I can extract all the individual scripts from awkfile2 now; the
only problem is that it creates a file called 'awk10' with nothing in it
(because of the last row of asterisks at the bottom of awkfile2).  I want to
limit my extraction to awk6 through awk9, with an extracted script in each,
which is why I am trying to use the if/else statement to test where
the counter is.

Can anyone help?

J. Spaceman



Sat, 07 May 2005 07:57:07 GMT  

Quote:
> OK, this is what I have now:

> awk 'BEGIN {RS = "*"; counter = 6 } $0 { print$0 > "awk"counter;next }
> END { if (counter > 9) exit;
>         else { counter++ }' awkfile2

> But I get the following error when I try to run it:

> awk: cmd. line:4: (END OF FILE)
> awk: cmd. line:4: parse error

> Basically I can extract all the individual scripts from awkfile2 now; the
> only problem is that it creates a file called 'awk10' with nothing in it
> (because of the last row of asterisks at the bottom of awkfile2).  I want to
> limit my extraction to awk6 through awk9, with an extracted script in each,
> which is why I am trying to use the if/else statement to test where
> the counter is.

> Can anyone help?

> J. Spaceman

Oops...rebrace your code:

    awk '
        BEGIN {RS = "*"; counter = 6 }
        $0 { print$0 > "awk"counter;next }
        END {
            if (counter > 9)
                exit;
            else {
                counter++
            }
    ' awkfile2

I sense a missing brace.

If you want to use RS, consider using it as a regular expression (GNU awk
does this, your awk might not).  It can't really be a "Record Separator"
because your file begins and ends with your asterisks, too, so I added the
pattern /[^ \t\n]/.  If there is anything other than whitespace in the awk
program between star lines, it is printed; this effectively skips the
beginning and end star lines (actually, the null programs before and after
them, respectively).

For four files, you generally don't need to worry about closing your files,
but starting at ten, you do.  So I close mine.

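    BEGIN {
        # Regex RS: a run of asterisks plus the newline (not POSIX).
        # The backslash is doubled so the regex sees \* from the string.
        RS = "\\*+\n"
        counter = 6
    }
    /[^ \t\n]/ {
        # Assign the name first; an assignment cannot follow ">" directly.
        f = "awk" counter++
        print > f
        close(f)
    }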

    - Dan



Sat, 07 May 2005 09:27:26 GMT  

Quote:
> OK, this is what I have now:

> awk 'BEGIN {RS = "*"; counter = 6 } $0 { print$0 > "awk"counter;next }
> END { if (counter > 9) exit;
>         else { counter++ }' awkfile2

> But I get the following error when I try to run it:

> awk: cmd. line:4: (END OF FILE)
> awk: cmd. line:4: parse error

> Basically I can extract all the individual scripts from awkfile2 now; the
> only problem is that it creates a file called 'awk10' with nothing in it
> (because of the last row of asterisks at the bottom of awkfile2).  I want to
> limit my extraction to awk6 through awk9, with an extracted script in each,
> which is why I am trying to use the if/else statement to test where
> the counter is.

It's much easier not to change RS (see my post of 18/11/02 22:03) and to
use a rule that checks for, and ignores, the row of asterisks, because
this provides an action that allows you to increment the counter and to
close the file being written on a "once per file" basis.  The following
code

$ awk 'BEGIN { counter = 5 }
>   /^*+$/ {close("awk"counter); counter ++; next}
>   { print $0 > "awk"counter }' awkfile2

$ ls -l awk*
-rw-r--r--    1 petert   unknown        57 Nov 19 04:37 awk6
-rw-r--r--    1 petert   unknown        81 Nov 19 04:37 awk7
-rw-r--r--    1 petert   unknown        88 Nov 19 04:37 awk8
-rw-r--r--    1 petert   unknown        96 Nov 19 04:37 awk9
-rw-r--r--    1 petert   unknown       112 Nov 18 22:03 awkextract.awk
-rw-r--r--    1 petert   unknown       594 Nov 19 04:31 awkfile2
-rwxr-xr-x    1 petert   unknown       438 Feb 10  2002 awkinbat.bat*
-rw-r--r--    1 petert   unknown        51 Feb 10  2002 awkinbat.dat
-rw-r--r--    1 petert   unknown       675 Jul 18 08:25 awkprof.out

doesn't create a file awk10.
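
The reason no awk10 appears: the counter starts at 5, every separator line bumps it, and a file is only opened when a non-separator line is printed. The trailing separator moves the counter to 10, but nothing is printed after it. Here is a minimal, self-contained sketch of that logic with a shortened stand-in for awkfile2. The names `demo_dir` and `demo_input` and the file contents are placeholders, and the file name is parenthesized and the asterisk written as [*] so the sketch does not rely on gawk-specific parsing.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Four short "scripts" framed by separator lines, laid out like awkfile2.
printf '%s\n' '****' 'one' '****' 'two' '****' 'three' '****' 'four' '****' > demo_input

awk 'BEGIN { counter = 5 }
/^[*]+$/ {                      # separator line: finish the current file
    close("awk" counter)
    counter++
    next
}
{ print > ("awk" counter) }     # body line: goes to the current file
' demo_input

ls awk*
```

If the input did not begin with a separator line, the leading lines would go to awk5, so this layout assumes the file always starts with one.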

If you want to do this by changing RS then you need to show that RS
consists of "one or more asterisks followed by a newline":

$ awk 'BEGIN { RS = "*+\n"; counter = 6; getline }
  { close("awk"counter); print $0 > "awk"counter++ }
  END { close("awk"counter) }' awkfile2

$ ls -l awk*
-rw-r--r--    1 petert   unknown        58 Nov 19 04:32 awk6
-rw-r--r--    1 petert   unknown        82 Nov 19 04:32 awk7
-rw-r--r--    1 petert   unknown        89 Nov 19 04:32 awk8
-rw-r--r--    1 petert   unknown        97 Nov 19 04:32 awk9
-rw-r--r--    1 petert   unknown       112 Nov 18 22:03 awkextract.awk
-rw-r--r--    1 petert   unknown       594 Nov 19 04:31 awkfile2
-rwxr-xr-x    1 petert   unknown       438 Feb 10  2002 awkinbat.bat*
-rw-r--r--    1 petert   unknown        51 Feb 10  2002 awkinbat.dat
-rw-r--r--    1 petert   unknown       675 Jul 18 08:25 awkprof.out

$

The "getline" is used to skip the first row of asterisks in the file,
and, provided that the last line in the file ends with a newline, will
only output files awk6 thru' awk9 as you require (if there isn't a
newline then the last file contains the asterisks too).  When you use RS
as above then $0 consists of everything between successive separator
lines.
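
The one-byte difference between the two listings (57 vs 58 bytes for awk6, and so on) follows from this: with RS set to the separator pattern, each record still carries the newline of its last line, and print then appends another. A hedged sketch that writes the record with printf instead, assuming an awk that accepts a regular expression as RS (gawk and mawk do; POSIX only requires a single character). `rs_dir` and `rs_input` are placeholder names.

```shell
rs_dir=$(mktemp -d)
cd "$rs_dir" || exit 1

printf '%s\n' '****' 'one' 'uno' '****' 'two' '****' > rs_input

awk 'BEGIN { RS = "[*]+\n"; counter = 6 }
$0 != "" {                       # skip the empty record before the first separator
    file = "awk" counter++
    printf "%s", $0 > file       # the record already ends in a newline
    close(file)
}' rs_input

wc -c awk6 awk7
```

Filtering on a non-empty record replaces the getline in BEGIN, so the sketch does not depend on the file starting with a separator.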

HTH
--
Peter S Tillier
"Who needs perl when you can write dc and sokoban in sed?"



Sat, 07 May 2005 12:42:09 GMT  

Quote:


>> Instead of matching with RS, try line-by-line. Something like this:

>> gawk '/^\*+$/ { f++;next } {print > "out." f}' < infile

> This works, almost.  Except that it outputs all four of the above scripts to
> a single file.  Each of them must be output to their own respective file
> (ie. awk6, awk7, awk8, awk9).  

Nope, it works perfect using GNU Awk 3.1.1 on Linux. Check this:

$ ls -l out*
ls: out*: No such file or directory
$ gawk '/^\*+$/ { f++;next } {print > "out." f}' < a.in
$ ls -l out*
-rw-rw-r--    1 wic      wic            57 Nov 19 23:02 out.1
-rw-rw-r--    1 wic      wic            81 Nov 19 23:02 out.2
-rw-rw-r--    1 wic      wic            88 Nov 19 23:02 out.3
-rw-rw-r--    1 wic      wic            96 Nov 19 23:02 out.4
$ cat out.4
BEGIN { field =1 }
$1 == "386" { do {
printf("%s\t", $field)
field +=2
} while( field <= NF )
}



Sun, 08 May 2005 06:05:59 GMT  

Quote:


> >> Instead of matching with RS, try line-by-line. Something like this:

> >> gawk '/^\*+$/ { f++;next } {print > "out." f}' < infile

> > This works, almost.  Except that it outputs all four of the above scripts to
> > a single file.  Each of them must be output to their own respective file
> > (ie. awk6, awk7, awk8, awk9).

> Nope, it works perfect using GNU Awk 3.1.1 on Linux. Check this:

> $ ls -l out*
> ls: out*: No such file or directory
> $ gawk '/^\*+$/ { f++;next } {print > "out." f}' < a.in
> $ ls -l out*
> -rw-rw-r--    1 wic      wic            57 Nov 19 23:02 out.1
> -rw-rw-r--    1 wic      wic            81 Nov 19 23:02 out.2
> -rw-rw-r--    1 wic      wic            88 Nov 19 23:02 out.3
> -rw-rw-r--    1 wic      wic            96 Nov 19 23:02 out.4
> $ cat out.4
> BEGIN { field =1 }
> $1 == "386" { do {
> printf("%s\t", $field)
> field +=2
> } while( field <= NF )
> }

Interesting that you and I came up with more-or-less the same solution -
I didn't see yours on my news server when I posted mine.  As you say
(and I repeated later) it's much easier to do this line-by-line,
although a solution using RS="*+\n" also works fine.

Peter
--
Peter S Tillier
"Who needs perl when you can write dc and sokoban in sed?"



Sun, 08 May 2005 06:24:58 GMT  

Quote:


>> $ gawk '/^\*+$/ { f++;next } {print > "out." f}' < a.in

> Interesting that you and I came up with more-or-less the same
> solution - I didn't see yours on my news server when I posted mine.
> As you say (and I repeated later) it's much easier to do this
> line-by-line, although a solution using RS="*+\n" also works fine.

Yes, that's cool, and using close() is of course a more robust solution
as well.

Part of your code:

   /^*+$/ {close("awk"counter); counter ++; next}

Curious, don't you have to escape the '*' in your line pattern regexp,
or is an asterisk by itself considered a normal character? I usually
escape things just to be on the safe side :-)



Sun, 08 May 2005 06:42:22 GMT  


...

Quote:
>Part of your code:

>   /^*+$/ {close("awk"counter); counter ++; next}

>Curious, don't you have to escape the '*' in your line pattern regexp,
>or is an asterisk by itself considered a normal character? I usually
>escape things just to be on the safe side :-)

FWIW:

% gawk '/*/'
foo me
foo*me
foo*me
(Ctrl/D)
% tawk '/*/'
tawk: error in PROGRAM line 1: illegal regular expression: unexpected *
tawk: aborting due to compilation errors
% mawk '/*/'
mawk: line 1: regular expression compile failed (missing operand)
*



Sun, 08 May 2005 06:57:04 GMT  

Quote:


> >> $ gawk '/^\*+$/ { f++;next } {print > "out." f}' < a.in

> > Interesting that you and I came up with more-or-less the same
> > solution - I didn't see yours on my news server when I posted mine.
> > As you say (and I repeated later) it's much easier to do this
> > line-by-line, although a solution using RS="*+\n" also works fine.

> Yes, that's cool, and using close() is of course a more robust solution
> as well.

> Part of your code:

>    /^*+$/ {close("awk"counter); counter ++; next}

> Curious, don't you have to escape the '*' in your line pattern regexp,
> or is an asterisk by itself considered a normal character? I usually
> escape things just to be on the safe side :-)

As Kenny points out, gawk's RE engine only treats * as a metacharacter if
it can be one [1] - in fact, in the gawk version that I'm running, escaping
the * gives a warning that the backslash is being ignored.

[1] This is similar to the caret, ^, inside []s, where it's only a
metacharacter if it comes first inside the brackets, and to the hyphen, -,
also inside []s, where it's only a metacharacter representing a range if it
can be, i.e., in [a-z] it does represent a range, but in [a-] it doesn't,
because a hyphen that comes last inside the brackets has no end point.

Regards,
Peter
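
Given the differences shown above between gawk, tawk and mawk, one spelling that sidesteps the question is a bracket expression, where the asterisk is always literal. A small sketch, assuming only a POSIX-style awk; the sample lines and the `star_lines` variable are made up for illustration.

```shell
# Count lines that consist solely of asterisks; [*] is a literal asterisk.
star_lines=$(printf '%s\n' '*****' 'foo*me' '***' 'plain' |
    awk '/^[*]+$/ { n++ } END { print n + 0 }')
echo "separator lines: $star_lines"
```

The same pattern can stand in for /^*+$/ or /^\*+$/ in the separator rules posted earlier in the thread.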



Sun, 08 May 2005 13:37:38 GMT  
 