Rewinding an input file 

Hi,

I hope I'm not asking an old question again. I need to read one file
several times. The following script does this, where xx is
the file to be re-read and dummy is some other file.

gawk '
END {
  for (n=0;n<2;n++) {
    for (;;) {
      if (0==getline < infile) break
#      getline < infile; if ($1=="") break
      print $0
    }
    close(infile)
  }

}' infile=xx dummy

So the file is read, then closed, and read again. But look at the
commented-out line. In my opinion it should behave exactly like
the line above it. But no ... with it, only empty $0's are
printed during the second pass through the loop. A bug or a feature?
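
The role of close() in the re-read can be checked in isolation. The following is a minimal sketch, not the poster's script: it assumes plain awk rather than gawk, a throwaway file from mktemp standing in for xx, and a hypothetical helper named count_lines that reports how many lines each of two passes sees.

```shell
# Throwaway two-line input file (hypothetical; stands in for xx).
tmp=$(mktemp)
printf 'one\ntwo\n' > "$tmp"

# count_lines is a hypothetical helper: it reads the file in two passes
# and prints how many lines each pass saw. Call it with 1 to close the
# file between passes, or with 0 to leave it open.
count_lines() {
  awk -v infile="$tmp" -v do_close="$1" 'BEGIN {
    for (n = 0; n < 2; n++) {
      count = 0
      while ((getline line < infile) > 0) count++
      print "pass " n ": " count
      if (do_close == 1) close(infile)
    }
  }'
}

with_close=$(count_lines 1)
without_close=$(count_lines 0)
printf '%s\n' "$with_close" "$without_close"
rm -f "$tmp"
```

With close() both passes should see the whole file; without it the second pass starts at end of file and sees nothing.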


Sun, 11 Aug 2002 03:00:00 GMT  

The test $1=="" is true whenever $1 is empty, whereas
getline < infile returns 0 only when it reaches
the end of infile. So the two lines do not have
the same functionality.
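
To see the difference concretely, here is a small sketch that prints getline's return value next to $0 for each call. It assumes plain awk rather than gawk and a temporary three-line file from mktemp in place of xx; neither comes from the original script.

```shell
# Throwaway three-line input file (hypothetical; stands in for xx).
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

# Call getline four times on a three-line file and print the return
# value next to $0. The fourth call hits end of file.
result=$(awk -v infile="$tmp" 'BEGIN {
  for (i = 1; i <= 4; i++) {
    rc = (getline < infile)
    print rc " [" $0 "]"
  }
}')
printf '%s\n' "$result"
rm -f "$tmp"
```

The fourth line of output should show a return value of 0 while $0 still holds the last line read, which is why a test on $1 never notices the end of the file.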

As for getting empty $0's printed during the second
loop - I can't reproduce that.
With an input file xx:

This is
the file
to be
re-read.

That was
a blank
line.

I get the following output from the code as you presented it:

This is
the file
to be
re-read.

That was
a blank
line.
This is
the file
to be
re-read.

That was
a blank
line.

If I move the comment to the other line, I get:

This is
the file
to be
re-read.
This is
the file
to be
re-read.

Which is exactly what I'd expect.

Can you post a copy of the input file with which you got
your results?

Tristan.




Sun, 11 Aug 2002 03:00:00 GMT  
Hi Tristan,

Quote:

> The test $1=="" is true whenever $1 is empty, whereas
> getline < infile returns 0 only when it reaches
> the end of infile. So the two lines do not have
> the same functionality.

I forgot to mention that my input file does not contain blank
lines. In that case the two lines are indeed equivalent.

The results you reported are exactly what one would
expect. My point was that I suspected bad behavior in my
awk version, which is GNU awk. So for "a feature or a bug?",
maybe the second is true.

Wouter Boeke



Sun, 11 Aug 2002 03:00:00 GMT  
Wouter,

I'm afraid I don't agree. The two lines are *not*
equivalent even in the case where you have no blank
lines in a file.

getline < infile; if ($1=="") break

will not break out of the for loop when it
reaches the end of the file, only when it
reads a blank line. At end of file getline
leaves $0 and $1 unchanged, so your gawk
program prints the last line of the file
endlessly (it never even gets to read the
file a second time).

if (0==getline < infile) break

will break out of the loop when it detects
the end of the file.

I hope this has helped to clarify what
your gawk script is doing,

Tristan.




Sun, 11 Aug 2002 03:00:00 GMT  
The proof of the pudding is in the eating ...

The file I want to read more than once is:

one
two
three

My script:

gawk '
END {
  for (n=0;n<2;n++) {
    for (;;) {
       getline < infile; if ($1=="") break
       print "[" $0 "]"
    }
    close(infile)
  }

}' infile=xx dummy

The result:

[one]
[two]
[three]
[]
[]
[]

My interpretation of this:

1. The first read of file xx is okay.
2. After this, the file is NOT closed and reopened; instead
   the end of file is read repeatedly. And that
   is NOT okay.

Regards,
Wouter Boeke



Sun, 11 Aug 2002 03:00:00 GMT  
Is there a trailing blank line
in your input file?

If there is then the output is exactly what
I would expect - and nothing is going wrong
with gawk.

Basically, the script that you presented
will repeat the last line of the file
forever.

It's not a bug in gawk. That is the correct
behavior for what you have written.

If for some reason there is no trailing blank
line in your file I don't know what's going on.
In that case the output should be:
[one]
[two]
[three]
[three]
[three]
[three]

ad infinitum.

And that is what I get when I run the file
through the script in your last post.

Tristan.



Sun, 11 Aug 2002 03:00:00 GMT  
Probably you are right ... However I thought that in awk
variables like $0, $1 are equal to "" if they are not set.

Thanks for the lesson.



Sun, 11 Aug 2002 03:00:00 GMT  

Quote:

> Probably you are right ... However I thought that in awk
> variables like $0, $1 are equal to "" if they are not set.

they are equal to "" if they're not set!
the problem is that you assume that when getline reaches the
end of the file, $0 and consequently $1, $2, ... become "", but they
do NOT.

instead they keep the values they had before the failed getline. so

try this
gawk '
END {
      getline < infile; print "["$0"]["$1"]"
      getline < infile; print "["$0"]["$1"]"
      getline < infile; print "["$0"]["$1"]"
      getline < infile; print "["$0"]["$1"]"

}' infile=xx dummy

you'll see

[one][one]
[two][two]
[three][three]
[three][three]

to do what you want you can use the following

END {
  for (n=0;n<2;n++) {
    for (;;) {
       if ((getline < infile)==0) break
       print n"[" $0 "]"
    }
    close(infile)
  }

}

it will print

0[one]
0[two]
0[three]
1[one]
1[two]
1[three]
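
One caveat with testing for ==0: getline returns -1 when the file cannot be read, so that loop would never break on a missing or unreadable file. A common idiom is to loop while the return value is greater than 0. A minimal sketch, assuming plain awk, a temporary file from mktemp in place of xx, and a hypothetical variable named line to hold each record:

```shell
# Throwaway three-line input file (hypothetical; stands in for xx).
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

result=$(awk -v infile="$tmp" 'BEGIN {
  for (n = 0; n < 2; n++) {
    # > 0 stops on end of file (0) and on read errors (-1) alike.
    while ((getline line < infile) > 0)
      print n "[" line "]"
    # close() makes the next getline reopen the file from the start.
    close(infile)
  }
}')
printf '%s\n' "$result"
rm -f "$tmp"
```

Reading into a named variable also leaves $0 and the fields alone, so a stale $0 cannot be mistaken for fresh input.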

good luck
--
eiso



Sun, 11 Aug 2002 03:00:00 GMT  
 