counting number of records in a Fortran direct file 

Hi,

I have large "direct" files for which I'd like to know the
number of records in them without having to read through
them. Reading through them would take 100s or more,
and I cannot afford that time cost.

Is there a way to get the number of records in a file
within a constant amount of time (constant in the
sense that the time is independent of the length
of the file)? Please advise!

Thanks a lot in advance,

-Lei Pan



Tue, 30 May 2006 04:24:24 GMT  

Quote:
Lei Pan writes:
> I have large "direct" files for which I'd like to know the
> number of records in them without having to read through
> them. Reading through them would take 100s or more,
> but I cannot afford the time cost.

> Is there a way to get the number of records in a file
> within a constant amount of time (constant in the
> sense that the time is independent of the length
> of the file)? Please advise!

Does it need to be a portable solution?  If not, you can use
an API call to get the file size and divide by the record
length, or do a system call, perform a directory listing,
capture the output, and extract the file size from that.
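
A minimal sketch of the "file size divided by record length" idea, assuming a compiler that supports the Fortran 2003 SIZE= specifier of INQUIRE (older compilers may not) and a file layout with no per-record overhead. The module name, the function name count_records_by_size and the argument recl_bytes are made up for illustration; the caller has to supply the record length in bytes.

```fortran
module record_count_by_size
   use, intrinsic :: iso_fortran_env, only: file_storage_size
   implicit none
   integer, parameter :: i8 = selected_int_kind(18)   ! wide enough for large files
contains
   ! Returns the number of whole records in fname, or -1 if the file size
   ! cannot be determined.  recl_bytes is the record length in bytes
   ! (assumption: the file holds nothing but fixed-length records).
   function count_records_by_size(fname, recl_bytes) result(nrec)
      character(len=*), intent(in) :: fname
      integer, intent(in)          :: recl_bytes
      integer(i8) :: nrec, nunits
      integer     :: kode
      logical     :: exists

      nrec = -1
      if (recl_bytes <= 0) return
      ! SIZE= reports the size in file storage units, or -1 if unknown
      inquire(file=fname, exist=exists, size=nunits, iostat=kode)
      if (kode /= 0 .or. .not. exists .or. nunits < 0) return
      ! convert file storage units (file_storage_size bits each) to bytes
      nrec = (nunits * file_storage_size / 8) / recl_bytes
   end function count_records_by_size
end module record_count_by_size
```

The cost of this call does not depend on the file length, but the answer is only as good as the record length passed in: RECL= units differ between compilers (bytes on some, 4-byte words on others), so the value used in OPEN may need converting to bytes first.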

Maybe somebody else knows of a portable solution that doesn't
involve reading the file, but off-hand I'd try reading the
file in a loop with a large step on the record number until
it fails, then starting from the last good attempt, going
up again with a smaller step until it fails, and so on until
your step size is 1.  Sort of like a binary search.  The number
of reads is small compared with the number of records, so the time
it takes will be fairly independent of the length of the file.



Tue, 30 May 2006 05:49:14 GMT  

Quote:

> Lei Pan writes:

> > I have large "direct" files for which I'd like to know the
> > number of records in them without having to read through
> > them. Reading through them would take 100s or more,
> > but I cannot afford the time cost.

> > Is there a way to get the number of records in a file
> > within a constant amount of time (constant in the
> > sense that the time is independent of the length
> > of the file)? Please advise!

> Does it need to be a portable solution?  If not, you can use
> an API call to get the file size and divide by the record
> length, or do a system call, perform a directory listing,
> capture the output, and extract the file size from that.

> Maybe somebody else knows of a portable solution that doesn't
> involve reading the file, but off-hand I'd try reading the
> file in a loop with a large step on the record number until
> it fails, then starting from the last good attempt, going
> up again with a smaller step until it fails, and so on until
> your step size is 1.  Sort of like a binary search.  The time
> it takes will be fairly independent of the length of the file.

That might not always work.  It depends on the underlying file
system.  Fortran doesn't require that all of the records actually be
written.  If the file contains only record 1 and record 1000000000,
it's hard to say what the correct answer is or will be.
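
The situation described above can be reproduced with a small experiment. This is only a sketch: the module name, the routine names write_first_and_last and record_readable, and the unit number 31 are hypothetical, and what a read of a never-written record returns is processor-dependent.

```fortran
module sparse_direct_demo
   implicit none
   integer, parameter :: demo_unit = 31   ! hypothetical unit number, assumed free
contains
   ! Create a direct-access file in which only records 1 and nlast are written.
   subroutine write_first_and_last(fname, nlast)
      character(len=*), intent(in) :: fname
      integer, intent(in)          :: nlast
      integer :: lenrec
      inquire(iolength=lenrec) nlast       ! RECL value for one default integer
      open(unit=demo_unit, file=fname, access="direct", form="unformatted", &
           recl=lenrec, status="replace")
      write(demo_unit, rec=1) 1
      write(demo_unit, rec=nlast) nlast
      close(demo_unit)
   end subroutine write_first_and_last

   ! True if record irec can be read without an I/O error.
   function record_readable(fname, irec) result(ok)
      character(len=*), intent(in) :: fname
      integer, intent(in)          :: irec
      logical :: ok
      integer :: lenrec, ival, kode
      ok = .false.
      ival = 0
      inquire(iolength=lenrec) ival
      open(unit=demo_unit, file=fname, access="direct", form="unformatted", &
           recl=lenrec, status="old", action="read", iostat=kode)
      if (kode /= 0) return
      read(demo_unit, rec=irec, iostat=kode) ival
      ok = (kode == 0)
      close(demo_unit)
   end function record_readable
end module sparse_direct_demo
```

After call write_first_and_last("sparse.dat", 1000), checking record_readable("sparse.dat", 500) shows how a particular compiler and file system treat a record that was never written. Any search that probes records with READ depends on that behaviour.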

Dick Hendrickson



Tue, 30 May 2006 06:05:52 GMT  

Quote:
>> > Is there a way to get the number of records in a file
>> > within a constant amount of time (constant in the
>> > sense that the time is independent of the length
>> > of the file)? Please advise!
>> Maybe somebody else knows of a portable solution that doesn't
>> involve reading the file, but off-hand I'd try reading the
>> file in a loop with a large step on the record number until
>> it fails, then starting from the last good attempt, going
>> up again with a smaller step until it fails, and so on until
>> your step size is 1.  Sort of like a binary search.  The time
>> it takes will be fairly independent of the length of the file.

I have written code to do that in the past: it reads records 1, 2, 4, 8, ...
until a read fails, then uses a binary search between the last two values of
the series to find the actual last valid record.  Fairly fast, but
probably not quite portable, as it depends on the system giving no
error when reading any record N before the last one, whether or not record N
has actually been written, and giving an error for every record after the
last one.  I've found that this is true for most compilers, but your
mileage may vary.

Sorry I'm working away from home, so can't easily find the code, else
would post, as it's only a few lines long.

--
Clive Page
Dept of Physics & Astronomy,
University of Leicester,    Tel +44 116 252 3551



Tue, 30 May 2006 15:57:35 GMT  
Lei Pan:  A subprogram that is compiler dependent follows:

      SUBROUTINE GETFSIZ(FID,SIZE,UNITNO)
!===Get file size of file with FILE=FID or Unit=UNITNO.
!  Code is set up for Compaq CVF

!  Lah - Lahey/Fujitsu LF95
!  MSF - Microsoft Fortran F(X)
!  Sal - Salford FTN95
!  CVF USE DFLIB
!  MSF USE MSFLIB
       USE DFLIB
      CHARACTER*(*) FID
      INTEGER UNITNO, SIZE

!---Compaq Fortran; same as MS Fortran requires following 10 lines.
!     USE DFLIB (must be just after SUBROUTINE statement).
      INTEGER*4 iresult, handle
      TYPE (FILE$INFO) info
      handle = FILE$FIRST
!     GETFILEINFOQQ returns 0 if no matching file was found.
      iresult = GETFILEINFOQQ(FID,info,handle)
      IF( iresult.NE.0 ) THEN
        SIZE=info%length
      ELSE
        SIZE=0
      END IF

!Lah  INQUIRE(FILE=FID,FLEN=SIZE)
!
!MSF  Same as CVF code above with exceptions:
!     iresult=GETFILEINFOQQ(FID,buffer,handle)
!     where buffer is a derived type file$info defined in
!     MSFLIB.F90 and handle is a status indicator.

!Sal  Salford Fortran requires the following three lines.
!     INTEGER (KIND=3) SIZE
!     INTEGER (KIND=2) ERRCODE

      RETURN
      END

Skip Knoble, Penn State

   Herman D. (Skip) Knoble, Research Associate
   (a computing professional for 38 years)

   Web: http://www.personal.psu.edu/hdk
   Penn State Information Technology Services
    Academic Services and Emerging Technologies
     Graduate Education and Research Services
   Penn State University
     214C Computer Building
     University Park, PA 16802-21013
   Phone:+1 814 865-0818   Fax:+1 814 863-7049



Tue, 30 May 2006 21:31:38 GMT  
Thanks!

I am using Absoft Fortran. I am working on parallelizing
a piece of legacy code, so I don't have the freedom to
switch compilers.

Do you know if the code you provided can work with Absoft?

-Lei






Sat, 03 Jun 2006 08:15:34 GMT  
Thank you all for your suggestions!

Clive: Could you please post your code after you return home?

Thanks again!

-Lei





Sat, 03 Jun 2006 08:16:18 GMT  

Quote:

>Clive: Could you please post your code after you return home?

Here it is, for what it's worth.  Note that this requires you to
know the record length.

! finds number of records on existing direct-access unformatted file
subroutine lenfile(fname, lenrec, length)
implicit none
character*(*), intent(in) :: fname  ! name of existing direct-access file
integer, intent(in)       :: lenrec ! record length (compiler-dependent units)
integer, intent(out) :: length      ! number of records.
!
character :: cdummy*1
integer :: lunit, nlo, nhi, mid, kode
logical :: exists, open
!
! find a free unit on which to open the file
!
do lunit = 99, 1, -1
   inquire(unit=lunit, exist=exists, opened=open)
   if(exists .and. .not. open) exit
end do
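! status="old" makes the open fail instead of creating a missing file
open(unit=lunit, file=fname, status="old", action="read", access="direct", &
     form="unformatted", recl=lenrec, iostat=kode)
if(kode /= 0) then
   print *, 'error in lenfile: cannot open ', trim(fname)
   length = -1   ! flag the failure so the caller does not see an undefined value
   return
end if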
!
! expansion phase
!
mid = 1
do
   read(lunit, rec=mid, iostat=kode) cdummy
   if(kode /= 0) exit
   mid = 2 * mid
end do
!
! length is between mid/2 and mid, do binary search to refine
!
nlo = mid/2
nhi = mid
do while(nhi - nlo > 1)
   mid = (nlo + nhi) / 2
   read(lunit, rec=mid, iostat=kode) cdummy
   if(kode == 0) then
      nlo = mid
   else
      nhi = mid
   end if
end do
length = nlo
close(unit=lunit)
return
end subroutine lenfile
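
For callers that already have the file open, the same expansion and binary-search idea can be written against a unit number. The following is a sketch of such a variant, not a replacement for the routine above: the module and function names are hypothetical, it assumes an unformatted direct-access unit with a record length of at least one character, and it inherits the assumption that reads succeed up to the last record and fail beyond it.

```fortran
module direct_record_count
   implicit none
contains
   ! Counts the records on a unit already opened with access="direct",
   ! form="unformatted".  Uses O(log n) reads.
   function count_records_on_unit(lunit) result(length)
      integer, intent(in) :: lunit
      integer :: length, nlo, nhi, mid, kode
      character(len=1) :: cdummy

      ! expansion phase: nlo is the last record known to be readable (0 = none)
      nlo = 0
      nhi = 1
      do
         read(lunit, rec=nhi, iostat=kode) cdummy
         if (kode /= 0) exit
         nlo = nhi
         if (nhi > huge(nhi) / 2) then
            ! doubling would overflow; treat huge(nhi) as unreadable
            nhi = huge(nhi)
            exit
         end if
         nhi = 2 * nhi
      end do

      ! refinement phase: record nlo reads, record nhi does not
      do while (nhi - nlo > 1)
         mid = nlo + (nhi - nlo) / 2       ! overflow-safe midpoint
         read(lunit, rec=mid, iostat=kode) cdummy
         if (kode == 0) then
            nlo = mid
         else
            nhi = mid
         end if
      end do
      length = nlo
   end function count_records_on_unit
end module direct_record_count
```

Taking the unit as an argument avoids the search for a free unit number and leaves the choice of RECL to the caller, who opened the file and therefore already knows it.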

--
Clive Page



Sat, 03 Jun 2006 18:01:08 GMT  
 