Direct Access Record Size vs. Disk Sector Size 

Hi,

Is there any advantage to making a direct access record size a multiple of
the sector size (say 512 bytes)?  The only one I can think of is that it
allows me to make full use of all available space within a sector (I
think).  Are there any performance or other reasons to do so?  I fuzzily
remember that VM help documentation makes a big deal over formatting a
mini-disk to specific block sizes (1024 or 4096 usually) depending on
the type of data being stored.  I assume that similar reasoning applies
to Wintel.
--

Gary Scott


http://www.*-*-*.com/

Support the GNU fortran G95 Project:   http://www.*-*-*.com/



Sun, 22 Feb 2004 07:24:03 GMT  

Quote:

> Is there any advantage to making a direct access record be a multiple of
> the sector size (say 512 bytes).

I'd expect some performance benefits, but I'll not be so foolish as to
make a claim that they would be measurable unless I actually did such
a measurement.  At a fairly low level, I/O to the disk is only in
terms of entire sectors.  So if you have a record that occupies so
much as a single byte in a particular sector, modifying that record
involves reading the whole sector, modifying the copy in memory, and
then writing the whole sector back out.  So writing a record smaller
than a sector, but crossing the boundary between two sectors, could
easily end up involving 2 reads and 2 writes.  However, the operating
system might be smart enough about caching things to avoid this.  If
you are only writing the one record, it's going to be hard to avoid...
but then if you are only writing one record, performance isn't likely
to be much of an issue.  If you are writing/rewriting lots of records,
including the adjacent ones, then it's pretty easy to imagine big cache
effects here.
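
As a rough illustration of the boundary-crossing point above, the
following Python sketch counts how many records touch more than one
sector when records are stored back to back.  The 512-byte sector size,
the record sizes tried, and the helper names (sectors_touched,
count_straddlers) are assumptions made for illustration; it says nothing
about what a particular OS cache will actually do.

```python
SECTOR_SIZE = 512  # assumed sector size in bytes


def sectors_touched(record_number, record_size, sector_size=SECTOR_SIZE):
    """Return (first, last) sector index touched by a 1-based record,
    assuming records are stored back to back from byte 0 of the file."""
    start = (record_number - 1) * record_size
    end = start + record_size - 1  # last byte of the record
    return start // sector_size, end // sector_size


def count_straddlers(record_size, n_records, sector_size=SECTOR_SIZE):
    """Count records whose bytes fall in more than one sector."""
    count = 0
    for n in range(1, n_records + 1):
        first, last = sectors_touched(n, record_size, sector_size)
        if last > first:
            count += 1
    return count


if __name__ == "__main__":
    # Compare a few hypothetical record sizes over 1000 records.
    for size in (100, 256, 512, 1000):
        print(size, count_straddlers(size, 1000))
```

Record sizes that divide the sector size evenly (or are a multiple of
it and start on a boundary) never straddle a boundary they do not have
to; other sizes do so regularly.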

If it really matters, test.  For my own part, I just go ahead and make
my record sizes something like 1024 or 2048 on the theory that it is unlikely
to hurt and might help (and my applications need portability, so there
isn't much point in tuning them particularly finely for particular
operating systems).

Quote:
>  The only one I can think of is that it
> allows me to make full use of all available space within a sector

I would not expect this to be relevant to most implementations.  The
most common implementations just map direct access records linearly
onto the file addresses.  Record boundaries don't necessarily have
anything to do with sector boundaries (unless you have done something
like making the record size a multiple of the sector size, which makes it
work out that way).  You shouldn't see unused space in sectors other than
the last.  Other implementations have existed, but I don't think they
are widespread (if they were, then I'd have had a lot more file portability
problems than actually seem to happen with my programs).
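
The linear mapping described above can be sketched in Python as
follows.  This is only a model of the common layout, not any particular
compiler's run-time library; the helper names (record_offset,
write_record, read_record, demo) and the blank padding are assumptions
for illustration.

```python
import os
import tempfile


def record_offset(record_number, record_length):
    """Byte offset of a 1-based record in a file of back-to-back records."""
    return (record_number - 1) * record_length


def write_record(f, record_number, record_length, payload):
    """Write payload, padded with blanks, as one fixed-length record."""
    if len(payload) > record_length:
        raise ValueError("payload longer than record length")
    f.seek(record_offset(record_number, record_length))
    f.write(payload.ljust(record_length, b" "))


def read_record(f, record_number, record_length):
    """Read one fixed-length record by seeking straight to its offset."""
    f.seek(record_offset(record_number, record_length))
    return f.read(record_length)


def demo(record_length=100, n_records=7):
    """Write n_records records to a scratch file and return its size."""
    with tempfile.TemporaryFile() as f:
        for n in range(1, n_records + 1):
            write_record(f, n, record_length, b"record %d" % n)
        f.seek(0, os.SEEK_END)
        return f.tell()
```

With 7 records of 100 bytes the file length comes out as 700 bytes,
with no padding between records; any rounding up to whole sectors
happens only once, at the end of the file.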

--
Richard Maine
email: my last name at domain
domain: qnet dot com



Sun, 22 Feb 2004 09:44:14 GMT  

Quote:

> Hi,

> Is there any advantage to making a direct access record be a multiple of
> the sector size (say 512 bytes).  The only one I can think of is that it
> allows me to make full use of all available space within a sector (I
> think).  Are there any performance or other reasons to do so?  I fuzzily
> remember that VM help documentation makes a big deal over formatting a
> mini-disk to specific block sizes (1024 or 4096 usually) depending on
> the type of data being stored.  I assume that similar reasoning applies
> to Wintel.
> --

> Gary Scott


> http://www.fortranlib.com

> Support the GNU Fortran G95 Project:  http://g95.sourceforge.net

I know this only from a colleague who experimented with it and perhaps
the odd fragment in a compiler manual, but it definitely made sense in
the old, all but forgotten days of MS-DOS. I have no clue as to whether
this is still true for current disk drivers and operating systems.

I would suspect that under UNIX, and under Windows NT as well, caching of
reads and writes to the disk will wipe out any less-than-thorough
attempt to gain an advantage.

Regards,

Arjen Markus



Sun, 22 Feb 2004 14:22:44 GMT  

Quote:

> Is there any advantage to making a direct access record be a multiple of
> the sector size (say 512 bytes).  The only one I can think of is that it
> allows me to make full use of all available space within a sector (I
> think)...

Sectors are (in all implementations I am familiar with) always filled
completely.  Records will span sectors as needed.

Quote:
>  Are there any performance or other reasons to do so?

It is the job of the I/O library and/or OS to buffer records up and
make the actual low-level I/O requests to the disk drive.  This
buffering may be in the library, the OS, or both.

The process of writing data from a user array to disk could involve
a number of memory copies.  For example, from user array to library buffer,
then from library buffer into OS system buffer, then to the disk drive
(which nowadays usually has its own cache built in as well.)

For tiny I/O requests, like just a few sectors worth or less, it is
usually best to just let the I/O library buffer/cache things.  But for
large requests (say, many megabytes per request), it can be useful to
look into more efficient techniques that avoid extra data movement.

Some systems offer options to DMA data directly between the I/O device
and the user's memory - bypassing layers of buffering.  Typically your
I/O requests then need to be 'well formed' - starting address must be
on a sector boundary and record length must be some multiple of the
sector size.  Otherwise the run-time library has to punt and buffer
things again.
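
A minimal Python sketch of the 'well formed' test described above,
assuming the constraint applies to the file offset and the request
length, and assuming a 512-byte sector.  The function names are
hypothetical, and the exact alignment rules differ from system to
system.

```python
def is_well_formed(offset, length, sector_size=512):
    """True if a request starts on a sector boundary and covers a whole
    number of sectors (the assumed precondition for unbuffered I/O)."""
    return (length > 0
            and offset % sector_size == 0
            and length % sector_size == 0)


def round_up(length, sector_size=512):
    """Smallest multiple of sector_size that is >= length."""
    return -(-length // sector_size) * sector_size


if __name__ == "__main__":
    # A 10000-byte record is not well formed; rounding it up gives one that is.
    print(is_well_formed(0, 10000), round_up(10000))
```

If every record has a length that is a multiple of the sector size,
then every record also starts on a sector boundary under the usual
back-to-back layout, so each single-record request passes this test.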

Quote:
>  I fuzzily
> remember that VM help documentation makes a big deal over formatting a
> mini-disk to specific block sizes (1024 or 4096 usually) depending on
> the type of data being stored.  I assume that similar reasoning applies
> to Wintel.

Bigger is usually better.  But there are cases where small is useful.
Depends on the application.

Walt
-...-
Walt Spector
(w6ws at earthlink dot net)



Sun, 22 Feb 2004 23:43:14 GMT  
Hi,

Quote:


> >  The only one I can think of is that it
> > allows me to make full use of all available space within a sector

> I would not expect this to be relevant to most implementations.  The
> most common implementations just map direct access records linearly
> onto the file addresses.  Record boundaries don't necessarily have
> anything to do with sector boundaries (unless you have done something....
> like making the record size a multiple of the sector size...that makes it
> work out that way).  You shouldn't see unused space in sectors other than
> the last.  

The last written sector was my concern.  I would like to squeeze in a
few more characters of text (which I lost when I added a UUID and
userID), but without significantly increasing the record length beyond
10000 bytes (about 19.5 sectors, but probably being stored in 20 sectors?).
If I change the record size to exactly 20 sectors (10240 bytes), I can
get about 240 characters of additional text without actually increasing
the amount of storage used on disk (yes/no?).  Or is the space in that
last sector not actually "wasted" on Wintel?  (On VM it would be, or so the
help files imply.)  Yes, this is just academic in the grand scheme of
things, but this database will be in existence for many years and grow
to many tens of megabytes before the project ends, so I'd like to be as
efficient as possible.
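
The arithmetic behind the question can be laid out in a short Python
sketch.  It compares two assumed layouts: one where every record is
padded out to whole sectors, and one where records are packed back to
back and only the end of the file rounds up.  The 512-byte sector, the
count of 1000 records, and the function names are assumptions for
illustration; which layout applies depends on the implementation.

```python
SECTOR = 512  # assumed sector size in bytes


def sectors_needed(nbytes, sector=SECTOR):
    """Whole sectors needed to hold nbytes (integer ceiling division)."""
    return -(-nbytes // sector)


def slack_in_last_sector(record_length, sector=SECTOR):
    """Bytes left over in the last sector of a single padded record."""
    return sectors_needed(record_length, sector) * sector - record_length


def padded_total(record_length, n_records, sector=SECTOR):
    """Total bytes if each record is padded to a whole number of sectors."""
    return sectors_needed(record_length, sector) * sector * n_records


def packed_total(record_length, n_records, sector=SECTOR):
    """Total bytes if records are back to back and only the file end rounds up."""
    return sectors_needed(record_length * n_records, sector) * sector


if __name__ == "__main__":
    print(sectors_needed(10000), slack_in_last_sector(10000))
    print(padded_total(10000, 1000), packed_total(10000, 1000))
```

Under the padded layout, a 10000-byte record already occupies 20
sectors, so growing it to 10240 bytes costs nothing.  Under the packed
layout, the same change adds 240 bytes per record to the file.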

Quote:
> Other implementations have existed, but I don't think they
> are widespread (if they were, then I'd have had a lot more file portability
> problems than actually seem to happen with my programs).

> --
> Richard Maine
> email: my last name at domain
> domain: qnet dot com



Mon, 23 Feb 2004 00:10:41 GMT  

Quote:

> If it really matters, test.  For my own part, I just go ahead and make
> my records sizes like 1024 or 2048 on the theory that it is unlikely
> to hurt and might help

In general, you should subtract some overhead bytes from that. For instance,
on VMS, RMS will use 4 bytes (IIRC - maybe it's 2 bytes?) to indicate whether
the record is valid. Others might add the (redundant) record number just in
case.

        Jan



Mon, 23 Feb 2004 17:34:07 GMT  

Quote:


> > If it really matters, test.  For my own part, I just go ahead and make
> > my records sizes like 1024 or 2048 on the theory that it is unlikely
> > to hurt and might help

> In general, you should subtract some overhead bytes from that. For instance,
> on VMS RMS will use 4 bytes (IIRC - maybe it's 2 bytes?) to indicate whether
> the record is valid. Others might add the (redundant) record number just in
> case.

I realize the possibility, but I've rarely (well, not in the last 15
years or so, anyway) seen an implementation of direct access that
did this.  Much before the mid-80's I might not have noticed.  I have
found very high portability of direct access files (I handle differing
data representations myself - but I rely on the operating system to
read or write the bits from/to the file).

Lahey has a file header (but not a record header) for direct access
files by default; it can be disabled with a compiler switch.  About 15 years or
so ago, one of the operating systems on our Sim computers at the time
did system-dependent things for direct access unformatted, but was
"transparent" for direct access formatted (so I used the f66-style
Hollerith trick of "A" format with non-character I/O lists.  Ugly and
horribly slow - formatted I/O always is, even when it is "trivial"
formatting like "A", but it worked).  I think some Apple2 compiler
might have padded direct access records out to sector boundaries.
I haven't worked on IBM mainframes since about the mid-70's (praise
the lord), so I wouldn't know about them.  Hmm, that might actually
be pertinent to the original poster - if I recall other posts, he does
some work with IBM mainframes, though I might be confusing him with
another poster on that point.

I haven't done a lot on Vaxen for the last decade or so, but I'm quite
sure that I also succeeded in reading and writing such files on a
Vax running VMS.  I might have had to specify some non-default file
properties - don't recall that level of detail - but I'm sure I got
it to work.  And it sure as heck would not have worked if VMS was
writing or expecting extra bytes for each record.

So although in theory it could be an issue, in practice today I find
direct access to be the most portable form of binary I/O.  Specifically,
it rarely has system-specific "stuff".

--
Richard Maine                       |  Good judgment comes from experience;
email: my last name at host.domain  |  experience comes from bad judgment.
host: altair, domain: dfrc.nasa.gov |        -- Mark Twain



Mon, 23 Feb 2004 22:51:53 GMT  
 