questions about a backup program for the MS-DOS environment 
 questions about a backup program for the MS-DOS environment

   I have written a backup program (using TC 2.0) specifically
   devoted to backing up files in a meter reading system (I work at an
   electric cooperative).

   My program was in use for about 2 months without incident,
   when all of a sudden two problems occurred:

      1) an unformatted disk was put in the disk drive and caused
         my program to crash.

      2) the maximum file limit (112 root-directory entries) was
         reached on a 360K floppy disk.

   The information I am looking for is:

      - how to check to see if a floppy disk has been formatted.

      - if not, how to format it (various densities)

      - archiving all the files into one file, while still
         being able to extract them later.

      - possibly a faster copying scheme.  the following is the
         code I am using to copy from one file to another:

            do
            {
               n = fread(buf, sizeof(char), MAXBUF, infile);
               fwrite(buf, sizeof(char), n, outfile);
            } while (n == MAXBUF);        /* where MAXBUF = 7500 */

                                  thanks,
                                  david crow



Fri, 09 Oct 1992 23:38:32 GMT  
 questions about a backup program for the MS-DOS environment

Quote:
> [...]
>      - possibly a faster copying scheme.  the following is the
>         code I am using to copy from one file to another:

>            do
>            {
>               n = fread(buf, sizeof(char), MAXBUF, infile);
>               fwrite(buf, sizeof(char), n, outfile);
>            } while (n == MAXBUF);        /* where MAXBUF = 7500 */

Try:
     while ((n = fread(buf, sizeof(char), BUFSIZ, infile)) != 0)
                fwrite(buf, sizeof(char), n, outfile);

By using BUFSIZ instead of your own buffer length you get a buffer size
equal to what the fread and fwrite routines use.  

--

D'Arcy Cain Consulting             |   Organized crime with an attitude
West Hill, Ontario, Canada         |
(416) 281-6094                     |



Sun, 11 Oct 1992 20:58:06 GMT  
 questions about a backup program for the MS-DOS environment

$      - archiving all the files into one file, while still
$         being able to extract them later.

   Well, the usual way of doing this is to have a file structure with
a header before each file containing:

- the filename or, perhaps, complete filespec
- the file size
- anything else important
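
   A minimal sketch of such a header (field names and sizes here are
illustrative, not any standard format):

    struct arc_header {
        char          name[13];   /* 8.3 filename, NUL-terminated     */
        long          size;       /* bytes of file data that follow   */
        unsigned int  date;       /* DOS-packed date, if worth saving */
        unsigned int  time;       /* DOS-packed time                  */
    };

The archive is then just header, data, header, data, ...; to extract,
read a header, then copy the next `size' bytes out to the named file.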

$      - possibly a faster copying scheme.  the following is the
$         code I am using to copy from one file to another:
$            do
$            {
$               n = fread(buf, sizeof(char), MAXBUF, infile);
$               fwrite(buf, sizeof(char), n, outfile);
$            } while (n == MAXBUF);        /* where MAXBUF = 7500 */

   First:  Don't use a buffer size that is not an integral multiple
of sector size!  Make it a multiple of 512 (or, to be really safe,
1024).  I have a file copying program I wrote using a similar
method, except that it has an array of about 12 buffers.  At runtime,
it calls malloc () for each one, allocating something like 57344
(8192 + 16384 + 32768) bytes for each.  When these calls start failing
due to lack of memory, it keeps cutting the buffer size in half until

a) the next cut in half would make it a multiple of 256 but not of 512
or
b) all buffers have been allocated

   It then inhales as much of the file as it can into memory and writes
it out again, repeating as necessary.  I don't recall what the performance
difference between this way and using a single 4096-byte buffer was, but
I think it was something in the order of 20-25%.  The only problem is that
if you're copying large files to or from a floppy, sometimes the floppy
will stop spinning while the program reads/writes a chunk using the hard
drive.
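
   In sketch form (hedged: the 12-buffer count and starting size are
as described above; the rest is illustrative, not my actual code):

    #include <stdlib.h>

    #define NBUFS 12

    static char *bufs[NBUFS];

    /* Grab up to NBUFS buffers, halving the request on each malloc()
       failure, but stopping before the size stops being a multiple
       of 512 (the sector size).  Returns the number obtained. */
    int grab_buffers(unsigned *sizep)
    {
        unsigned size = 57344U;     /* 8192 + 16384 + 32768 */
        int n = 0;

        while (n < NBUFS) {
            char *p = malloc(size);
            if (p != NULL)
                bufs[n++] = p;
            else if ((size / 2) % 512 == 0)
                size /= 2;          /* halve and retry */
            else
                break;              /* next cut would break alignment */
        }
        *sizep = size;
        return n;
    }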
--
               More half-baked ideas from the oven of:
****************************************************************************

     <std_disclaimer.h> = "\nI'm only an undergraduate ... for now!\n";



Mon, 12 Oct 1992 01:08:13 GMT  
 questions about a backup program for the MS-DOS environment

Quote:

>> [...]
>>      - possibly a faster copying scheme.  the following is the
>>         code I am using to copy from one file to another:

>>            do
>>            {
>>               n = fread(buf, sizeof(char), MAXBUF, infile);
>>               fwrite(buf, sizeof(char), n, outfile);
>>            } while (n == MAXBUF);        /* where MAXBUF = 7500 */

>Try:
>     while ((n = fread(buf, sizeof(char), BUFSIZ, infile)) != 0)
>            fwrite(buf, sizeof(char), n, outfile);

>By using BUFSIZ instead of your own buffer length you get a buffer size
>equal to what the fread and fwrite routines use.  

No, no, no Yuck!  Don't use the C functions, and don't use such tiny buffers.
(no wonder it's so slow :-)

Try (in small or tiny model):

#include <dos.h>
#include <alloc.h>      /* farmalloc() */

char far *buffer=farmalloc(65024);
unsigned n;
int readfile;   /* Open handle (use _open() ) */
int writefile;  /* Open handle (use _open() ) */

do
 {
 _BX=readfile;          /* Handle */
 _CX=65024;             /* Count */
 _DX=FP_OFF(buffer);    /* Offset of buffer */
 _DS=FP_SEG(buffer);    /* Segment of buffer */
 _AH=0x3f;
 geninterrupt(0x21);    /* Read */
 __emit__(0x73,2,0x2b,0xc0);    /* Clear AX if error.  This codes to:
                                        jnc over
                                        sub ax,ax
                                    over:
                                */
 _DS=_SS;               /* Restore data segment */

 n=_AX;                 /* Get amount actually read */
 if(!n) break;          /* If we're done */

 _CX=n;
 _BX=writefile;
 _DX=FP_OFF(buffer);
 _DS=FP_SEG(buffer);
 _AH=0x40;
 geninterrupt(0x21);    /* Write */
 _DS=_SS;
 } while(n==65024);
--



Sun, 18 Oct 1992 05:21:35 GMT  
 questions about a backup program for the MS-DOS environment



<<<      - possibly a faster copying scheme.  the following is the
<<<         code I am using to copy from one file to another:
<<<            do
<<<            {  n = fread(buf, sizeof(char), MAXBUF, infile);
<<<               fwrite(buf, sizeof(char), n, outfile);
<<<            } while (n == MAXBUF);        /* where MAXBUF = 7500 */
<<Try:
<<     while ((n = fread(buf, sizeof(char), BUFSIZ, infile)) != 0)
<<                fwrite(buf, sizeof(char), n, outfile);
<<
<<By using BUFSIZ instead of your own buffer length you get a buffer size
<<equal to what the fread and fwrite routines use.  
<No, no, no Yuck!  Don't use the C functions, and don't use such tiny buffers.
<(no wonder it's so slow :-) Try (in small or tiny model):
<    [asm example deleted]

There is no point in going to asm to get high-speed file copies: copying
is inherently disk-bound, so assembly buys you nothing (unless tiny code
size is the goal). Here's a C version that you'll find is as fast as any
asm code for files larger than a few bytes (the trick is to use large
disk buffers):

#include <stdio.h>      /* remove(), SEEK_END */
#include <stdlib.h>     /* malloc(), free() */
#include <fcntl.h>      /* open(), O_RDONLY, O_WRONLY */
#include <io.h>         /* DOS: read(), write(), lseek(), close(), creat() */

/* Compile with -DAfilecopy=1 for file_copy(), or -DAfileappe=1
   for file_append() */
#if Afilecopy
int file_copy(from,to)
#else
int file_append(from,to)
#endif
char *from,*to;
{       int fdfrom,fdto;
        int bufsiz;

        fdfrom = open(from,O_RDONLY,0);
        if (fdfrom < 0)
                return 1;
#if Afileappe
        /* Open R/W by owner, R by everyone else        */
        fdto = open(to,O_WRONLY,0644);
        if (fdto < 0)
        {   fdto = creat(to,0);
            if (fdto < 0)
                goto err;
        }
        else
            if (lseek(fdto,0L,SEEK_END) == -1)  /* to end of file       */
                goto err2;
#else
        fdto = creat(to,0);
        if (fdto < 0)
            goto err;
#endif

        /* Use the largest buffer we can get    */
        for (bufsiz = 0x4000; bufsiz >= 128; bufsiz >>= 1)
        {   register char *buffer;

            buffer = (char *) malloc(bufsiz);
            if (buffer)
            {   while (1)
                {   register int n;

                    n = read(fdfrom,buffer,bufsiz);
                    if (n == -1)                /* if error             */
                        break;
                    if (n == 0)                 /* if end of file       */
                    {   free(buffer);
                        close(fdto);
                        close(fdfrom);
                        return 0;               /* success              */
                    }
                    n = write(fdto,buffer,(unsigned) n);
                    if (n == -1)
                        break;
                }
                free(buffer);
                break;
            }
        }
err2:   close(fdto);
        remove(to);                             /* delete any partial file */
err:    close(fdfrom);
        return 1;

}



Mon, 19 Oct 1992 05:39:58 GMT  
 questions about a backup program for the MS-DOS environment

Quote:
><<By using BUFSIZ instead of your own buffer length you get a buffer size
><<equal to what the fread and fwrite routines use.  
><No, no, no Yuck!  Don't use the C functions, and don't use such tiny buffers.
><(no wonder it's so slow :-) Try (in small or tiny model):
><        [asm example deleted]
>There is no point in going to asm to get high speed file copies. Since it
>is inherently disk-bound, there is no sense (unless tiny code size is
>the goal). Here's a C version that you'll find is as fast as any asm code
>for files larger than a few bytes (the trick is to use large disk buffers):
> [better C example deleted]

I didn't use asm to get the code itself fast.  The only reason I did it was so
that you can use 64K buffers in small/tiny model.  Now if only there was a
farread and farwrite call...

I guess you can just compile the program in large model to have this same
effect (by habit I don't tend to use the large models).

Interestingly, this aspect of the copy program is one place where I think DOS
is sometimes faster than UNIX.  I suspect that many UNIX versions of 'cp' use
block-sized buffers. Doing so makes overly pessimistic assumptions about the
amount of physical memory you're likely to get.  

Of course, since DOS doesn't buffer writes, it often ends up being slower
anyway (it has to seek back to the FAT so often).  'copy *.*' would be much,
much faster if only DOS were just a wee bit smarter...
--



Mon, 19 Oct 1992 15:02:42 GMT  
 questions about a backup program for the MS-DOS environment

Quote:
>Interestingly, this aspect of the copy program [reading and writing very
>large blocks] is one place where I think DOS is sometimes faster than
>UNIX.  I suspect that many UNIX versions of 'cp' use block-sized buffers.
>Doing so makes overly pessimistic assumptions about the amount of
>physical memory you're likely to get.  

None of the newsgroups to which this is posted are particularly suited
to discussions about O/S level optimisation of file I/O, but I feel
compelled to point out that `big gulp' style copying is not always, and
indeed not often, the best way to go about things.  The optimal point
is often not `read the whole file into memory, then write it out of
memory', because this requires waiting for the entire file to come in
before figuring out where to put the new blocks for the output file.
It is better to get computation done while waiting for the disk to transfer
data, whenever this can be done without `getting behind'.  Unix systems
use write-behind (also known as delayed write) schemes to help out here;
writers need use only block-sized buffers to avoid user-to-kernel copy
inefficiencies.

As far as comp.lang.c goes, the best one can do here is call fread()
and fwrite() with fairly large buffers, since standard C provides nothing
more `primitive' or `low-level', nor does it give the programmer a way
to find a good buffer size.  Better stdio implementations will do well
with large fwrite()s, although there may be no way for them to avoid
memory-to-memory copies on fread().  A useful fwrite() implementation
trick goes about like this:

        set resid = number of bytes to write;
        set p = base of bytes to write;
        while (resid) {
                if (there is stuff in the output buffer ||
                    resid < output_buffer_size) {
                        n = MIN(resid, space_in_output_buffer);
                        move n bytes from p to buffer;
                        p += n;
                        resid -= n;
                        if (buffer is full)
                                if (fflush(output_file)) goto error;
                } else {
-->                  write output_buffer_size bytes directly;
                        if this fails, goto error;
                        p += n_written;
                        resid -= n_written;
                }
        }

The `trick' is in the line marked with the arrow --> : there is no
need to copy bytes into an internal buffer just to write them, at
least in most systems.  (Some O/Ses may `revoke' access to pages that
are being written to external files.)
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)



Tue, 20 Oct 1992 15:21:03 GMT  
 questions about a backup program for the MS-DOS environment

Quote:


>>Interestingly, this aspect of the copy program [reading and writing very
>>large blocks] is one place where I think DOS is sometimes faster than
>>UNIX.  I suspect that many UNIX versions of 'cp' use block-sized buffers.
>>Doing so makes overly pessimistic assumptions about the amount of
>>physical memory you're likely to get.  
>The optimal point
>is often not `read the whole file into memory, then write it out of
>memory', because this requires waiting for the entire file to come in
>before figuring out where to put the new blocks for the output file.
>It is better to get computation done while waiting for the disk to transfer
>data, whenever this can be done without `getting behind'.  Unix systems
>use write-behind (also known as delayed write) schemes to help out here;
>writers need use only block-sized buffers to avoid user-to-kernel copy
>inefficiencies.

On big, loaded systems this is certainly true, since you want full use of
'elevator' disk optimizing between multiple users.  This should be the normal
mode of operation.

The problem with this on smaller UNIX systems is that the disk interleave
will be missed unless there is very intelligent read-ahead. If
you're lucky enough to have all your memory paged in, one read call may, if
the system is designed right, read in contiguous sets of blocks without
missing the interleave.

For things like backups you usually want to tweak it a bit since this
operation is slow and can usually be done when no one else is on the system.  

Also, for copying to tapes and raw disks, 'cp' is usually very bad.  I think
dd can be used to transfer large sets of blocks.  On one system I know of, if
you 'cp' between two raw floppy devices, the floppy lights will blink on and
off for each sector.  Also, you have to be careful about what is buffered and
what isn't, and what happens when you mix the two.
--



Tue, 20 Oct 1992 18:27:29 GMT  
 questions about a backup program for the MS-DOS environment



   >> [...]
   >>      - possibly a faster copying scheme.  the following is the
   >>         code I am using to copy from one file to another:
   >>
   >>            do
   >>            {
   >>               n = fread(buf, sizeof(char), MAXBUF, infile);
   >>               fwrite(buf, sizeof(char), n, outfile);
   >>            } while (n == MAXBUF);        /* where MAXBUF = 7500 */
   >>
   >Try:
   >     while ((n = fread(buf, sizeof(char), BUFSIZ, infile)) != 0)
   >         fwrite(buf, sizeof(char), n, outfile);
   >
   >By using BUFSIZ instead of your own buffer length you get a buffer size
   >equal to what the fread and fwrite routines use.  

   No, no, no Yuck!  Don't use the C functions, and don't use such tiny buffers.
   (no wonder it's so slow :-)

   Try (in small or tiny model):

        ... 30 lines of *heavily* machine dependent C ...

To suggest replacing 5 (or 2) lines of working C that will run on
anything that runs C with 30 lines of `assembly code' that runs only
on a PC, with a specific memory model and C compiler is lunacy.

Especially as it is completely unnecessary.

The most important things are:

        + the buffer size
        + avoiding needless copying

The bigger the buffer, the less time you go round the loop.  I would
suggest using the open/read/write/close functions instead of stdio.h
for copying files.  This is because stdio does its own buffering:

        input -> infile's buffer -> your buf -> outfile's buffer -> output

with read/write *you* do the buffering:

        input -> your buffer -> output

Use a large buffer, preferably one whose length is a multiple of the
disk's block size.  It will go as fast as the 30 line wonder.  And if
it doesn't work, you stand a chance of debugging it.
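
A minimal sketch of such a copy loop (hedged: POSIX-style calls,
abbreviated error handling, and the 16K size is illustrative):

    #include <fcntl.h>
    #include <unistd.h>

    #define BUFLEN (16 * 1024)   /* pick a multiple of the block size */

    int copy_file(const char *from, const char *to)
    {
        static char buf[BUFLEN];
        int in, out;
        ssize_t n;

        if ((in = open(from, O_RDONLY)) < 0)
            return -1;
        if ((out = open(to, O_WRONLY | O_CREAT | O_TRUNC, 0644)) < 0) {
            close(in);
            return -1;
        }
        while ((n = read(in, buf, BUFLEN)) > 0)
            if (write(out, buf, (size_t)n) != n)
                break;              /* write error / disk full */
        close(in);
        return (close(out) == 0 && n == 0) ? 0 : -1;
    }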



Tue, 20 Oct 1992 16:36:24 GMT  
 questions about a backup program for the MS-DOS environment

Quote:


>>Interestingly, this aspect of the copy program [reading and writing very
>>large blocks] is one place where I think DOS is sometimes faster than
>>UNIX.  I suspect that many UNIX versions of 'cp' use block-sized buffers.
>>Doing so makes overly pessimistic assumptions about the amount of
>>physical memory you're likely to get.  

>...`big gulp' style copying is not always, and
>indeed not often, the best way to go about things...  Unix systems
>use write-behind (also known as delayed write) schemes to help out here;
>writers need use only block-sized buffers to avoid user-to-kernel copy
>inefficiencies.

Indeed.  The DOS implementation of cp is only apparently "better"
because it is doing something explicitly which the Unix program
has no need for.  Unaided DOS has no write-behind or read-ahead
(and very little caching), and programs that do 512-byte or 1K
reads and writes (including, tragically, most programs using
stdio) run abysmally slowly.

Using stdio is supposed to be the "right" thing to do; the stdio
implementation should worry about things like correct block
sizes, leaving these unnecessary system-related details out of
application programs.  (Indeed, BSD stdio does an fstat to pick a
buffer size matching the block size of the underlying
filesystem.)  If a measly little not-really-an-operating-system
like DOS must be used at all, a better place to patch over its
miserably simpleminded I/O "architecture" would be inside stdio,
which (on DOS) should use large buffers (up around 10K) if it can
get them, certainly not 512 bytes or 1K.  Otherwise, every
program (not just backup or cp) potentially needs to be making
explicit, system-dependent blocksize choices.  cat, grep, wc,
cmp, sum, strings, compress, etc., etc., etc. all want to be able
to read large files fast.  (The versions of these programs that I
have for DOS all run unnecessarily slowly, because the stdio
package they are written in terms of is doing pokey little 512
byte reads.  I refuse to sully all of those programs with
explicit blocksize notions.  Sooner or later I have to stop using
the vendor's stdio implementation and start using my own, so I
can have it do I/O in bigger chunks.)
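
(On systems that have it, the BSD trick looks roughly like this hedged
sketch, assuming st_blksize is available:

    #include <stdio.h>
    #include <sys/stat.h>

    /* pick an I/O buffer size matching the underlying filesystem */
    long iosize(int fd)
    {
        struct stat st;

        if (fstat(fd, &st) == 0 && st.st_blksize > 0)
            return (long)st.st_blksize;
        return BUFSIZ;          /* fall back to the stdio default */
    }

DOS stdio, of course, has no filesystem block size to consult.)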


Quote:
>Also, for copying to tapes and raw disks, 'cp' is usually very bad.  I think
>dd can be used to transfer large sets of blocks.  On one system I know of, if
>you 'cp' between two raw floppy devices, the floppy lights will blink on and
>off for each sector.

A certain amount of "flip-flopping" like this is inevitable under
vanilla Unix, at least when using raw devices, since there is no
notion of asynchronous I/O: the system call for reading is active
until it completes, during which time the write call is inactive,
and the reader is similarly idle while writing is going on.
Graham Ross once proposed a clever "double-buffered" device copy
program which forked, resulting in two processes, sharing input
and output file descriptors, and synchronized through a semaphore
so that one was always actively reading while the other one was
writing.  (This trick is analogous to the fork cu used to do to
have two non-blocking reads pending.)  It was amazing to watch a
tape-to-tape copy via this program between high-throughput, 6250
bpi tape drives: both tapes would spin continuously, without
pausing.  (cp or dd under the same circumstances resulted in
block-at-a-time pauses while the writing drive waited for the
reader and vice versa.)
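
In skeleton form (a hedged reconstruction, not Graham's actual code:
pipes stand in for the semaphore, and all names and sizes are invented):

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/wait.h>

    #define BLK 65536

    static void post(int fd) { char c = 0; write(fd, &c, 1); }
    static int  take(int fd) { char c; return read(fd, &c, 1) == 1; }

    /* Copy in->out, alternating with a twin process.  A "read turn"
       token serializes the reads (and keeps them in order, since the
       processes share one file offset); a "write turn" token does the
       same for the writes.  Each process reads its next block while
       the other one is writing. */
    static void copier(int in, int out, int rwait, int rpost,
                                        int wwait, int wpost)
    {
        static char buf[BLK];
        ssize_t n;

        for (;;) {
            if (!take(rwait))
                break;
            n = read(in, buf, BLK);
            post(rpost);            /* twin may start its read  */
            if (!take(wwait))
                break;
            if (n <= 0) {
                post(wpost);        /* EOF: release twin, quit  */
                break;
            }
            write(out, buf, n);
            post(wpost);            /* twin may start its write */
        }
    }

    int main(int argc, char **argv)
    {
        int in, out, r1[2], r2[2], w1[2], w2[2];

        if (argc != 3)
            return 1;
        in  = open(argv[1], O_RDONLY);
        out = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (in < 0 || out < 0)
            return 1;
        pipe(r1); pipe(r2); pipe(w1); pipe(w2);
        post(r1[1]); post(w1[1]);   /* parent gets the first turns */
        if (fork() == 0) {
            copier(in, out, r2[0], r1[1], w2[0], w1[1]);
            _exit(0);
        }
        copier(in, out, r1[0], r2[1], w1[0], w2[1]);
        wait(NULL);
        return 0;
    }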

                                            Steve Summit



Wed, 21 Oct 1992 12:35:03 GMT  
 questions about a backup program for the MS-DOS environment
In Microsoft C (and some others), one can use "setvbuf" to attach a large
I/O buffer to a stdio-package file.

It's sensible to wish for the operating system to do this itself, but in
the DOS world, given the 640K memory limit, it's not totally unreasonable to
place the memory allocation burden on the application program (since it
probably has to worry about tight memory).

Using "setvbuf" makes a biiiiiig difference in file I/O performance.
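
For example (a small sketch; the 16K size is illustrative, and setvbuf
must be called after fopen but before any other operation on the stream):

    #include <stdio.h>

    #define IOBUF 16384

    FILE *open_buffered(const char *name, const char *mode)
    {
        FILE *fp = fopen(name, mode);

        /* NULL buffer: let the library allocate one; _IOFBF asks for
           full buffering.  On failure the default buffer is simply
           kept, so the call is safe to ignore. */
        if (fp != NULL)
            setvbuf(fp, NULL, _IOFBF, IOBUF);
        return fp;
    }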



Thu, 22 Oct 1992 08:21:52 GMT  