Efficient use of file handles in external sort 
 Efficient use of file handles in external sort

I am writing an external sort routine.  That is, I am writing a
routine that sorts a data set too large to fit in memory at once.  For
those of you not familiar with the technique, this is generally done
in two steps:

1) Read as much as will fit into memory and sort that, then write the
results to a temporary file.  (FWIW, for this step, I'm using Knuth's
Algorithm 5.4.1R, "Replacement Selection.")

2) Merge several of the temporary files into another, longer temporary
file.  Continue in this way until all of the files are merged and the
end result is a fully ordered file.
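
For concreteness, a minimal sketch of the merge in step 2 might look
something like the following; the fixed record size, the bytewise memcmp()
ordering, the limit of 8 runs, and the omission of error handling are all
assumptions for illustration, not the poster's actual code:

#include <stdio.h>
#include <string.h>

#define RECORD_SIZE 16          /* assumed fixed record size */
#define MAX_RUNS    8           /* never merging more than 8 runs at once */

/* Merge n sorted run files into out by repeatedly copying the smallest
   front record.  A linear scan for the minimum is fine for small n; a
   heap (tournament tree) would be better for large merge orders. */
static void merge_runs(FILE **runs, int n, FILE *out)
{
    char front[MAX_RUNS][RECORD_SIZE];  /* front record of each run */
    int  live[MAX_RUNS];                /* nonzero while a run has records */
    int  i;

    for (i = 0; i < n; i++)
        live[i] = (fread(front[i], RECORD_SIZE, 1, runs[i]) == 1);

    for (;;) {
        int min = -1;
        for (i = 0; i < n; i++)
            if (live[i]
                && (min < 0 || memcmp(front[i], front[min], RECORD_SIZE) < 0))
                min = i;
        if (min < 0)
            break;                      /* every run is exhausted */
        fwrite(front[min], RECORD_SIZE, 1, out);
        live[min] = (fread(front[min], RECORD_SIZE, 1, runs[min]) == 1);
    }
}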

Now, step 1 was pretty simple once I figured out how to implement the
algorithm.  For that matter, step 2 is pretty simple, except for one
nagging problem with file handles.

        The problem is this: Before the process began, I allocated
essentially all available memory[1] for an input buffer for steps 1
and 2 to deal with.  So when I want to open my temporary files, where
does fopen() get its FILE from?  Some implementations (perhaps many)
of the C library will use malloc() for this purpose, and it cannot be
assumed that there is any heap space available.

        I have come up with a couple possible solutions for this, but
none of them seem completely satisfactory:

            1) "Pre-allocate" file handles by opening as many file
    handles as are likely to be needed in advance.

    One problem with this is that there is no guarantee that it will
    help; what if, for instance, an implementation uses xstrdup() on
    the filename passed to fopen--this would cause filenames of
    different lengths to exhibit different heap usage.  Also, it
    wastes a lot of time to create files in the common case that only
    one temporary file is needed because the data indeed all fits in
    memory (I suppose I could special-case this later).

            2) Open file handles until fopen() returns failure.  Then
    free up some input records (reducing input buffer capacity) and
    retry fopen() until it returns success.

    There are numerous objections to this.  For instance, how do I
    distinguish lack of memory for file handles from lack of disk
    space?  And if I free too many input records, then the order of
    merge (the number of files merged in each pass) must be reduced,
    which screws up the rest of the code because of the particular
    merge pattern in use (Huffman coding).

            3) I can assume that I'm working on a virtual memory
    machine and go ahead and blindly open as many files as I want.
    Although I've probably/possibly exhausted most/all of the physical
    memory of the machine, the FILEs shouldn't add very _much_ more--I
    don't open more than 8 at any one time--and thus the thrashing
    shouldn't be very bad even if I do use up all the physical memory.
    I just came up with this one as I'm typing, so it isn't very well
    thought through--feel free to rip it apart.

One thing to do would be to examine a few existing implementations of
the C library.  Do most or all guarantee that at least FOPEN_MAX file
handles can be opened at once?  Sadly, the answer is no: for instance
the GNU C library allocates file handles on the fly with malloc().
But the "pre-allocation" method 1 would work with glibc because it
never discards file handles, just invalidates them.  OTOH, looking at
the Borland C++ 4.0 library, that library keeps all FILEs in a static
array, so no special solution is necessary for BC++.

Anyone got any hints?  Also, pointers to existing external sort
implementations are welcome--I couldn't find any on the Web, believe
it or not.

--

PGP public key and home page at http://www.msu.edu/user/pfaffben

[1] The amount of memory that the program is allowed to malloc() is a
    compile-time #define.  I'm not calling malloc() until it returns
    NULL in order to grab _all_ memory; that would be gruesome on
    a virtual memory machine.  OTOH, the algorithm should still work
    okay if less memory than the user specified at compile time is
    available.



Thu, 17 Jun 1999 03:00:00 GMT  
 Efficient use of file handles in external sort



| I am writing an external sort routine.  That is, I am writing a
| routine that sorts a data set too large to fit in memory at once.

[ problem: too many file handles ... ]

|         I have come up with a couple possible solutions for this, but
| none of them seem completely satisfactory:

[ ... ]

|             3) I can assume that I'm working on a virtual memory
|     machine and go ahead and blindly open as many files as I want.
|     Although I've probably/possibly exhausted most/all of the physical
|     memory of the machine, the FILEs shouldn't add very _much_ more--I
|     don't open more than 8 at any one time--and thus the thrashing
|     shouldn't be very bad even if I do use up all the physical memory.
|     I just came up with this one as I'm typing, so it isn't very well
|     thought through--feel free to rip it apart.

If your OS supports it, you could possibly use the mmap() function.
This function simply maps a file (region) to an address in the virtual
memory space. One of mmap's flags tells it to carry changes through to
the file itself rather than to ordinary swap space (on my machine,
that's the MAP_SHARED flag).

Basically, the idea is this:

int   fd = open("your_huge_file", O_RDWR);
void  *m = mmap(NULL, fsiz, PROT_READ|PROT_WRITE, MAP_SHARED, fd, offset);

where 'offset' is an offset somewhere in your file (possibly 0, and it must
be a multiple of the page size) and fsiz is the size of the region to be
mapped; open() and mmap() are declared in <fcntl.h> and <sys/mman.h>. If the
mapping succeeds, the function returns an address somewhere in memory which
can be used at will; on failure it returns MAP_FAILED ...

After you're done with the mapped region, you can unmap it again as
follows:

munmap(m, fsiz);

BTW, if you're using WNT (or another MS Windows variant), have a look at
the functions 'CreateFileMapping' and 'MapViewOfFile'; they're said to be
functionally equivalent to the mmap() function ...

I hope this is not too specific ...

[it is, but I figure I'll allow this; however, please note followups. -mod]

kind regards,




Sat, 19 Jun 1999 03:00:00 GMT  
 Efficient use of file handles in external sort

Quote:

> I am writing an external sort routine.  That is, I am writing a
> routine that sorts a data set too large to fit in memory at once.  For
> those of you not familiar with the technique, this is generally done
> in two steps:

[ snip description of external sort algorithm ]

Quote:

>         The problem is this: Before the process began, I allocated
> essentially all available memory for an input buffer for steps 1
> and 2 to deal with.  So when I want to open my temporary files, where
> does fopen() get its FILE from?  Some implementations (perhaps many)
> of the C library will use malloc() for this purpose, and it cannot be
> assumed that there is any heap space available.

[ snip various ideas about avoiding or deferring malloc(2)s ]

Quote:

> Anyone got any hints?  Also, pointers to existing external sort
> implementations are welcome--I couldn't find any on the Web, believe
> it or not.

What occurs immediately to me is: if you are concerned about the space
taken up by FILE structures, perhaps you shouldn't be using buffered
i/o.  If you instead use unbuffered i/o (creat, open, read, write,
close, and friends), you will only need file descriptors, which
are generally ints and therefore smaller than FILEs.  On my machine,
the size of a FILE structure is 32 bytes and an int is 4 bytes, so
there would be a significant savings in space when the number
of files becomes large.  On a related note, many UNIX implementations
impose a limit on the number of files one process may have open at
any one time.  This finite limit may help you determine the upper
limit on how much memory you will need for file manipulation and how
much is left for your external sort.
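
As a rough sketch, writing one run with the unbuffered calls might look
something like this on a POSIX-style system; the function name, flags, and
permissions are illustrative assumptions, not part of the original
suggestion:

#include <fcntl.h>
#include <unistd.h>

/* Write one sorted run to a file using only a descriptor and a buffer
   supplied by the caller, so stdio never gets a chance to malloc(). */
static int write_run(const char *path, const char *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            close(fd);
            return -1;
        }
        buf += n;
        len -= (size_t) n;
    }
    return close(fd);
}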

Of course, this is just off the top of my head and I am almost certain
there's a good reason for ignoring this post, but I can't think of it:)

regards,
ejo

--

Disclaimer: You know the drill.. all opinions are mine.. blah blah blah.
(512) 838-2622         Cognito Ergo Disclaimum              T/L 678-2622



Sat, 19 Jun 1999 03:00:00 GMT  
 Efficient use of file handles in external sort

Quote:

> I am writing an external sort routine.  That is, I am writing a
> routine that sorts a data set too large to fit in memory at once.  For
> those of you not familiar with the technique, this is generally done
> in two steps:

> 1) Read as much as will fit into memory and sort that, then write the
> results to a temporary file.  (FWIW, for this step, I'm using Knuth's
> Algorithm 5.4.1R, "Replacement Selection.")

> 2) Merge several of the temporary files into another, longer temporary
> file.  Continue in this way until all of the files are merged and the
> end result is a fully ordered file.

> Now, step 1 was pretty simple once I figured out how to implement the
> algorithm.  For that matter, step 2 is pretty simple, except for one
> nagging problem with file handles.

>         The problem is this: Before the process began, I allocated
> essentially all available memory[1] for an input buffer for steps 1
> and 2 to deal with.  So when I want to open my temporary files, where
> does fopen() get its FILE from?  Some implementations (perhaps many)
> of the C library will use malloc() for this purpose, and it cannot be
> assumed that there is any heap space available.

In which case, fopen will fail.

The situation is worse than you seem to realize.  In most
implementations I am aware of, the first write to the newly opened file
will trigger a malloc.  In most cases, if the malloc fails, the write
will succeed anyway, but all further output will be unbuffered, i.e.,
you will get a system call per character of output.

Quote:
>         I have come up with a couple possible solutions for this, but
> none of them seem completely satisfactory:

>             1) "Pre-allocate" file handles by opening as many file
>     handles as are likely to be needed in advance.

>     One problem with this is that there is no guarantee that it will
>     help; what if, for instance, an implementation uses xstrdup() on
>     the filename passed to fopen--this would cause filenames of
>     different lengths to exhibit different heap usage.  Also, it
>     wastes a lot of time to create files in the common case that only
>     one temporary file is needed because the data indeed all fits in
>     memory (I suppose I could special-case this later).

That is probably an unlikely occurrence, although not impossible.  This
is the approach I would take, with the FILE*s as static variables *AND*
static buffers for each file.  Before using each file, I would also do a
setbuf so that the FILE used the static buffer.
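
A minimal sketch of that arrangement, using setvbuf() (the standard
generalization of setbuf()); the array sizes and names below are
assumptions:

#include <stdio.h>

#define MAX_RUNS  8        /* assumed maximum merge order        */
#define RUN_BUFSZ 8192     /* assumed per-file stdio buffer size */

static FILE *run_file[MAX_RUNS];
static char  run_buf[MAX_RUNS][RUN_BUFSZ];

/* Open run file i and attach a static buffer to it, so that stdio never
   has to malloc() one on the first read or write. */
static FILE *open_run(int i, const char *name, const char *mode)
{
    run_file[i] = fopen(name, mode);
    if (run_file[i] != NULL)
        setvbuf(run_file[i], run_buf[i], _IOFBF, RUN_BUFSZ);
    return run_file[i];
}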

This results in a significant amount of static memory being used, which
in turn reduces the amount of memory available on the heap, and thus the
number of objects which will fit in memory at one time.  Typically, I do
not think that this will make a significant difference in the total
time, but it is possible to imagine degenerate cases where it will cause
an additional pass.

Quote:
>             2) Open file handles until fopen() returns failure.  Then
>     free up some input records (reducing input buffer capacity) and
>     retry fopen() until it returns success.

>     There are numerous objections to this.  For instance, how to
>     distinguish lack of memory for file handles from lack of disk
>     space?  And if I free too many input records, then the order of
>     merge (the number of files merged in each passed) must be reduced,
>     which screws up the rest of the code because of the particular
>     merge pattern in use (Huffman coding).

And of course, there is the fact that the first write to each file will
try to allocate a buffer.  This is probably a killer argument, since there
is no guarantee that the records you free will be contiguous, or that the
system will be able to use them for a buffer.

Quote:
>             3) I can assume that I'm working on a virtual memory
>     machine and go ahead and blindly open as many files as I want.
>     Although I've probably/possibly exhausted most/all of the physical
>     memory of the machine, the FILEs shouldn't add very _much_ more--I
>     don't open more than 8 at any one time--and thus the thrashing
>     shouldn't be very bad even if I do use up all the physical memory.
>     I just came up with this one as I'm typing, so it isn't very well
>     thought through--feel free to rip it apart.

It's obviously not portable, as you say ("assume ... virtual memory").
Independently of the possible thrashing, a certain number of Unixes seem
to start acting funny when all of the physical memory is used.  (Some
versions of Solaris, for example, hang for a few minutes before
straightening themselves out and continuing normally.  And I seem to
recall hearing of one that would send SIGKILLs to random processes,
although that might have been in conjunction with another problem.)

Quote:
> One thing to do would be to examine a few existing implementations of
> the C library.  Do most or all guarantee that at least FOPEN_MAX file
> handles can be opened at once?  Sadly, the answer is no: for instance
> the GNU C library allocates file handles on the fly with malloc().
> But the "pre-allocation" method 1 would work with glibc because it
> never discards file handles, just invalidates them.  OTOH, looking at
> the Borland C++ 4.0 library, that library keeps all FILEs in a static
> array, so no special solution is necessary for BC++.

Most of the libraries I've seen use the Borland strategy.  In which
case, the only real problem is the buffers; declare them statically and
use setbuf, and you are in business.

Of course, the standard doesn't really guarantee anything here, as far
as I can see.  

Quote:
> Anyone got any hints?  Also, pointers to existing external sort
> implementations are welcome--I couldn't find any on the Web, believe
> it or not.

I would imagine that Linux comes with the sources to a version of the
Unix sort program.

--

GABI Software, Sarl., 8 rue des Francs Bourgeois, 67000 Strasbourg, France
Conseils en informatique industrielle --
                            -- Beratung in industrieller Datenverarbeitung



Sat, 19 Jun 1999 03:00:00 GMT  
 Efficient use of file handles in external sort

: I am writing an external sort routine.  That is, I am writing a
: routine that sorts a data set too large to fit in memory at once.  For
: those of you not familiar with the technique, this is generally done
: in two steps:

: 1) Read as much as will fit into memory and sort that, then write the
: results to a temporary file.  (FWIW, for this step, I'm using Knuth's
: Algorithm 5.4.1R, "Replacement Selection.")

: 2) Merge several of the temporary files into another, longer temporary
: file.  Continue in this way until all of the files are merged and the
: end result is a fully ordered file.

: Now, step 1 was pretty simple once I figured out how to implement the
: algorithm.  For that matter, step 2 is pretty simple, except for one
: nagging problem with file handles.

:         The problem is this: Before the process began, I allocated
: essentially all available memory[1] for an input buffer for steps 1
: and 2 to deal with.  So when I want to open my temporary files, where
: does fopen() get its FILE from?  Some implementations (perhaps many)
: of the C library will use malloc() for this purpose, and it cannot be
: assumed that there is any heap space available.

Why are you asking for such a huge amount of memory? At the bottom of
your post you said that a smaller amount of memory wouldn't break the
algorithm. So just ask for less, like five or ten megs. If your OS
supports virtual memory, then this shouldn't be a problem. Then you can
see if your algorithm works at all without crashing the process on an
out-of-memory error. You don't _need_ to get all of the memory for the
algorithm (from my perception of it). Although if you are getting the
data from a tape and it's big, like 40 gigs or something, I can see why
you would want to get a lot of it into memory at once. Reading from a
hard drive is quite fast, though, so if the data is on disk, who cares
whether you read it ten times or just once? Either way, if you run out
of memory, ask for less and accept the time lost reading from the media.
The result will be that you have memory left for the file descriptors.
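
For what it's worth, "asking for less" can also be done adaptively: start
at the configured ceiling and halve the request until malloc() succeeds,
leaving the rest of the heap for the library. A minimal sketch, with
made-up sizes and names:

#include <stdlib.h>

#define WANT_BYTES (8UL * 1024 * 1024)  /* assumed configured ceiling     */
#define MIN_BYTES  (64UL * 1024)        /* assumed smallest useful buffer */

/* Grab the largest buffer we can get without insisting on all of it,
   so that some heap is left over for FILEs and stdio buffers. */
static void *alloc_sort_buffer(size_t *got)
{
    size_t size = WANT_BYTES;
    void  *p;

    while ((p = malloc(size)) == NULL && size > MIN_BYTES)
        size /= 2;
    *got = (p != NULL) ? size : 0;
    return p;
}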

:         I have come up with a couple possible solutions for this, but
: none of them seem completely satisfactory:

:             1) "Pre-allocate" file handles by opening as many file
:     handles as are likely to be needed in advance.

:     One problem with this is that there is no guarantee that it will
:     help; what if, for instance, an implementation uses xstrdup() on
:     the filename passed to fopen--this would cause filenames of
:     different lengths to exhibit different heap usage.  Also, it
:     wastes a lot of time to create files in the common case that only
:     one temporary file is needed because the data indeed all fits in
:     memory (I suppose I could special-case this later).

Whoa. I don't think you ever need to do this, for anything. You should
probably only need 1 or 2. Make one little (5-10 meg) sorted file
with a known, predictable file name, close the descriptor, and load in
the next segment to sort. Then open a _new_ file to put the results in.
From your explanation of the steps necessary to complete the external
sort, you would need an incredible amount of space to store all of the
intermediate files. (I could be wrong on this because I've never
implemented the particular algorithm you describe.)

:             2) Open file handles until fopen() returns failure.  Then
:     free up some input records (reducing input buffer capacity) and
:     retry fopen() until it returns success.

:     There are numerous objections to this.  For instance, how to
:     distinguish lack of memory for file handles from lack of disk
:     space?  And if I free too many input records, then the order of
:     merge (the number of files merged in each passed) must be reduced,
:     which screws up the rest of the code because of the particular
:     merge pattern in use (Huffman coding).

Is this sorting method anything like 'merge sort', where the data that is
broken up is written to disk instead of kept in memory, and then all of it
is merged together in the final stages? I can't really answer this without
understanding the need for so many file descriptors.

:             3) I can assume that I'm working on a virtual memory
:     machine and go ahead and blindly open as many files as I want.
:     Although I've probably/possibly exhausted most/all of the physical
:     memory of the machine, the FILEs shouldn't add very _much_ more--I
:     don't open more than 8 at any one time--and thus the thrashing
:     shouldn't be very bad even if I do use up all the physical memory.
:     I just came up with this one as I'm typing, so it isn't very well
:     thought through--feel free to rip it apart.

If you can assume that you are working on a virtual memory machine and
it is UNIX, I have a solution for you: use a handy little function
called 'mmap()'. This function allows you to map a file into memory,
and the changes that you make to the memory are reflected in the file on
the drive (this is settable by flags passed to mmap()). On
a 32-bit machine you can get close to a 4 gig limit (I think) on file
size. If the file is on tape, load chunks of it onto a hard drive to
deal with. Then just open the file and pass the file descriptor to mmap().
Then you will have a way to sort the data (you might have to do some
type-casting magic on the memory) and it will be automagically
sorted on the drive. No file I/O needed; the kernel does it for you,
transparently. Also, you wouldn't need temporary files taking up
space on the device. For data files bigger than 4 gigs, I don't know
how well this would work; I'm sure it can be adapted, though. The nice
aspect of this method is that the file is sorted _in place_ on the device.
Just try to use less than physical memory if you can; who cares about
500K that isn't used for the sort but is needed for other memory
allocations.

: One thing to do would be to examine a few existing implementations of
: the C library.  Do most or all guarantee that at least FOPEN_MAX file
: handles can be opened at once?  Sadly, the answer is no: for instance
: the GNU C library allocates file handles on the fly with malloc().
: But the "pre-allocation" method 1 would work with glibc because it
: never discards file handles, just invalidates them.  OTOH, looking at
: the Borland C++ 4.0 library, that library keeps all FILEs in a static
: array, so no special solution is necessary for BC++.

: Anyone got any hints?  Also, pointers to existing external sort
: implementations are welcome--I couldn't find any on the Web, believe
: it or not.

Remember, I don't know the exact algorithm you describe, so I could be
completely and utterly wrong in my suggestions. I hope I helped some
though, and that my prose wasn't too bad :).

My $0.02 worth.

-pete

P.S. I was just about to post this when a friend of mine said that
another database of keys only, with indexes into the real database,
could be made. Then, after sorting just the keys, you can reorder
the real database. The data is moved only once. If each key represents
a lot of data, this is way more efficient; otherwise, it isn't.

Take your pick, I guess.

-----------------------------------------------------------------------------
Any and all ideas, words, and concepts are mine and no one else's. So,
unfortunately, I take responsibility for them.



Sat, 19 Jun 1999 03:00:00 GMT  
 Efficient use of file handles in external sort

The first question is: do you need a lot of file handles?  An external
merge sort should only need three at a time.

Any method of pre-allocating memory is likely to be a little messy.
Probably the least messy would be:
  1.  Allocate a small (reserve) buffer.
  2.  Allocate all available memory into sort buffers.
  3.  Free the reserve buffer.
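
A sketch of those three steps, with illustrative sizes and names that are
not part of the suggestion:

#include <stdlib.h>

#define RESERVE_BYTES (32UL * 1024)  /* assumed headroom kept for the library */
#define CHUNK_BYTES   (64UL * 1024)  /* assumed sort-buffer chunk size        */

/* Grab chunks until malloc() fails, but hold back a small reserve first
   so that freeing it afterwards leaves the library something to work
   with.  Returns the number of chunks stored in chunk[]. */
static size_t alloc_sort_chunks(void **chunk, size_t max_chunks)
{
    void  *reserve = malloc(RESERVE_BYTES);              /* step 1 */
    size_t n = 0;

    while (n < max_chunks && (chunk[n] = malloc(CHUNK_BYTES)) != NULL)
        n++;                                             /* step 2 */
    free(reserve);                                       /* step 3 */
    return n;
}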

Alternatively you might consider making the maximum memory to use a
variable which can be set (say with a command-line argument).  You can
then fine-tune it without a recompile.

 [ snip full quote of the original post ]

--
#####################################################################

Emmenjay Consulting                
PO Box 909, Kensington, 2033, AUSTRALIA               +61 2 9667 1582



Sat, 19 Jun 1999 03:00:00 GMT  
 Efficient use of file handles in external sort

The problem is efficient use of memory.  Don't allocate everything you
can lay your hands on; leave some for the library to work with.
It's true that by using bigger buffers you can to some extent reduce
I/O in a sort, but this is only true up to a point.  At some point you
start to cause paging I/O, and that is just as expensive as
file I/O.  (Depending on the architecture it may even be more
expensive.)

Anyway, for an external sort you only need two file handles.
One is used to read the input file (or files, one at a time)
and to write the output file.  The other is used to access the sort
work file.  The first input and work file opens should be done
before you allocate the tournament buffers anyway.  You may be
able to use an implementation-specific routine to estimate
how much memory would be useful based on the size of the input file.



Mon, 21 Jun 1999 03:00:00 GMT  
 Efficient use of file handles in external sort

There are many good suggestions in this response, from which I will
mostly skip quoting, but unfortunately most of them don't seem to
apply.  There is one item, however, that I would like to reply
directly to:

Quote:

> P.S. I was just about to post this when a friend of mine said that
> another database of keys only with indexes into the real database
> could be made. Then after sorting just the keys, you can reorder
> the real database. The data is moved only once. If the key represents
> a lot of data, this is way more efficient, otherwise, it isn't.

Actually, Knuth shows in section 5.4.9 of his book _The Art of Computer
Programming, Vol. 3: Sorting and Searching_ that keysorting doesn't
help.  He shows that, if there are N records, with B per block, and
internal memory can hold M records at a time, then at least
N*log2(B)/M block-reading operations are necessary to rearrange the
records, where log2() is the binary logarithm function.
--

PGP public key and home page at http://www.msu.edu/user/pfaffben


Mon, 21 Jun 1999 03:00:00 GMT  
 Efficient use of file handles in external sort

 > >
 > > Anyone got any hints?  Also, pointers to existing external sort
 > > implementations are welcome--I couldn't find any on the Web, believe
 > > it or not.
 > >
 >
 > What occurs immediately to me is: if you are concerned about the space
 > taken up by FILE structures, perhaps you shouldn't be using buffered
 > i/o.  If you instead use unbuffered i/o ( creat, open, read, write,
 > close and friends ), you will only need to use file descriptors which
 > in general are int's and thereby smaller than FILE's.  On my machine,
 > the size of a FILE structure is 32 bytes, and an int is 4 bytes, so
 > there would be a significant savings in terms of space when the number
 > of files becomes large.  On a related note, many UNIX implementations
 > impose a limit on the number of files one processes may have open at
 > any one time.  This finite limit may assist you in determining the top
 > limit of how much memory you will need for file manipulations and how
 > much is left for your external sort.

Several people have suggested this, and I think it sounds on the whole
like a good idea.  But I have one additional question: do most
implementations support the unbuffered i/o functions?  They are not
part of the ANSI standard, so is it safe in general to make use of
them?  If so, then this may be the path to take.
--

PGP public key and home page at http://www.msu.edu/user/pfaffben



Mon, 21 Jun 1999 03:00:00 GMT  
 