Some Expert level complications, Re: To accept input from PIPE by a program 
Author Message
 Some Expert level complications, Re: To accept input from PIPE by a program

Some questions of linked list to manage memory are purely C/C++ oriented
so I post this in comp.lang.c(++).



Quote:

> > Now I applied this solution to a file whose size is not specified.
> > The original working program would open the file specified in the
> > argv[1] and get the size by the usual fseek(SEEK_SET) method and
> > then ftell to get the size. However, this does not work with stdin.

>     There's nothing special about stdin that prevents this from
> working; the problem is with the source from which stdin reads its
> data.  Some sources (e.g., files) have well-defined sizes and are
> seekable.  Some (e.g., pipes, sockets, keyboards, ...) do not.

Is there a way to set up a pipe that feeds the data into the command_n
to hold all the output from the previous command_(n-1)in a big enough
pipe so that command_n can seek in it?

Quote:
> > I have been able to get the size from stdin using a loop with fgetc
> > till EOF. But then it is not able to move back. I assume that you
> > have to do fgetc and store the result as you go. This means that
> > you have to allocate memory and reallocate it if your size is
greater.

>     That's one way.  Another is to allocate a "chunk" of a convenient
> size, fill it with data from the file, and allocate additional chunks
> as needed, keeping track of the list of chunks somehow (perhaps by
> chaining them together in a linked list).  The rest of the program
> no longer has a simple array to deal with, but that may be
unimportant.

>     The "best" way, of course, is to figure out how to accomplish the
> task at hand without storing all the data: Do you *really* need to
> make multiple passes over the file, do you *really* need to access
> the data bytes in arbitrary order?  The answers to these questions
> may be affirmative, but if they aren't you can probably do everything
> you need with a smallish amount of storage essentially independent
> of the data length.

I do not need to do multiple passes over the data except in the
beginning. The problem is that I have to know the size to allocate
the memory. In the future when extending this problem, I may need to
do a lot of seeking in the file so that I would like to see both
solutions.

- Show quoted text -

Quote:
> > The only problem is that the size of the file
> > is very variable from <Meg to 10s of Megs and this command is inside
> > a pipe of commands. This means that you cannot take too much memory
> > since they need memory too, or do they???

>     "Yes."  Well, "No."  Or "Sort of."

>     Yes, the other programs need memory.  No, they don't compete with
> each other for "the same" memory, because they run in separate
> address spaces.  But they sort of compete, because the system has
> finite resources with which to provide the illusion that is virtual
> memory.

>     Summary: It's a good idea to keep memory consumption down, but
> it's not the Holy Grail it once was.  A few tens of megabytes isn't
> especially large, but it can become so if you've got a pipeline of
> half a dozen programs, each one of which makes its own copy (or two
> or three copies) of those megabytes.

The only solution I think that you suggested above is that of linked
list of data. This is a binary file. Can you give a set of procedures
and data structures so that this can be done transparently in C? and
also in super C, that is non OOP C++ or C with classes? I would much
appreciate a pseudo code and some actual code for this particular
problem.

Sent via Deja.com
http://www.*-*-*.com/



Sat, 28 Jun 2003 22:47:58 GMT  
 Some Expert level complications, Re: To accept input from PIPE by a program

Quote:
> Some questions of linked list to manage memory are purely C/C++ oriented
> so I post this in comp.lang.c(++).

[ as well as c.u.p and c.u.s, so some of this is offtopic for c.l.c/++ ]

...

Quote:
> Is there a way to set up a pipe that feeds the data into the command_n
> to hold all the output from the previous command_(n-1)in a big enough
> pipe so that command_n can seek in it?

Pipes aren't seekable and don't have a size.
You could use an explicit temporary file instead:
  step1 | step2 >temp; <temp step3 | step4
step3 can seek in its input; step2 and 4 can't.
(Use a filename that doesn't conflict with
other concurrent processes or permanent files;
/tmp/myprog.$$ is one traditional method.)

Or you could within the program explicitly copy
from stdin, at least the data you need random access to,
into a temp file, and use that.  Of course in a VM system,
just reading it into "memory" and letting the OS swap it,
versus writing it to a temp file and letting the OS cache
that file, amounts to pretty much the same thing.

...

Quote:
> I do not need to do multiple passes over the data except in the
> beginning. The problem is that I have to know the size to allocate
> the memory. In the future when extending this problem, I may need to
> do a lot of seeking in the file so that I would like to see both
> solutions.
...
> The only solution I think that you suggested above is that of linked
> list of data. This is a binary file. Can you give a set of procedures
> and data structures so that this can be done transparently in C? and
> also in super C, that is non OOP C++ or C with classes? I would much
> appreciate a pseudo code and some actual code for this particular
> problem.

C-with-classes, the original name of what became C++, is OO
or at least OB.  I guess you mean the "better C" subset of C++.

A linked list is quite simple.

#define SOMESIZE maybe 16K, 64K, 256K, 1M, etc.
/* or in C++ static const size_t SIZE = whatever; */

struct bufchunk {
  struct bufchunk * next;
  unsigned char body[SOMESIZE];

Quote:
};

/* in C add (and use) typedef to taste */
struct bufchunk * head = NULL, * tail;
size_t totalread = 0, actual;
struct bufchunk * t;
do{
  if (t = malloc(sizeof *t)) != NULL ) /* out of memory! */
  /* in C++ need cast t = (bufchunk*)malloc ...
    or use new(nothrow) or use new and omit error check */
  if( !head) head = t;
  else tail->next = t;
  tail = t;
  t->next = NULL;
/* or more trickily: xx *head=NULL, **tailptr=&head;
  ... *tailptr = t; *(tailptr = &t->next) = NULL; */
#if readwholefile
Quote:
}while( fread(t->body, 1, SOMESIZE, stdin) > 0 );

#else /* up to specified (maybe computed) amount */
  /* may need to reduce size of last chunk */
  if( (actual = fread(...)) == 0 ) /* error or early EOF */
Quote:
}while( (totalread += actual) < amountwanted );

#endif

/* to access byte n and maybe following */
size_t i;
for( i = 0, t = head; i + SOMESIZE < n; i += SOMESIZE )
  if( (t = t->next) == NULL ) /* seek off end */
access t->data[n-i] as needed
(or [n%SOMESIZE], usually more efficient if power-of-two)
be careful not to run off end of chunk:  if so,
step pointer to next (and not off end)

Alternatively you could have a pseudo-2D array, that is,
a (dynamic) array of pointers (to dynamic arrays).

unsigned char **chunks;
size_t numchunk, maxchunk, i;

unsigned char * t;
/* start out with enough pointer cells for typical use, say */
chunks = /*as above*/ malloc( (maxchunk=1024)*sizeof *chunks);
if(chunks == NULL) /* error */
for( i = 0; /* i < limitifany */; i++) {
  if( (t = /*as above*/ malloc(SOMESIZE)) == NULL ) /* error */
  if( numchunk >= maxchunk )
    if( (chunks = /*as above*/ realloc(chunks,
      (maxchunk *= 2)*sizeof *chunks)) == NULL ) /* error */
    /* if you want to recover gracefully, need to remember
      old pointer from before realloc attempt to use and/or free */
    /* C++ don't use new for this because it can't be realloc'ed */
    /* but could go to std::vector<unsigned char*> */

  chunks[i] = t;
  if( fread(t,1,SOMESIZE,stdin) <= 0 ) /* EOF */

Quote:
}

access chunks[n/SOMESIZE][n%SOMESIZE];
this is (usually) most efficient for a power-of-2 size

Or (really tricky) you can omit maxchunk,
leave the pointer null initially, and use
  if(!(i&(i-1))) p = realloc(p, (i+!i)*2 *sizeof *p)

Finally, on a VM system, it's nearly as good, and simpler,
to just declare or malloc/new some arbitrarily large size
and use the portion you need and let the rest remain
uncommitted/unswapped.  If static leave it zero-initialized
so it can (on most implementations, including AFAIK all
Unices) go in BSS and take no space in the object file;
if auto or dynamic leave it uninitialized.

--
- David.Thompson 1 now at worldnet.att.net



Fri, 11 Jul 2003 05:27:08 GMT  
 
 [ 2 post ] 

 Relevant Pages 

1. How to write a C/C++ program that can accept input from a pipe

2. To accept input from PIPE by a program

3. Multi-level piped processes from C

4. Please Help: Accepting Characters as Input

5. Accepting input as a password from Console in C#

6. Please Help: Accepting Characters as Input

7. RLE code needs to accept array input

8. Accepting input from cursor keys...

9. Accepting input from cursor keys...

10. Can compiler accept c-source from standard input?

11. Window Not Accepting Accelerator Input

 

 
Powered by phpBB® Forum Software