A proposed replacement for gets (was: I/P of a string containing spaces) 

Quote:

>It would seem that there's general agreement that gets() is evil and to
>be avoided; and yet there's at least desire to have a routine that reads
>a line of text from stdin and delivers it /sans/ newline as a string of
>char.

This is trivial to do with a simple wrapper around fgets(). All
that is needed is an extra step that looks for the newline
(perhaps using strchr()) and replaces it with a null byte.
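
A minimal sketch of such a wrapper, for illustration only. The name
read_line and its signature are hypothetical, not part of any standard:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from fp into buf (capacity size),
   dropping the trailing newline if one was read. Returns buf, or NULL
   on end-of-file or error, exactly as fgets() does. */
char *read_line(char *buf, int size, FILE *fp)
{
    char *nl;

    if (fgets(buf, size, fp) == NULL)
        return NULL;
    nl = strchr(buf, '\n');   /* NULL if the line did not fit, or at EOF */
    if (nl != NULL)
        *nl = '\0';
    return buf;
}
```

Note that once the newline is gone, the caller can no longer tell a
complete line from one that was cut off because the buffer was too small.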

Quote:
>I posted a proposed replacement Thursday evening and invited comment.
>Mike McCarty pointed out (correctly) that what I'd posted wasn't thread
>safe and was itself dangerous. It's been a long time since I've written
>re-entrant code (and then not in C) but I've given it another go. The
>source for char *sgets(void) follows and after that is a minimal
>mainline that can be used to play with it. By the way, it's much easier
>to read in monospaced fonts.

Nice sentiment; except that your news posting software has made a mess
of the code. Lines longer than 80 characters seem to have been wrapped
rather than preserved. Serves you right for violating the 79 column rule. :)

My comments are:

1. There is no precedent for a standard library function that calls
   malloc to allocate storage. So this new proposed function is a
   significant departure from the design. A fgets-like function
   that nukes the trailing newline with a null byte would probably
   fit in better; throngs of C programmers have written this code
   around fgets, so why not codify it into a function?

2. The E2BIG error code is already claimed by standards other than
   ANSI C and is not defined by ANSI C. Using it for this purpose
   would create controversy among various standards committees.

3. The identifier errno could be a macro. You can't do ``extern int errno''.
   Don't take this the wrong way, but if you are going to be proposing
   additions to the standard library, you should know little things like this.

4. Your recursive implementation is interesting, but does chew
   up automatic storage and is potentially inefficient. The stdin
   stream is not always a sluggish 120 wpm hacker.  :)

   Also, just because the language has no means to signal the
   unavailability of automatic storage doesn't mean that the storage
   is unlimited.  A solution based around realloc would be more robust,
   because it could detect running out of memory (at least on systems
   that don't lie by overcommitting requests beyond virtual memory size).
   Not that this matters much; it's the external interface of sgets that
   is important, not how it's implemented---though the difficulty
   of implementation must be considered, of course.

5. I'd say that it's useful enough to have an input routine that can deal with
   ``arbitrarily'' long lines, that it is worthwhile for its interface to take
   a FILE * parameter. Why tie it to stdin?

6. The fgets function can already deal with arbitrarily long lines, albeit in a
   piecemeal fashion. The fact that \n is added to the line is important
   because it distinguishes a partial read of a line from a completed
   read. When fgets returns a string with no \n at the end it means
   one of two things: the text stream wasn't terminated with a newline,
   or more data follows, continuing in the same line. Your proposed
   interface can't be used to tell whether the stream ended in a \n or
   not, therefore in this case it fails to properly reconstruct a
   binary stream.  The fgets() function, on the other hand, is precise
   enough to be used as the basis of a binary copy program.
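
To illustrate point 6, here is a rough sketch of a copy loop built on
fgets() with a deliberately small buffer. The name copy_text and the
partial_tail parameter are hypothetical. The sketch assumes the data
contains no embedded null bytes, because strlen() would stop at one:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy in to out through a small buffer.
   Returns the number of newline-terminated lines seen, or -1 on a
   write error. Sets *partial_tail to 1 if the last line of the stream
   had no newline. Assumes no embedded null bytes in the data. */
long copy_text(FILE *in, FILE *out, int *partial_tail)
{
    char chunk[16];
    long lines = 0;
    size_t len;

    *partial_tail = 0;
    while (fgets(chunk, sizeof chunk, in) != NULL) {
        len = strlen(chunk);
        if (fputs(chunk, out) == EOF)
            return -1;
        if (len > 0 && chunk[len - 1] == '\n') {
            lines++;             /* this piece completes a line */
            *partial_tail = 0;
        } else {
            *partial_tail = 1;   /* more of the line follows, or EOF cut it short */
        }
    }
    return lines;
}
```

Because the newline is kept in each piece, the output is the same text as
the input whether or not the stream ended with a newline.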



Wed, 24 Oct 2001 03:00:00 GMT  

Quote:

> My comments are:

> 1. There is no precedent for a standard library function that calls
>    malloc to allocate storage. So this new proposed function is a
>    significant departure from the design. A fgets-like function
>    that nukes the trailing newline with a null byte would probably
>    fit in better; throngs of C programmers have written this code
>    around fgets, so why not codify it into a function?

Works for me. BTW, I don't use either gets() or fgets(), so I don't have an axe to
grind here. It bothers me somewhat that there's a C language feature that
sophisticated programmers are saying "Never use!" (and I'm assuming that means
/they/ don't) and that seems to trip up the newcomers with some regularity. If it's
evil, let's either change it so it isn't - or just get rid of it altogether.

Quote:
> 2. The E2BIG error code is already claimed by standards other than
>    ANSI C and is not defined by ANSI C. Using it for this purpose
>    would create controversy among various standards committees.

Not out to usurp anyone's claim to error codes - but was hesitant to stake out a
new one. This code was simply the one that made the most sense in view of the error
I was trying to report. I'm not sure that I'm too concerned about avoiding
controversy between committees that bless dangerous and conducive-to-error library
functions anyway. (Sorry - please consider me "attitudnally (sp?) challenged" in
situations like that.)

Quote:
> 3. The identifier errno could be a macro. You can't do ``extern int errno''.
>    Don't take this the wrong way, but if you are going to be proposing
>    additions to the standard library, you should know little things like this.

Yup, I think it pretty well has to be a macro to satisfy the POSIX thread
requirement that each thread has to have its /own/ errno. Regrettably, my compiler,
linkage editor, and run time libraries seem to like the extern approach, so I just
left it in. You're right about knowing things like this and I plan some serious
research time (It may be a losing battle - my ignorance has been growing faster
than my knowledge for some time now - "the hurrier I go the behinder I get!")
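
For reference, a small sketch of the header-based approach: include
<errno.h> and never declare errno yourself. The helper name parse_long is
hypothetical. ERANGE is used because it is one of the few error codes that
ANSI C itself defines:

```c
#include <errno.h>    /* supplies errno, whether it is a macro or an object */
#include <stdlib.h>

/* Hypothetical helper: parse a long from text. Returns 0 on success,
   -1 if the value is out of range or no digits were found. */
int parse_long(const char *text, long *out)
{
    char *end;

    errno = 0;                        /* strtol never clears errno itself */
    *out = strtol(text, &end, 10);
    if (errno == ERANGE || end == text)
        return -1;
    return 0;
}
```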

Quote:
> 4. Your recursive implementation is interesting, but does chew
>    up automatic storage and is potentially inefficient. The stdin
>    stream is not always a sluggish 120 wpm hacker.  :)

I've always thought that recursion was like perfume. Smells nice but tastes
terrible. Recursion is elegant /and/ hungry. On the other hand, it was the most
obvious way to accumulate "reasonable" amounts of data without knowing how big the
buffer needed to be. The intent was not to produce the be-all-and-end-all input
routine, but rather to provide a safer alternative to gets().

Quote:
>    Also, just because the language has no means to signal the
>    unavailability of automatic storage doesn't mean that the storage
>    is unlimited.  A solution based around realloc would be more robust,
>    because it could detect running out of memory (at least on systems
>    that don't lie by overcommitting requests beyond virtual memory size).
>    Not that this matters much; it's the external interface of sgets that
>    is important, not how it's implemented---though the difficulty
>    of implementation must be considered, of course.

I'm not completely sure that I captured all that you said; but I think I'm in
agreement with you. If some nut tries to read the National Archives (from which
some other nut has thoughtfully removed all the line-feeds) with one call to /any/
routine, there's bound to be grief. I just added realloc() to my study list 8-)
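
A rough sketch of what a realloc()-based reader might look like, assuming a
doubling growth policy and a FILE * parameter as suggested in point 5. The
name read_long_line is hypothetical and this is not the sgets() code that
was posted:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line of any length from fp into storage
   obtained from realloc(). The newline is consumed but not stored.
   Returns NULL on allocation failure, or if end-of-file (or a read
   error) occurs before any character is read. Caller must free(). */
char *read_long_line(FILE *fp)
{
    char *buf = NULL, *tmp;
    size_t cap = 0, len = 0;
    int c;

    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 >= cap) {             /* keep room for the terminator */
            cap = (cap == 0) ? 64 : cap * 2;
            tmp = realloc(buf, cap);
            if (tmp == NULL) {            /* out of memory is detectable here */
                free(buf);
                return NULL;
            }
            buf = tmp;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0)
        return NULL;                      /* nothing left to read */
    if (buf == NULL) {                    /* empty line: just a newline */
        buf = malloc(1);
        if (buf == NULL)
            return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Unlike the recursive version, a failed allocation shows up as a NULL return
instead of an undetectable exhaustion of automatic storage.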

Quote:
> 5. I'd say that it's useful enough to have an input routine that can deal with
>    ``arbitrarily'' long lines, that it is worthwhile for its interface to take
>    a FILE * parameter. Why tie it to stdin?

I did think about it (and it'd have been easy to do) but I consciously put my
blinders on, telling myself to deal /just/ with the gets() headache. But I confess
that I also prefer general solutions.

Quote:
> 6. The fgets function can already deal with arbitrarily long lines, albeit in a
>    piecemeal fashion. The fact that \n is added to the line is important
>    because it distinguishes a partial read of a line from a completed
>    read. When fgets returns a string with no \n at the end it means
>    one of two things: the text stream wasn't terminated with a newline,
>    or more data follows, continuing in the same line. Your proposed
>    interface can't be used to tell whether the stream ended in a \n or
>    not, therefore in this case it fails to properly reconstruct a
>    binary stream.  The fgets() function, on the other hand, is precise
>    enough to be used as the basis of a binary copy program.

I agree with what you've said here but think it's a non sequitur since I wasn't
trying to obsolete or detract from the clear value of fgets().

My original version took about an hour to write and the second took another half
hour. It's 0445 (CDT), which means I've spent more time on this than on the code
(and I suspect that this time was better spent). Thanks for sharing your insights
and experience with me on this one.

Regards...

Morris Dovey
West Des Moines, Iowa USA



Wed, 24 Oct 2001 03:00:00 GMT  
 