ANN: Numeric input from file streams 

I have just put up a package that includes the following source,
together with the various header files, testing, makefile, etc.  I
invite comments.  It is available under GPL, but other things can
be arranged.

  <http://cbfalconer.home.att.net/download/txtio.zip>

I expect this to be especially useful in the embedded world, since
it allows accurate input of numeric values without requiring any
string storage.

/* ------------------------------------------------- *
 * File txtinput.c                                   *
 * ------------------------------------------------- */

#include <limits.h>   /* xxxx_MAX, xxxx_MIN */
#include <ctype.h>    /* isdigit, isblank, isspace */
#include <stdio.h>    /* FILE, getc, ungetc */
#include "stdops.h"   /* bool, true, false */
#include "txtinput.h"

#define UCHAR unsigned char

/* These stream input routines are written so that simple
 * conditionals can be used:
 *
 *      if (readxint(&myint, stdin)) {
 *         do_error_recovery; normally_abort_to_somewhere;
 *      }
 *      else {
 *         do_normal_things; usually_much_longer_than_bad_case;
 *      }
 *
 * They allow overflow detection, and permit other routines to
 * detect the character that terminated a numerical field. No
 * string storage is required, thus there is no limitation on
 * the length of input fields.  For example, a number entered
 * with a string of 1000 leading zeroes will not annoy these.
 *
 * The numerical input routines *NEVER* absorb a terminal '\n'.
 * Thus a sequence such as:
 *
 *      err = readxint(&myint, stdin);
 *      flushln(stdin);
 *
 * will always consume complete lines.
 *
 * They are also re-entrant, subject to the limitations of file
 * systems.  e.g. interrupting a readxint(v, stdin) operation with
 * a call to readxwd(wd, stdin) would not be well defined, if
 * the same stdin is being used for both calls.  If ungetc is
 * interruptible the run-time system is broken.
 */

/*--------------------------------------------------------------
 * skip all whitespace on f.  At completion getc(f) will
 *  return a non-blank character, which may be \n or EOF
 */
void skipblks(FILE *f)
{
   int ch;

   do {
      ch = getc(f);
   } while ((' ' == ch) || ('\t' == ch));
   /* while (isblank((UCHAR)ch)); */             /* for C99 */
   ungetc(ch, f);

} /* skipblks */

/*--------------------------------------------------------------
 * skip all whitespace on f, including \n. At completion getc(f)
 *  will return a non-blank character, which may be EOF
 */
void skipwhite(FILE *f)
{
   int ch;

   do {
      ch = getc(f);
   } while (isspace((UCHAR)ch));
   ungetc(ch, f);

} /* skipwhite */

/*--------------------------------------------------------------
 * Read an unsigned value.  Signal error for overflow or no
 * valid number found. Returns true for error, false for noerror
 *
 * Skip all leading whitespace on f.  At completion getc(f) will
 * return the character terminating the number, which may be \n
 * or EOF among others. It will NOT be a digit.  The combination
 * of error and the following getc returning \n indicates that
 * no numerical value was found on the line.
 * If the user wants to skip all leading white space including
 * /n, FF, VT, CR, he should first call "skipwhite(f);"
 *
 * Peculiarity: This specifically forbids a leading '+' or '-'.
 * Peculiarity: This forbids overflow, unlike C unsigned usage.
 *              on overflow, UINT_MAX is returned.
 */
bool readxwd(unsigned int *wd, FILE *f)
{
   unsigned int value, digit;
   bool         status;
   int          ch;

   #define UWARNLVL (UINT_MAX / 10U)
   #define UWARNDIG (UINT_MAX - UWARNLVL * 10U)

   value = 0;                           /* default */
   status = true;                       /* default error */

   do {
      ch = getc(f);
   } while ((' ' == ch) || ('\t' == ch));  /* skipblanks */
   /* while (isblank((UCHAR)ch)); */   /* for C99 */

   if (!(EOF == ch)) {
      if (isdigit((UCHAR)ch))         /* digit, no error */
         status = false;
      while (isdigit((UCHAR)ch)) {
         digit = (unsigned) (ch - '0');
         if ((value < UWARNLVL) ||
             ((UWARNLVL == value) && (UWARNDIG >= digit)))
            value = 10 * value + digit;
         else {                         /* overflow */
            status = true;
            value = UINT_MAX;
         }
         ch = getc(f);
      } /* while (ch is a digit) */
   }
   *wd = value;
   ungetc(ch, f);
   return status;

} /* readxwd */

/*--------------------------------------------------------------
 * Read a signed value.  Signal error for overflow or no valid
 * number found.  Returns true for error, false for noerror.  On
 * overflow either INT_MAX or INT_MIN is returned in *val.
 *
 * Skip all leading whitespace on f.  At completion getc(f) will
 * return the character terminating the number, which may be \n
 * or EOF among others. It will NOT be a digit.  The combination
 * of error and the following getc returning \n indicates that
 * no numerical value was found on the line.
 *
 * If the user wants to skip all leading white space including
 * /n, FF, VT, CR, he should first call "skipwhite(f);"
 *
 * Peculiarity: an isolated leading '+' or '-' NOT immediately
 * followed by a digit will return error and a value of 0, when
 * the next getc will return that following non-digit.  This is
 * caused by the single level ungetc available.
 */
bool readxint(int *val, FILE *f)
{
   unsigned int value;
   bool         status, negative;
   int          ch;

   *val = value = 0;                    /* default */
   status = true;                       /* default error */
   negative = false;

   do {
      ch = getc(f);
   } while ((' ' == ch) || ('\t' == ch));  /* skipwhite */
   /* while (isblank((UCHAR)ch)); */       /* for C99 */

   if (!(EOF == ch)) {
      if (('+' == ch) || ('-' == ch)) {
         negative = ('-' == ch);
         ch = getc(f);                  /* absorb any sign */
      }

      if (isdigit((UCHAR)ch)) {         /* digit, no error */
         ungetc(ch, f);
         status = readxwd(&value, f);
         ch = getc(f);          /* This terminated readxwd */
      }

      if (negative && (value < UINT_MAX) &&
         ((value - 1) <= -(1 + INT_MIN)))  *val = -value;
      else if (value <= INT_MAX)           *val = value;
      else {                    /* overflow */
         status = true;
         if (value)
            if (negative)                  *val = INT_MIN;
            else                           *val = INT_MAX;
      }
   }
   ungetc(ch, f);
   return status;

} /* readxint */

/*--------------------------------------------------------------
 * Flush input through an end-of-line marker inclusive.
 */
void flushln(FILE *f)
{
   int ch;

   do {
      ch = getc(f);
   } while (('\n' != ch)  && (EOF != ch));

} /* flushln */

/* End of txtinput.c */

--

   Available for consulting/temporary embedded and systems.
   <http://cbfalconer.home.att.net>  USE worldnet address!



Thu, 24 Mar 2005 09:45:42 GMT  
[Read in comp.lang.c]


> I have just put up a package that includes the following source,
> together with the various header files, testing, makefile, etc.  I
> invite comments.  It is available under GPL, but other things can
> be arranged.

>   <http://cbfalconer.home.att.net/download/txtio.zip>

> I expect this to be especially useful in the embedded world, since
> it allows accurate input of numeric values without requiring and
> string storage.

> /* ------------------------------------------------- *
>  * File txtinput.c                                   *
>  * ------------------------------------------------- */

> #include <limits.h>   /* xxxx_MAX, xxxx_MIN */
> #include <ctype.h>    /* isdigit, isblank, isspace */

Why are you using isdigit()? Outside of the C locale, there is no guarantee
that subtracting '0' from a flagged character is in the range 0..9.

[snip]

> /*--------------------------------------------------------------
>  * Read a signed value.  Signal error for overflow or no valid
>  * number found.  Returns true for error, false for noerror.  On
>  * overflow either INT_MAX or INT_MIN is returned in *val.
>  *
>  * Skip all leading whitespace on f.  At completion getc(f) will
>  * return the character terminating the number, which may be \n
>  * or EOF among others. It will NOT be a digit.

                          ^^^^^^^^^^^^^^^^^^^^^^^
You can't guarantee this for implementations with non-sticky EOF. For
example: 123^Z0, where ^Z is a console's EOF signalling character.

>                                                 The combination
>  * of error and the following getc returning \n indicates that
>  * no numerical value was found on the line.
>  *
>  * If the user wants to skip all leading white space including
>  * /n, FF, VT, CR, he should first call "skipwhite(f);"

     ^^^^^^^^^^^^^^
I prefer clear specifications like: \n \f \v \r. [Or write LF instead of /n]

>  *
>  * Peculiarity: an isolated leading '+' or '-' NOT immediately
>  * followed by a digit will return error and a value of 0, when
>  * the next getc will return that following non-digit.  This is
>  * caused by the single level ungetc available.
>  */

> bool readxint(int *val, FILE *f)
> {
>    unsigned int value;
>    bool         status, negative;
>    int          ch;

>    *val = value = 0;                    /* default */
>    status = true;                       /* default error */
>    negative = false;

>    do {
>           ch = getc(f);
>    } while ((' ' == ch) || ('\t' == ch));  /* skipwhite */
>    /* while (isblank((UCHAR)ch)); */       /* for C99 */

>    if (!(EOF == ch)) {
>       if (('+' == ch) || ('-' == ch)) {
>          negative = ('-' == ch);
>          ch = getc(f);                  /* absorb any sign */
>       }

>       if (isdigit((UCHAR)ch)) {         /* digit, no error */
>          ungetc(ch, f);
>          status = readxwd(&value, f);
>          ch = getc(f);          /* This terminated readxwd */
>       }

>       if (negative && (value < UINT_MAX) &&
>          ((value - 1) <= -(1 + INT_MIN)))  *val = -value;

As was said in clc in reply to your other post on the subject, the test
(value < UINT_MAX) is all but redundant. Suppose you have a 17-bit
sign-magnitude (SM) implementation that uses floating point hardware:

   INT_MIN == -65535
   INT_MAX ==  65535
  UINT_MAX ==  65535

Your code will state that "-65535" is +65535. Similarly, "-0" is handled
(albeit correctly) by your later test for non negative values.

And you should use *val = - (int) value; to avoid the implementation-defined
conversion from unsigned to signed. Consider...

  *val = -1U;

If UINT_MAX > INT_MAX, then *val may receive /any/ number the implementation
cares to define. Under C99, it may even raise a signal. Whilst every
implementation that I know of just re-interprets the raw representation, the
standard allows not-unreasonable things (tm) such as defaulting to 0 if the
value is outside of the range of int.

That said, there is then the problem of INT_MIN negation overflow on two's
complement machines. As I basically said in clc, I think the only way of
portably handling this is to make INT_MIN a special case. [I did post code
in clc which I believe addresses these issues except for the bizarre case
below which I believe is intractable in your present design.]
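
As a rough illustration of the "special case" idea (not the code posted in clc, which is not reproduced here), the sign can be applied to an unsigned magnitude without ever negating a value that might not fit in int. The helper name apply_sign is hypothetical, and <stdbool.h> is assumed in place of the package's "stdops.h":

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper, not part of txtinput.c.
 * Combines a magnitude and a sign into *out.
 * Returns true on overflow (and saturates), false otherwise. */
bool apply_sign(unsigned int mag, bool negative, int *out)
{
    unsigned int limit;

    if (!negative) {
        if (mag > (unsigned int)INT_MAX) {
            *out = INT_MAX;
            return true;
        }
        *out = (int)mag;
        return false;
    }

    /* INT_MIN + 1 can always be negated safely; limit is |INT_MIN| - 1 */
    limit = (unsigned int)(-(INT_MIN + 1));
    if (mag <= limit) {
        *out = -(int)mag;          /* the cast is safe: mag <= INT_MAX here */
        return false;
    }
    if (mag - 1U == limit) {       /* mag is exactly |INT_MIN| */
        *out = INT_MIN;            /* assigned directly, never negated */
        return false;
    }
    *out = INT_MIN;
    return true;
}
```

This only covers the conversion step. It does nothing for an implementation where the magnitude of INT_MIN does not fit in unsigned int, since such a magnitude could never be passed in.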

>       else if (value <= INT_MAX)           *val = value;
>       else {                    /* overflow */
>          status = true;
>          if (value)
>             if (negative)                  *val = INT_MIN;
>             else                           *val = INT_MAX;
>       }
>    }
>    ungetc(ch, f);
>    return status;
> } /* readxint */

AFAIK, the following limits are not impossible under standard C:

   INT_MIN == -65536
   INT_MAX ==  65535
  UINT_MAX ==  65535

By using the unsigned int function as a utility, your code will not accept a
legitimate "-65536" on such a hypothetical implementation.
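
Whether a given implementation is affected can be checked from <limits.h> alone. The following is only a sketch, and the function name is made up for illustration:

```c
#include <limits.h>

/* Hypothetical check, not part of txtinput.c.
 * Returns 1 if unsigned int can represent the magnitude of INT_MIN,
 * 0 if it cannot (as with the hypothetical limits above). */
int unsigned_can_hold_int_min(void)
{
    /* INT_MIN + INT_MAX is 0 or -1, so negating it cannot overflow */
    unsigned int extra = (unsigned int)(-(INT_MIN + INT_MAX));

    return (UINT_MAX - (unsigned int)INT_MAX) >= extra;
}
```

On the common two's complement implementations, where UINT_MAX is 2 * INT_MAX + 1, this returns 1.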

Have you considered avoiding unsigned int altogether? For what (little) it's
worth, I have written a similar function below, although I wrote it for
longs. I can't imagine that the "doubling" up of negative cases produces
more code than your extra testing and wrapping of the unsigned version. But
I guess YMMV.

  /*
   * int fgetld(long *lp, FILE *fp);
   *
   * Read a long integer value into *lp from the stream fp.
   * Leading whitespace is significant. [i.e. an error]
   *
   * Return:  EOF:  end of file
   *
   *            0:  *lp == 0   stream is not a number
   *                    Any sign +/- will be swallowed if not
   *                    followed by a digit.
   *                *lp == LONG_MAX   positive overflow
   *                *lp == LONG_MIN   negative overflow
   *                    Additional digits beyond the one which
   *                    caused the overflow, will remain.
   *
   *            1:  *lp is number read
   */

  #define LONG_MAX_DIV_10      (LONG_MAX / 10L)
  #define LONG_MAX_LAST_DIGIT  (LONG_MAX - 10L * LONG_MAX_DIV_10)

  #define LONG_MIN_DIV_10      ((LONG_MIN / 10L) - (-1L / 2L))
  #define LONG_MIN_LAST_DIGIT  (10L * LONG_MIN_DIV_10 - LONG_MIN)

  int fgetld(long *lp, FILE *fp)
  {
    long d;
    int c, r;

    *lp = 0, r = 0;

    if ((c = fgetc(fp)) == EOF)
      return EOF;

    if (c == '-')
    {
      for (; '0' <= (c = fgetc(fp)) && c <= '9'; r = 1)
      {
        d = c - '0';

        if (     *lp >  LONG_MIN_DIV_10
             || (*lp == LONG_MIN_DIV_10 && d <= LONG_MIN_LAST_DIGIT) )
        {
          *lp = *lp * 10L - d;
        }
        else
        {
          *lp = LONG_MIN;
          return 0;
        }
      }
    }

    else /* c != '-' */
    {
      if (c == '+')
        c = fgetc(fp);

      for (; '0' <= c && c <= '9'; r = 1, c = fgetc(fp))
      {
        d = c - '0';

        if (    *lp <  LONG_MAX_DIV_10
            || (*lp == LONG_MAX_DIV_10 && d <= LONG_MAX_LAST_DIGIT) )
        {
          *lp = *lp * 10L + d;
        }
        else
        {
          *lp = LONG_MAX;
          return 0;
        }
      }
    }

    ungetc(c, fp);
    return r;
  }

--
Peter



Thu, 24 Mar 2005 16:41:37 GMT  


> I have just put up a package that includes the following source,
> together with the various header files, testing, makefile, etc.  I
> invite comments.  It is available under GPL, but other things can
> be arranged.

>   <http://cbfalconer.home.att.net/download/txtio.zip>

> I expect this to be especially useful in the embedded world, since
> it allows accurate input of numeric values without requiring and
> string storage.

but likely more complicated than necessary

> /* ------------------------------------------------- *
>  * File txtinput.c                                   *
>  * ------------------------------------------------- */

> #include <limits.h>   /* xxxx_MAX, xxxx_MIN */
> #include <ctype.h>    /* isdigit, isblank, isspace */
> #include <stdio.h>    /* FILE, getc, ungetc */
> #include "stdops.h"   /* bool, true, false */
> #include "txtinput.h"

> #define UCHAR unsigned char

/* make something easier to write */

/* assume that c is already defined as an int
   that holds a char read from / pushed back to the stream */

/* GETC: get a char from the stream and test for EOF and error
         sample: while (GETC(f)) { ... } reads until EOF
                 or an error occurs */

#define GETC(f) ((c = fgetc(f)) != EOF && !feof(f) && !ferror(f))
/* UNGETC: in some cases we have read a char too many,
           so put it back into the stream */
#define UNGETC(f) ungetc(c, f)
/* check error condition: results: 0: no error, no EOF
                                  -1: EOF
                                   1: error */
#define errcond(f) (ferror(f) ? 1 : feof(f) ? -1 : 0)

> /* These stream input routines are written so that simple
>  * conditionals can be used:
>  *
>  *      if (readxint(&myint, stdin)) {
>  *         do_error_recovery; normally_abort_to_somewhere;
>  *      }
>  *      else {
>  *         do_normal_things; usually_much_longer_than_bad_case;
>  *      }
>  *
>  * They allow overflow detection, and permit other routines to
>  * detect the character that terminated a numerical field. No
>  * string storage is required, thus there is no limitation on
>  * the length of input fields.  For example, a number entered
>  * with a string of 1000 leading zeroes will not annoy these.
>  *
>  * The numerical input routines *NEVER* absorb a terminal '\n'.
>  * Thus a sequence such as:
>  *
>  *      err = readxint(&myint, stdin);
>  *      flushln(stdin);
>  *
>  * will always consume complete lines.
>  *
>  * They are also re-entrant, subject to the limitations of file
>  * systems.  e.g interrupting readxint(v, stdin) operation with
>  * a call to readxwd(wd, stdin) would not be well defined, if
>  * the same stdin is being used for both calls.  If ungetc is
>  * interruptible the run-time system is broken.
>  */

> /*--------------------------------------------------------------
>  * skip all whitespace on f.  At completion getc(f) will
>  *  return a non-blank character, which may be \n or EOF
>  */
> void skipblks(FILE *f)
> {
>    int ch;

>    do {
>       ch = getc(f);
>    } while ((' ' == ch) || ('\t' == ch));
>    /* while (isblank((UCHAR)ch)); */             /* for C99 */
>    ungetc(ch, f);
> } /* skipblks */

int skipwhite(FILE *f) {
   int c;
   while (GETC(f)) {
      if (!isspace(c)) {   /* first non-white char read */
         UNGETC(f);
         return 0;
      }
   }
   return errcond(f);
}
> /*--------------------------------------------------------------
>  * skip all whitespace on f, including \n. At completion getc(f)
>  *  will return a non-blank character, which may be EOF
>  */
> void skipwhite(FILE *f)
> {
>    int ch;

>    do {
>       ch = getc(f);
>    } while (isspace((UCHAR)ch));
>    ungetc(ch, f);
> } /* skipwhite */

> /*--------------------------------------------------------------
>  * Read an unsigned value.  Signal error for overflow or no
>  * valid number found. Returns true for error, false for noerror
>  *
>  * Skip all leading whitespace on f.  At completion getc(f) will
>  * return the character terminating the number, which may be \n
>  * or EOF among others. It will NOT be a digit.  The combination
>  * of error and the following getc returning \n indicates that
>  * no numerical value was found on the line.
>  * If the user wants to skip all leading white space including
>  * /n, FF, VT, CR, he should first call "skipwhite(f);"
>  *
>  * Peculiarity: This specifically forbids a leading '+' or '-'.
>  * Peculiarity: This forbids overflow, unlike C unsigned usage.
>  *              on overflow, UINT_MAX is returned.
>  */
> bool readxwd(unsigned int *wd, FILE *f)
> {
>    unsigned int value, digit;
>    bool         status;
>    int          ch;

>    #define UWARNLVL (UINT_MAX / 10U)
>    #define UWARNDIG (UINT_MAX - UWARNLVL * 10U)

>    value = 0;                           /* default */
>    status = true;                       /* default error */

>    do {
>       ch = getc(f);
>    } while ((' ' == ch) || ('\t' == ch));  /* skipblanks */
>    /* while (isblank((UCHAR)ch)); */   /* for C99 */

>    if (!(EOF == ch)) {
>       if (isdigit((UCHAR)ch))         /* digit, no error */
>          status = false;
>       while (isdigit((UCHAR)ch)) {
>          digit = (unsigned) (ch - '0');
>          if ((value < UWARNLVL) ||
>              ((UWARNLVL == value) && (UWARNDIG >= digit)))
>             value = 10 * value + digit;
>          else {                         /* overflow */
>             status = true;
>             value = UINT_MAX;
>          }
>          ch = getc(f);
>       } /* while (ch is a digit) */
>    }
>    *wd = value;
>    ungetc(ch, f);
>    return status;
> } /* readxwd */

int read_uint(unsigned int *p, int *numdigits, FILE *f) {
   unsigned int i = 0;
   int c;
   int rc;

   *numdigits = 0;
   if ((rc = skipwhite(f)) != 0) return rc;
   while (GETC(f)) {
      if (!isdigit(c)) {
         *p = i;
         UNGETC(f);
         return 0;
      }
      i *= 10;
      i += c - '0';
      (*numdigits)++; /* distinguishes nothing read from a '0' */
   }
   *p = i;
   return errcond(f);
}
> /*--------------------------------------------------------------
>  * Read a signed value.  Signal error for overflow or no valid
>  * number found.  Returns true for error, false for noerror.  On
>  * overflow either INT_MAX or INT_MIN is returned in *val.
>  *
>  * Skip all leading whitespace on f.  At completion getc(f) will
>  * return the character terminating the number, which may be \n
>  * or EOF among others. It will NOT be a digit.  The combination
>  * of error and the following getc returning \n indicates that
>  * no numerical value was found on the line.
>  *
>  * If the user wants to skip all leading white space including
>  * /n, FF, VT, CR, he should first call "skipwhite(f);"
>  *
>  * Peculiarity: an isolated leading '+' or '-' NOT immediately
>  * followed by a digit will return error and a value of 0, when
>  * the next getc will return that following non-digit.  This is
>  * caused by the single level ungetc available.
>  */
> bool readxint(int *val, FILE *f)
> {
>    unsigned int value;
>    bool         status, negative;
>    int          ch;

>    *val = value = 0;                    /* default */
>    status = true;                       /* default error */
>    negative = false;

>    do {
>           ch = getc(f);
>    } while ((' ' == ch) || ('\t' == ch));  /* skipwhite */
>    /* while (isblank((UCHAR)ch)); */       /* for C99 */

>    if (!(EOF == ch)) {
>       if (('+' == ch) || ('-' == ch)) {
>          negative = ('-' == ch);
>          ch = getc(f);                  /* absorb any sign */
>       }

>       if (isdigit((UCHAR)ch)) {         /* digit, no error */
>          ungetc(ch, f);
>          status = readxwd(&value, f);
>          ch = getc(f);          /* This terminated readxwd */
>       }

>       if (negative && (value < UINT_MAX) &&
>          ((value - 1) <= -(1 + INT_MIN)))  *val = -value;
>       else if (value <= INT_MAX)           *val = value;
>       else {                    /* overflow */
>          status = true;
>          if (value)
>             if (negative)                  *val = INT_MIN;
>             else                           *val = INT_MAX;
>       }
>    }
>    ungetc(ch, f);
>    return status;
> } /* readxint */

/* signpos == 0: leading, 1: trailing, 2: either */
int read_int(int *p, int *numdigits, int signpos, FILE *f) {
   unsigned int u = 0;   /* read_uint needs an unsigned target */
   int c;
   int rc;
   int sign = 0;

   *p = 0;
   *numdigits = 0;
   if ((rc = skipwhite(f)) != 0) return rc;
   if (signpos != 1) { /* leading sign accepted */
      if (GETC(f)) {
         if (c == '+') sign = 1, signpos = 0;
         else if (c == '-') sign = -1, signpos = 0;
         else UNGETC(f); /* not a sign */
      }
      while (GETC(f))
         if (' ' != c) { /* ignore any space between sign and value */
            UNGETC(f);
            break;
         }
   }
   if ((rc = read_uint(&u, numdigits, f)) > 0)
      return rc; /* error! */
   if (signpos >= 1 && GETC(f)) { /* look for trailing sign */
      if (c == '+') sign = 1;
      else if (c == '-') sign = -1;
      else UNGETC(f); /* not a sign */
   }
   if (!sign) sign = 1; /* no sign read, assume '+' */
   *p = sign * (int)u;  /* apply sign to value; no overflow check */
   return errcond(f);
}
> /*--------------------------------------------------------------
>  * Flush input through an end-of-line marker inclusive.
>  */
> void flushln(FILE *f)
> {
>    int ch;

>    do {
>       ch = getc(f);
>    } while (('\n' != ch)  && (EOF != ch));
> } /* flushln */

> /* End of txtinput.c */

int flushin(FILE *f) {
   int c;
   while (GETC(f)) if ('\n' == c) break;
   return errcond(f);
}

/* read floating point
   numdigits == 0: no digit and no exponent read */
int read_double(double *p, int *numdigits, FILE *f) {
   int i = 0;   /* x. */          
   int d = 0;   /* .x */
   int e = 0;   /* e... */
   int sign = 0;
   int c;
   int rc;  
   int digits = 0;

   if
...




Thu, 24 Mar 2005 20:58:20 GMT  

> > I have just put up a package that includes the following source,
> > together with the various header files, testing, makefile, etc.  I
> > invite comments.  It is available under GPL, but other things can
> > be arranged.

> >   <http://cbfalconer.home.att.net/download/txtio.zip>

> > I expect this to be especially useful in the embedded world, since
> > it allows accurate input of numeric values without requiring and
> > string storage.

> > /* ------------------------------------------------- *
> >  * File txtinput.c                                   *
> >  * ------------------------------------------------- */

> > #include <limits.h>   /* xxxx_MAX, xxxx_MIN */
> > #include <ctype.h>    /* isdigit, isblank, isspace */

> Why are you using isdigit()? Outside of the C locale, there is no
> guarantee that subtracting '0' from a flagged character is in the
> range 0..9.

I don't understand this objection.  I used isdigit because it may
well be a macro and considerably more efficient than comparison
against '0' and '9'.  From N869:

7.4.1.4  The isdigit function

Synopsis

[#1]
        #include <ctype.h>
        int isdigit(int c);

Description

[#2]  The  isdigit  function  tests  for  any  decimal-digit
character (as defined in 5.2.1).

(which in turn enumerates the characters '0' through '9'.)


> [snip]
> > /*--------------------------------------------------------------
> >  * Read a signed value.  Signal error for overflow or no valid
> >  * number found.  Returns true for error, false for noerror.  On
> >  * overflow either INT_MAX or INT_MIN is returned in *val.
> >  *
> >  * Skip all leading whitespace on f.  At completion getc(f) will
> >  * return the character terminating the number, which may be \n
> >  * or EOF among others. It will NOT be a digit.
>                           ^^^^^^^^^^^^^^^^^^^^^^^
> You can't guarantee this for implementations with non-sticky EOF. For
> example: 123^Z0, where ^Z is a console's EOF signalling character.

As I read the standard, ungetc(EOF, f) will fail, and the file
will still be at eof, so that a following getc will return EOF.
If the underlying system returns a ^Z (etc) without an EOF
condition, it is just another control char, and can be pushed
back.  Many systems do just this if the ^Z does not occur
immediately following a \n.

The point of this is that the routine completely absorbs a valid
numeric field, and leaves an indication of overflow.  Thus the
user is never surprised by a long string of digits effectively
returning two numeric fields.
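
A stripped-down sketch of that contract, separate from the package code (the name read_uint_saturating is invented here, and blank skipping is left out): once overflow is detected the value stops accumulating, but the digits keep being consumed, so the next read starts at the true terminator.

```c
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical demo, not the readxwd from txtinput.c.
 * Returns 1 on error (no digits, or overflow), 0 on success.
 * On overflow *out is UINT_MAX and the whole digit run is consumed. */
int read_uint_saturating(unsigned int *out, FILE *f)
{
    unsigned int value = 0;
    int error = 1;                 /* assume no digit found */
    int overflow = 0;
    int ch = getc(f);

    if (ch != EOF && isdigit((unsigned char)ch))
        error = 0;
    while (ch != EOF && isdigit((unsigned char)ch)) {
        unsigned int digit = (unsigned int)(ch - '0');

        if (!overflow && value <= (UINT_MAX - digit) / 10U)
            value = value * 10U + digit;
        else
            overflow = 1;          /* keep reading, stop accumulating */
        ch = getc(f);
    }
    if (overflow) {
        value = UINT_MAX;
        error = 1;
    }
    if (ch != EOF)
        ungetc(ch, f);             /* leave the terminator for the caller */
    *out = value;
    return error;
}
```

Given an over-long digit string followed by a blank and another number, the first call reports an error with UINT_MAX, and the following getc returns the blank rather than a leftover digit.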


> >                                                 The combination
> >  * of error and the following getc returning \n indicates that
> >  * no numerical value was found on the line.
> >  *
> >  * If the user wants to skip all leading white space including
> >  * /n, FF, VT, CR, he should first call "skipwhite(f);"
>      ^^^^^^^^^^^^^^
> I prefer clear specifications like: \n \f \v \r. [Or write LF instead of /n]

Fair enough.

... snip ...


> >       if (negative && (value < UINT_MAX) &&
> >          ((value - 1) <= -(1 + INT_MIN)))  *val = -value;

> As was said in clc in reply to your other post on the subject, the test
> (value < UINT_MAX) is all but redundant. Suppose you have a 17-bit SM
> implementation that uses floating point hardware:

>    INT_MIN == -65535
>    INT_MAX ==  65535
>   UINT_MAX ==  65535

> Your code will state that "-65535" is +65535. Similarly, "-0" is handled
> (albeit correctly) by your later test for non negative values.

I am still mulling this small piece.  I have to arrange it so that
integer overflow can never occur internally, else all bets are
off.  I believe the earlier discussion showed that UINT_MAX must
be greater than INT_MAX, because all bits must be used and there
can be no added trap bits.  My limited tests (see the zip file)
indicate no problem here.

... snip ...

--

   Available for consulting/temporary embedded and systems.
   <http://cbfalconer.home.att.net>  USE worldnet address!



Thu, 24 Mar 2005 21:15:39 GMT  


>> #include <limits.h>   /* xxxx_MAX, xxxx_MIN */
>> #include <ctype.h>    /* isdigit, isblank, isspace */

>Why are you using isdigit()? Outside of the C locale, there is no guarantee
>that subtracting '0' from a flagged character is in the range 0..9.

Doesn't 5.2.1 cover that?

" In both the source and execution basic character sets, the value of
each character after 0 in the above list of decimal digits shall be
one greater than the value of the previous."

And of course the isdigit() is used to guarantee that it *is* a digit.
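
Put together, the two guarantees give a conversion along these lines. This is just a sketch; digit_value is a made-up name and does not appear in txtinput.c:

```c
#include <ctype.h>
#include <stdio.h>   /* EOF */

/* Hypothetical helper: returns 0..9 for a decimal digit, -1 otherwise.
 * isdigit() establishes that ch is one of '0'..'9'; 5.2.1 then
 * guarantees that ch - '0' is the digit's value. */
int digit_value(int ch)
{
    if (ch == EOF)
        return -1;                 /* do not cast EOF to unsigned char */
    return isdigit((unsigned char)ch) ? ch - '0' : -1;
}
```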



Fri, 25 Mar 2005 23:11:12 GMT  



>> >  * Skip all leading whitespace on f.  At completion getc(f) will
>> >  * return the character terminating the number, which may be \n
>> >  * or EOF among others. It will NOT be a digit.
>>                           ^^^^^^^^^^^^^^^^^^^^^^^
>> You can't guarantee this for implementations with non-sticky EOF. For
>> example: 123^Z0, where ^Z is a console's EOF signalling character.

>As I read the standard, ungetc(EOF, f) will fail,

This is correct.

>and the file
>will still be at eof, so that a following getc will return EOF.

Chapter and verse, please.

Try this program on a Unix system:

    #include <stdio.h>

    int main()
    {
        int c;

        do {
            puts("Press the eof key, please");
            c = getc(stdin);
        } while (c != EOF);
        ungetc(EOF, stdin);
        puts("Enter something, please");
        c = getc(stdin);
        if (c == EOF) puts("You either have sticky eof or you typed eof again");
        else printf("The first character you typed was '%c'\n", c);
        return 0;
    }

You'll find that the else branch is taken.  DOS/Windows implementations
are inconsistent: some have sticky EOF (e.g. Digital Mars), others don't
(e.g. old Turbo C compilers) so the behaviour of this program is
inconsistent, too.
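
One way a caller could sidestep the difference, sketched here under the assumption that end-of-file should be treated as final for the stream, is to consult the stream's indicators before trying to read again. The name peek_char is hypothetical:

```c
#include <stdio.h>

/* Hypothetical helper: look at the next character without consuming it.
 * Once the end-of-file or error indicator is set, no further read is
 * attempted, so EOF behaves as sticky whatever the library does. */
int peek_char(FILE *f)
{
    int ch;

    if (feof(f) || ferror(f))
        return EOF;                /* do not touch the stream again */
    ch = getc(f);
    if (ch != EOF)
        ungetc(ch, f);             /* one character of pushback is guaranteed */
    return ch;
}
```

A caller that really wants to resume reading after end-of-file on a terminal would have to call clearerr(f) first.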

Dan
--
Dan Pop
DESY Zeuthen, RZ group



Sat, 26 Mar 2005 00:22:55 GMT  





> >> >  * Skip all leading whitespace on f.  At completion getc(f) will
> >> >  * return the character terminating the number, which may be \n
> >> >  * or EOF among others. It will NOT be a digit.
> >>                           ^^^^^^^^^^^^^^^^^^^^^^^
> >> You can't guarantee this for implementations with non-sticky EOF. For
> >> example: 123^Z0, where ^Z is a console's EOF signalling character.

> >As I read the standard, ungetc(EOF, f) will fail,

> This is correct.

> >and the file
> >will still be at eof, so that a following getc will return EOF.

> Chapter and verse, please.

> Try this program on a Unix system:

>     #include <stdio.h>

>     int main()
>     {
>         int c;

>         do {
>             puts("Press the eof key, please");
>             c = getc(stdin);
>         } while (c != EOF);
>         ungetc(EOF, stdin);
>         puts("Enter something, please");
>         c = getc(stdin);
>         if (c == EOF) puts("You either have sticky eof or you typed eof again");
>         else printf("The first character you typed was '%c'\n", c);
>         return 0;
>     }

> You'll find that the else branch is taken.  DOS/Windows implementations
> are inconsistent: some have sticky EOF (e.g. Digital Mars), others don't
> (e.g. old Turbo C compilers) so the behaviour of this program is
> inconsistent, too.
From N869:

        #include <stdio.h>
        int ungetc(int c, FILE *stream);

Description

[#2] The ungetc function pushes the character specified by c
(converted to an unsigned char) back onto the  input  stream
pointed  to  by  stream.   Pushed-back  characters  will  be
returned by subsequent reads on that stream in  the  reverse
order of their pushing.  A successful intervening call (with
the stream pointed to  by  stream)  to  a  file  positioning
function  (fseek,  fsetpos,  or rewind) discards any pushed-
back  characters  for  the  stream.   The  external  storage
corresponding to the stream is unchanged.

[#3] One character of pushback is guaranteed.  If the ungetc
function is called too many times on the same stream without
an  intervening  read  or file positioning operation on that
stream, the operation may fail.

[#4] If the value of c equals that of  the  macro  EOF,  the
operation fails and the input stream is unchanged.
                        ^^^^^^^^^^^^^^^^^^^^^^^^^
This, to me, implies that the EOF indicator remains set.

[#5]  A  successful  call  to the ungetc function clears the
end-of-file indicator for the stream.  The value of the file
^^^^^^^^^^^^^^^^^^^^^
I guess this doesn't imply that an unsuccessful call does NOT
reset the EOF indicator.

position   indicator   for   the  stream  after  reading  or
discarding all pushed-back characters shall be the  same  as
it  was  before the characters were pushed back.  For a text
stream, the value of its file  position  indicator  after  a
successful  call to the ungetc function is unspecified until
all pushed-back characters are read  or  discarded.   For  a
binary stream, its file position indicator is decremented by
each successful call to the ungetc function;  if  its  value
was  zero  before  a  call,  it  is  indeterminate after the
call.233)
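
What paragraphs 4 and 5 promise can be checked with a minimal sketch like
the one below.  The helper name failed_ungetc_keeps_eof is hypothetical
(it is not part of txtinput.c); it uses only getc, ungetc and feof from
<stdio.h>.  It says nothing about what the *next* read will return on a
given implementation, only what the indicator looks like after the
failed pushback.

```c
#include <stdio.h>

/* Hypothetical helper, for illustration only.
 * Drain f, then try to push back EOF.  Returns 1 if the pushback
 * failed and the end-of-file indicator is still set, 0 otherwise. */
int failed_ungetc_keeps_eof(FILE *f)
{
    while (getc(f) != EOF)
        continue;                 /* read until getc reports EOF */
    if (!feof(f))
        return 0;                 /* stopped on a read error, not eof */
    /* Per paragraph 4, ungetc(EOF, f) fails and leaves f unchanged. */
    return ungetc(EOF, f) == EOF && feof(f) != 0;
}
```

A successful pushback of a real character, e.g. ungetc('0', f), is the
case covered by paragraph 5: it clears the indicator.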

All right, the guarantee is only barring eof :-[.  This leaves the
possibility of triggering integer overflow in the *val = -value;
statement even though value is known to be <= -INT_MIN.

--

   Available for consulting/temporary embedded and systems.
   <http://cbfalconer.home.att.net>  USE worldnet address!



Sat, 26 Mar 2005 02:20:04 GMT  
 ANN: Numeric input from file streams

Quote:
>[#4] If the value of c equals that of  the  macro  EOF,  the
>operation fails and the input stream is unchanged.
>                        ^^^^^^^^^^^^^^^^^^^^^^^^^
>This, to me, implies that the EOF indicator remains set.

This is correct.  But the eof indicator being set documents what happened
in the past, it does NOT predict the future.  In other words, there is
no guarantee that if a getc call has failed and the eof flag was set the
next getc call will automatically fail, too.

This is the sticky eof vs non-sticky eof issue.  On implementations with
sticky eof, if the eof flag is set on the stream any future input
operations on the stream will automatically fail.  On implementations
with non-sticky eof, having the eof flag set does not guarantee that
future input operations will fail (the implementation is free to try
again on the stream and see if some data has become available in the
meantime).

My program was supposed to illustrate the difference.
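
For code that must not depend on which kind of eof the implementation
has, one possible approach is to consult the indicators before every
read.  This is only a sketch under that assumption; sticky_getc is a
hypothetical name, not something from the package under discussion.

```c
#include <stdio.h>

/* Hypothetical wrapper, for illustration only.
 * Once the eof or error indicator is set on f, keep returning EOF
 * without asking the underlying device again.  This gives sticky-eof
 * behaviour even where the library itself would retry the read. */
int sticky_getc(FILE *f)
{
    if (feof(f) || ferror(f))
        return EOF;               /* indicator already set: do not retry */
    return getc(f);
}
```

The stickiness lasts only as long as the indicators do: clearerr(f), a
file positioning call, or a successful ungetc will clear the eof
indicator, after which sticky_getc reads from the stream again.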

Dan
--
Dan Pop
DESY Zeuthen, RZ group



Sat, 26 Mar 2005 17:54:19 GMT  
 