Bug in GAWK 3.1.1? 

When I run the following AWK file...

BEGIN {
   $0 = "00E0";
   print $0 ", " ($0 && 1) ", " ($0 != "");
   $1 = "00E0";
   print $1 ", " ($1 && 1) ", " ($1 != "");

}

With GAWK 3.1.1 for Windows (compiled by SimTel), I get the following
output...

00E0, 0, 1
00E0, 1, 1

With GAWK 3.1.0 for Windows (compiled by SourceForge), I get...

00E0, 1, 1
00E0, 1, 1

As far as I know, if "$0" isn't blank, the value of "($0 && 1)" should
be "1" (true). I get the same problem if I substitute "00E0" with
"00E1" to "00E9". Other strings don't have this problem (for
example, "00EA"). The problem occurs whether I use file input or
whether I manually assign "$0" (as above).

Can anyone else reproduce this problem? Does anyone have an explanation?

-SB



Sun, 12 Jun 2005 01:30:56 GMT  

Quote:
> When I run the following AWK file...

> BEGIN {
>    $0 = "00E0";
>    print $0 ", " ($0 && 1) ", " ($0 != "");
>    $1 = "00E0";
>    print $1 ", " ($1 && 1) ", " ($1 != "");
> }

> With GAWK 3.1.1 for Windows (compiled by SimTel), I get the following
> output...

> 00E0, 0, 1
> 00E0, 1, 1

> With GAWK 3.1.0 for Windows (compiled by SourceForge), I get...

> 00E0, 1, 1
> 00E0, 1, 1

> As far as I know, if "$0" isn't blank, the value of "($0 && 1)" should
> be "1" (true). I get the same problem if I substitute "00E0" with
> "00E1" to "00E9". Other strings don't have have this problem (for
> example, "00EA"). The problem occurs whether I use file input or
> whether I manually assign "$0" (as above).

> Can anyone else reproduce this problem? Does anyone have explanation?

> -SB

Not that this is unexpected, but,
FYI, Win 98 SE, t.awk as specified above...

    C:\>awk -f t.awk
    00E0, 1, 1
    00E0, 1, 1

    C:\>awk --version
    GNU Awk 3.1.0
    ...

Also, RedHat Linux 7.1

    $ awk 'BEGIN {
    >    $0 = "00E0";
    >    print $0 ", " ($0 && 1) ", " ($0 != "");
    >    $1 = "00E0";
    >    print $1 ", " ($1 && 1) ", " ($1 != "");
    > }'
    00E0, 1, 1
    00E0, 1, 1
    $ awk --version
    GNU Awk 3.1.0
    ...

- Dan



Sun, 12 Jun 2005 10:10:01 GMT  

Quote:

>When I run the following AWK file...

>BEGIN {
>   $0 = "00E0";
>   print $0 ", " ($0 && 1) ", " ($0 != "");
>   $1 = "00E0";
>   print $1 ", " ($1 && 1) ", " ($1 != "");
>}

>With GAWK 3.1.1 for Windows (compiled by SimTel), I get the following
>output...

>00E0, 0, 1
>00E0, 1, 1

>With GAWK 3.1.0 for Windows (compiled by SourceForge), I get...

>00E0, 1, 1
>00E0, 1, 1

>As far as I know, if "$0" isn't blank, the value of "($0 && 1)" should
>be "1" (true). I get the same problem if I substitute "00E0" with
>"00E1" to "00E9". Other strings don't have have this problem (for
>example, "00EA"). The problem occurs whether I use file input or
>whether I manually assign "$0" (as above).

>Can anyone else reproduce this problem? Does anyone have explanation?

>-SB

It seems to be a problem in the simtel version. Under Linux, all of
gawk, mawk, and the bell labs awk produce the same results as
the sourceforge version.

Arnold
--

P.O. Box 354            Home Phone: +972  8 979-0381    Fax: +1 928 569 9018
Nof Ayalon              Cell Phone: +972 51  297-545
D.N. Shimshon 99785     ISRAEL



Sun, 12 Jun 2005 17:04:23 GMT  

Quote:



>>When I run the following AWK file...

>>BEGIN {
>>   $0 = "00E0";
>>   print $0 ", " ($0 && 1) ", " ($0 != "");
>>   $1 = "00E0";
>>   print $1 ", " ($1 && 1) ", " ($1 != "");
>>}

>>With GAWK 3.1.1 for Windows (compiled by SimTel), I get the following
>>output...

>>00E0, 0, 1
>>00E0, 1, 1

>>With GAWK 3.1.0 for Windows (compiled by SourceForge), I get...

>>00E0, 1, 1
>>00E0, 1, 1

>>As far as I know, if "$0" isn't blank, the value of "($0 && 1)" should
>>be "1" (true). I get the same problem if I substitute "00E0" with
>>"00E1" to "00E9". Other strings don't have have this problem (for
>>example, "00EA"). The problem occurs whether I use file input or
>>whether I manually assign "$0" (as above).

>>Can anyone else reproduce this problem? Does anyone have explanation?

>>-SB

>It seems to be a problem in the simtel version. Under Linux, all of
>gawk, mawk, and the bell labs awk produce the same results as
>the sourceforge version.

I get the SimTel behaviour from gawks 3.1.1 (compiled from source) and
3.0.6 (supplied binary) on FreeBSD 4.5, and 3.0.3 (supplied binary) on
RedHat Linux 6.something.  The same Linux box displays the SourceForge
behaviour on a locally compiled version of 3.1.1, as does a RedHat 7.3
box with the supplied 3.1.0 binary.

In short: weird.

-- don



Mon, 13 Jun 2005 13:30:02 GMT  

Quote:

> When I run the following AWK file...

> BEGIN {
>    $0 = "00E0";
>    print $0 ", " ($0 && 1) ", " ($0 != "");
>    $1 = "00E0";
>    print $1 ", " ($1 && 1) ", " ($1 != "");
> }

> With GAWK 3.1.1 for Windows (compiled by SimTel), I get the following
> output...

> 00E0, 0, 1
> 00E0, 1, 1

> With GAWK 3.1.0 for Windows (compiled by SourceForge), I get...

> 00E0, 1, 1
> 00E0, 1, 1

> As far as I know, if "$0" isn't blank, the value of "($0 && 1)" should
> be "1" (true). I get the same problem if I substitute "00E0" with
> "00E1" to "00E9". Other strings don't have have this problem (for
> example, "00EA"). The problem occurs whether I use file input or
> whether I manually assign "$0" (as above).

> Can anyone else reproduce this problem? Does anyone have explanation?

But should ($0 && 1) be 1? If $0 is a string, yes, but if $0 is a number then
00E1 will be converted (by strtod, which might vary between compilers/locales)
to 0, and 0 && 1 is 0.

If $0 itself were being concatenated, then it would be a string but
it is not. Rather, $0 is being &&-ed with 1, and at this point I am not
clever enough to discern whether that makes $0 a string or a number.

Having said that, it would still not be clear why $0 and $1 are sometimes
treated differently.
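
For reference, the strtod behaviour discussed above can be checked
directly in C. This is a minimal sketch, not gawk's code: the helper name
parses_fully_as_number is made up for illustration, and it relies only on
the standard library's strtod. With a conforming C library in the default
"C" locale, "00E0" through "00E9" should be consumed in full and convert
to 0, while "00EA" stops at the "E", which is consistent with only the
former strings showing the problem.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of gawk): returns 1 if strtod()
 * consumes all of s, 0 otherwise. The converted value is stored
 * in *value either way.
 */
int parses_fully_as_number(const char *s, double *value)
{
    char *end;

    assert(s != NULL && value != NULL);
    *value = strtod(s, &end);

    /* end == s: nothing was converted; *end != '\0': trailing text */
    return end != s && *end == '\0';
}
```

A caller would pass each candidate string and look at both the return
value and the stored double; a string that is not fully consumed would
presumably be left as an ordinary string.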

John.



Mon, 13 Jun 2005 19:33:32 GMT  


% > When I run the following AWK file...
% >
% > BEGIN {
% >    $0 = "00E0";
% >    print $0 ", " ($0 && 1) ", " ($0 != "");
% >    $1 = "00E0";
% >    print $1 ", " ($1 && 1) ", " ($1 != "");
% > }
% >
% > With GAWK 3.1.1 for Windows (compiled by SimTel), I get the following
% > output...
% >
% > 00E0, 0, 1
% > 00E0, 1, 1
% >
% > With GAWK 3.1.0 for Windows (compiled by SourceForge), I get...
% >
% > 00E0, 1, 1
% > 00E0, 1, 1

[...]

% But should ($0 && 1) be 1? If $0 is a string, yes, but if $0 is a number then
% 00E1 will be converted (by strtod, which might vary between compilers/locales)
% to 0, and 0 &&1 is 0.

There's a bit of ambiguity around this, but I think the 3.1.0 behaviour
is correct.

Variables in awk can be strings, numbers, or both (numeric strings). The
latest POSIX spec tries to make it clear that how the variable should be
interpreted depends on where the value comes from. If it's the result of
an explicit assignment, the type should be taken from the rhs of the
assignment. If it's taken from the environment, a split operation, field
splitting, or I/O (getline or the input loop), the type is treated as
a numeric string.

In this case, since $0 and $1 are both assigned values from a string,
they ought to be treated as strings and so should evaluate to `true',
since the length is non-zero. One problem with this interpretation is
that the standard explicitly says that values taken from field variables
are to be treated as numeric strings. The rationale section suggests
to me that this is an editing problem, though. Bolstering the argument
in favour of 3.1.0 is the fact that numeric strings are only relevant
when dealing with comparison operators, so $0 and $1 ought to be
treated as strings when used with &&.

I suggest filing a gawk bug report.
--

Patrick TJ McPhee
East York  Canada



Tue, 14 Jun 2005 02:49:28 GMT  

Quote:
> It seems to be a problem in the simtel version. Under Linux, all of
> gawk, mawk, and the bell labs awk produce the same results as
> the sourceforge version.

> Arnold

I tried the Cygwin version of GAWK 3.1.1 for Windows and it doesn't
have this problem. The problem does appear to be specific to the
SimTel version of GAWK 3.1.1.

-SB



Wed, 15 Jun 2005 03:56:57 GMT  

Quote:

>>When I run the following AWK file...

>>BEGIN {
>>   $0 = "00E0";
>>   print $0 ", " ($0 && 1) ", " ($0 != "");
>>   $1 = "00E0";
>>   print $1 ", " ($1 && 1) ", " ($1 != "");
>>}

 <....>

Quote:
>It seems to be a problem in the simtel version. Under Linux, all of
>gawk, mawk, and the bell labs awk produce the same results as
>the sourceforge version.

I use three different awks for MSDOS (not Windoze).

$ gawk2156 -W version
Gnu Awk (gawk) 2.15, patchlevel 6

$ gawk2156 -f p.scr
00E0, 0, 1
00E0, 1, 1

$ gawk306 -W version
GNU Awk 3.0.6
Copyright (C) 1989, 1991-2000 Free Software Foundation.

$ gawk306 -f p.scr
00E0, 0, 1
00E0, 1, 1

$ mawk122 -W version
mawk 1.2.2dos+os2 Jan 1996, Copyright (C) Michael D. Brennan
Microsoft C/C++ _MSC_VER 600

$ mawk122 -f p.scr
00E0, 1, 1
00E0, 1, 1

--
John Savage            (for email, replace "ks" with "k" and delete "n")



Wed, 15 Jun 2005 04:09:01 GMT  
Hello,


Quote:
> When I run the following AWK file...

> BEGIN {
>    $0 = "00E0";
>    print $0 ", " ($0 && 1) ", " ($0 != "");
>    $1 = "00E0";
>    print $1 ", " ($1 && 1) ", " ($1 != "");
> }

> With GAWK 3.1.1 for Windows (compiled by SimTel), I get the following
> output...

> 00E0, 0, 1
> 00E0, 1, 1

> With GAWK 3.1.0 for Windows (compiled by SourceForge), I get...

> 00E0, 1, 1
> 00E0, 1, 1

0. Preface
----------
First, I'd like to emphasize that it's always safer to use length($0) or
$0!="" if we are not sure whether awk knows that the value should be
treated as a string.

That said, I'm going to discuss some subtle details of the awk language
and bugs of gawk, even though sane programs should rarely make use of
this deep magic.

1. Language subtleties
----------------------
In this case, the question is whether gawk should treat $0 as a number
(because 00E0 = 0e0 = 0 * 10^0 = 0) or as a string.

As Aharon has already pointed out, the second behaviour is correct,
as a string constant has been assigned to $0.

Quote:
> I get the same problem if I substitute "00E0" with "00E1" to "00E9".

Sure, it's still the same issue, as 0e1 = 0e9 = 0.

Quote:
> The problem occurs whether I use file input or
> whether I manually assign "$0" (as above).

Oh, that would be another situation!  If the record came from an input
file, it *would be recognized as a number.*

So the correct output would be:
00E0, 0, 1
00E0, 0, 1

Even the program BEGIN { $0="00E0"; print ($1 && 1) } should print `0',
as $1 comes from field splitting of the string "00E0" and thus should
be recognized as a number.
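
The typing rule described above can be sketched in C. This is only a
model under stated assumptions, not gawk's implementation: the names
awk_type, awk_value, awk_truth and looks_numeric are hypothetical, and
the check leans on the C library's strtod, which may accept forms (hex,
inf, nan) that an awk would reject. It also follows the reading in which
a numeric string counts as a number in a boolean context; under the
reading that numeric strings matter only for comparisons, the AWK_STRNUM
case would simply behave like AWK_STRING.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical model of where a value came from (not gawk's flags). */
enum awk_type {
    AWK_STRING,  /* assigned from a string constant          */
    AWK_NUMBER,  /* assigned from a numeric expression       */
    AWK_STRNUM   /* came from input, field splitting, etc.   */
};

struct awk_value {
    enum awk_type type;
    const char *str;  /* text form, used for STRING and STRNUM */
    double num;       /* numeric form, used for NUMBER         */
};

/* Returns 1 if s is a number with optional surrounding blanks. */
static int looks_numeric(const char *s, double *out)
{
    char *end;

    *out = strtod(s, &end);      /* strtod skips leading blanks */
    if (end == s)
        return 0;                /* no digits were converted    */
    while (isspace((unsigned char)*end))
        end++;                   /* allow trailing blanks       */
    return *end == '\0';
}

/* Truth value of v in a boolean context such as (v && 1). */
int awk_truth(const struct awk_value *v)
{
    double d;

    assert(v != NULL);
    switch (v->type) {
    case AWK_NUMBER:
        return v->num != 0.0;
    case AWK_STRNUM:
        if (looks_numeric(v->str, &d))
            return d != 0.0;          /* numeric string: test value */
        return v->str[0] != '\0';     /* otherwise a plain string   */
    case AWK_STRING:
    default:
        return v->str[0] != '\0';     /* non-empty string is true   */
    }
}
```

Under this model, "00E0" assigned as a string constant is true, the same
text arriving from input or field splitting is false, and "00EA" is true
either way.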

2. Gawk bugs
------------
As Aharon has pointed out, most gawk builds behave correctly.
But this is in fact a double bug.

1) The command
                echo 0e0 | gawk '{print ($0&&1)}'

should print 0 but prints 1, because 0e0 is not recognized as a valid number.

A patch (relative to 3.1.1) is attached below: gawk-3.1.1-0e0.patch

2) The command
                gawk 'BEGIN{$0="00"; print ($0&&1)}'

should print 1 as $0 is a string.  gawk-3.1.1 prints 0 as it converts the
string to its numeric value.

Again, a patch is below: gawk-3.1.1-maybe_num.patch

It seems that the maintainer of the DJGPP port (distributed via the SimTel
archive) has fixed the former bug (perhaps unintentionally).
This exposes the latter one.

Regards,
        Stepan

===========>  gawk-3.1.1-maybe_num.patch is short:

diff -urpN gawk-3.1.1.p2/field.c gawk-3.1.1.p3/field.c
--- gawk-3.1.1.p2/field.c       Wed Oct 16 10:44:18 2002

                n->flags = (STRING|STR|MAYBE_NUM|SCALAR|FIELD);
                fields_arr[0] = n;
        }
-       fields_arr[0]->flags |= MAYBE_NUM;
        field0_valid = TRUE;

 #undef INITIAL_SIZE

===========>  gawk-3.1.1-0e0.patch:


        * missing_d/strtod.c (gawk_strtod): Cleanup, changing the logic
          so that ptr is correct.  Fixes the bug that 0e0 is not
          recognized as numeric.

test/ChangeLog:


        * strtod.awk, strtod.in, strtod.ok: Added test for 0e0 and similar.

diff -crpN gawk-3.1.1i.p1/missing_d/strtod.c gawk-3.1.1i.p2/missing_d/strtod.c
*** gawk-3.1.1i.p1/missing_d/strtod.c   Fri Aug  3 07:57:42 2001
--- gawk-3.1.1i.p2/missing_d/strtod.c   Mon Jan  6 07:42:36 2003
***************
*** 1,6 ****
  /*
!  * strtod.c
!  *
   * Stupid version of System V strtod(3) library routine.
   * Does no overflow/underflow checking.
   *
--- 1,7 ----
  /*
!  * gawk wrapper for strtod
!  */
! /*
   * Stupid version of System V strtod(3) library routine.
   * Does no overflow/underflow checking.
   *
***************
*** 25,30 ****
--- 26,33 ----
   *
   * Summer 2001. Try to make it smarter, so that a string like "0000"
   * doesn't look like we failed. Sigh.
+  *
+  * Xmass 2002. Fix a bug in ptr determination, eg. for "0e0".
   */

  #if 0
*************** gawk_strtod(s, ptr)
*** 38,113 ****
  register const char *s;
  register const char **ptr;
  {
-       double ret = 0.0;
        const char *start = s;          /* save original start of string */
        const char *begin = NULL;       /* where the number really begins */
!       int success = 0;

        /* optional white space */
        while (isspace(*s))
                s++;

        /* optional sign */
!       if (*s == '+' || *s == '-') {
                s++;
-               if (*(s-1) == '-')
-                       begin = s - 1;
-               else
-                       begin = s;
-       }

        /* string of digits with optional decimal point */
!       if (isdigit(*s) && ! begin)
!               begin = s;
!
        while (isdigit(*s)) {
-               /* don't succeed on 0x... */
-               if (*s > '0')
-                       success++;
                s++;
        }

        if (*s == '.') {
-               if (! begin)
-                       begin = s;
                s++;
!               while (isdigit(*s))
                        s++;
!               success++;
        }

!       if (s == start || success == 0)         /* nothing there */
!               goto out;

        /*
         *      optional 'e' or 'E'
!        *              followed by optional sign or space
         *              followed by an integer
         */
!
!       if ((*s == 'e' || *s == 'E')
            && (isdigit(s[1])
!               || ((s[1] == '-' || s[1] == '+') && isdigit(s[2])))) {
                s++;
-
                if (*s == '+' || *s == '-')
                        s++;
-
                while (isdigit(*s))
                        s++;
        }

!       /* go for it */
!       ret = atof(begin);
!
! out:
!       if (! success && s == begin)
!               s = start;      /* in case all we did was skip whitespace */
!
        if (ptr)
!               *ptr = s;

!       return ret;
  }

  #ifdef TEST
--- 41,108 ----
  register const char *s;
  register const char **ptr;
  {
        const char *start = s;          /* save original start of string */
        const char *begin = NULL;       /* where the number really begins */
!       int dig = 0;
!       int dig0 = 0;

        /* optional white space */
        while (isspace(*s))
                s++;

+       begin = s;
+
        /* optional sign */
!       if (*s == '+' || *s == '-')
                s++;

        /* string of digits with optional decimal point */
!       while (*s == '0') {
!               s++;
!               dig0++;
!       }
        while (isdigit(*s)) {
                s++;
+               dig++;
        }

        if (*s == '.') {
                s++;
!               while (*s == '0') {
!                       s++;
!                       dig0++;
!               }
!               while (isdigit(*s)) {
                        s++;
!                       dig++;
!               }
        }

!       dig0 += dig;    /* any digit has appeared */

        /*
         *      optional 'e' or 'E'
!        *              if a digit (or at least zero) was seen
!        *              followed by optional sign
         *              followed by an integer
         */
!       if (dig0
!           && (*s == 'e' || *s == 'E')
            && (isdigit(s[1])
!             || ((s[1] == '-' || s[1] == '+') && isdigit(s[2])))) {
                s++;
                if (*s == '+' || *s == '-')
                        s++;
                while (isdigit(*s))
                        s++;
        }

!       /* In case we haven't found a number, set ptr to start. */
        if (ptr)
!               *ptr = (dig0 ? s : start);

!       /* Go for it. */
!       return (dig ? atof(begin) : 0.0);
  }

  #ifdef TEST
diff -crpN gawk-3.1.1i.p1/test/strtod.awk gawk-3.1.1i.p2/test/strtod.awk
*** gawk-3.1.1i.p1/test/strtod.awk      Wed Sep  6 14:07:31 2000
--- gawk-3.1.1i.p2/test/strtod.awk      Mon Jan  6 07:45:13 2003
***************
*** 1 ****
! { x = "0x" $1 ; print x, x + 0 }
--- 1,5 ----
! {
!       x = "0x" $1 ; print x, x + 0
!       for (i=1; i<=NF; i++)
!               if ($i) print $i, "is not zero"
! }
diff -crpN gawk-3.1.1i.p1/test/strtod.in gawk-3.1.1i.p2/test/strtod.in
*** gawk-3.1.1i.p1/test/strtod.in       Wed Sep  6 14:07:36 2000
--- gawk-3.1.1i.p2/test/strtod.in       Mon Jan  6 07:45:58 2003
***************
*** 1 ****
! 345
--- 1 ----
! 345 0 00 0e0 0E1 00E0 000e-5 .0e+0
diff -crpN gawk-3.1.1i.p1/test/strtod.ok gawk-3.1.1i.p2/test/strtod.ok
*** gawk-3.1.1i.p1/test/strtod.ok       Wed Sep  6 14:07:51 2000
--- gawk-3.1.1i.p2/test/strtod.ok       Mon Jan  6 07:47:14 2003
***************
*** 1 ****
--- 1,2 ----
  0x345 0
+ 345 is not zero



Sat, 25 Jun 2005 19:01:26 GMT  
 