EOF sent/received too early or possible bug in http package (less likely) 

Tcl 8.3.1 Bug:  Generated by Ajuba's bug entry form at
        http://www.*-*-*.com/
Responses to this post are encouraged.
------

Submitted by:  Kristoffer Lawson
OperatingSystem:  Linux
OperatingSystemVersion:  Linux {*filter*} 2.2.17 #1 Fri Sep 15 02:22:10 EEST 2000 i586 unknown

Machine:  Various Pentiums
Synopsis:  EOF sent/received too early or possible bug in http package (less likely)

ReproducibleScript:
===============  Server:

variable firstread 1

proc accept {sock addr port} {
    fconfigure $sock -buffering none
    fconfigure $sock -blocking 0
    fileevent $sock r [list stuff $sock]
}

proc stuff {sock} {
    fileevent $sock r {}    ;# this was in the original code but with them
                            ;# it's very hard to reproduce!
                            ;# sudo nice -n -20 for sender should do the
                            ;# job... ;)

    if {[eof $sock]} {
        puts "***EOF***"
        exit
    }
    puts -nonewline stdout [read $sock]

    variable firstread
    variable delay

    if {$firstread} {
        set delay 0
        after 2000 {set delay 1}
        vwait delay
        ## send response
        puts $sock " "
        set firstread 0
    }

    fileevent $sock r [list stuff $sock]
}

socket -server accept 40010
vwait forever

=================== Client:

set addr [lindex $argv 0]

## called when data from remote end arrives.
proc cb {} {
    variable sock

    read $sock

    puts $sock "<callback firstsend>"
    puts $sock "<callback secondsend>"

    for {set i 0} {$i < 9000} {incr i} {
        puts $sock "test32432 test3211 $i"
    }

    puts $sock "<callback lastnormalsend>"
    puts $sock "all data ok"

    close $sock
    exit
}

## Open & configure
variable sock [socket $addr 40010]
fconfigure $sock -blocking 0
fconfigure $sock -buffering none

fileevent $sock r cb
# cb

puts $sock "start"

vwait forever

ObservedBehavior:
I am now fairly convinced this is a bug in Tcl 8.3.1. When the client is run on a fast machine, and
especially if it is niced to a high priority, the server receives an EOF before all of the data has arrived. This
only seems to happen when the server has responded with some data first.

The reason for my suspicion is that the http package seems to behave in exactly the same way with Tcl 8.3.1:
I might try to receive a large image, but a chunk from the end is missing. When I do exactly the same
operation with Tcl 8.2 or 8.0 I have no problems. Of course, this might be a bug introduced into
http, but I couldn't see anything in its code that looked like it would cause that kind of behaviour. As we
see similar behaviour with scripts like the one above, I believe there really is something fishy in Tcl's IO
(specifically with sockets).

This is rather awkward for us, as we are building large systems around Tcl that use the IO
facilities quite heavily. Of course, this might have been fixed in 8.3.2 (I haven't been able to test that yet), but as I
couldn't find this bug listed specifically, I thought it better to report it.

DesiredBehavior:
In the given script, all of the data should be sent and printed to the screen before the EOF notification.
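
One way to confirm that output was truncated is to compare the number of characters the server printed with the number the client script is expected to send. The sketch below computes that expected count. `expectedClientChars` is a hypothetical helper, not part of the original report, and it assumes that each `puts` shows up on the reading side as its string plus a single newline after end-of-line translation.

```tcl
# Hypothetical helper (not part of the original report): number of characters
# the server should print if every line from the client script arrives.
# Assumes each [puts] contributes its string plus one newline on the reading side.
proc expectedClientChars {{lines 9000}} {
    set total 0
    # Fixed marker lines sent by the client, in order of appearance.
    foreach msg {
        "start"
        "<callback firstsend>"
        "<callback secondsend>"
        "<callback lastnormalsend>"
        "all data ok"
    } {
        incr total [expr {[string length $msg] + 1}]
    }
    # The numbered lines produced by the for loop in cb.
    for {set i 0} {$i < $lines} {incr i} {
        incr total [expr {[string length "test32432 test3211 $i"] + 1}]
    }
    return $total
}
```

On the server, one could sum `[string length $data]` over every read and compare the result with `[expectedClientChars]` when EOF is reported.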



Fri, 07 Mar 2003 03:00:00 GMT  
I seem to recall that there was a bug in the http package but I can't remember details.

I think that the problem with your server code is that you (like many others) do the
test for [eof] in the wrong place. You test [eof] before you actually read, rather
than after a read has 'failed'.

The correct way to do it is as follows.

        set data [read $sock]
        if { [string length $data] == 0 } {
          # A channel event was generated but there was no data, this means
          # that we have reached the [eof] so test for it.
          if { [eof $sock] } {
            puts "****EOF****"
            exit
          }
        } else {
          puts -nonewline stdout $data
        }
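
For anyone who wants to drop this into a fileevent handler, the same ordering can be sketched as a pair of procs. The names `readChunk` and `onReadable` are made up for illustration, and `lassign` means this assumes Tcl 8.5 or later rather than the 8.3 discussed in this thread.

```tcl
# Hypothetical helper: read whatever is available, then classify the result.
# Returns a two-element list {status data}; status is "data", "eof", or
# "again" (a non-blocking channel had nothing to deliver yet).
proc readChunk {chan} {
    set data [read $chan]
    if {[string length $data] > 0} {
        return [list data $data]
    }
    # Only now is [eof] meaningful: it reports on the read just performed.
    if {[eof $chan]} {
        return [list eof ""]
    }
    return [list again ""]
}

# Hypothetical readable handler built on readChunk.
proc onReadable {chan} {
    lassign [readChunk $chan] status data
    switch -- $status {
        data  { puts -nonewline stdout $data }
        eof   { puts "***EOF***"; close $chan }
        again { }
    }
}
```

It would be registered with `fileevent $sock readable [list onReadable $sock]`. Whether this changes anything for the reported problem is a separate question; it only makes sure [eof] is never consulted before a read.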



Sat, 08 Mar 2003 03:00:00 GMT  
Paul Duffin wrote:

Quote:
> I think that the problem with your server code is that you (like many others) do the
> test for [eof] in the wrong place. You test [eof] before you actually read, rather
> than after a read has 'failed'.

Why is this a problem? If [read] failed, the event handler is
called again, and this time [eof] is triggered.

Peter



Sat, 08 Mar 2003 03:00:00 GMT  

Quote:

> Paul Duffin wrote:
> > I think that the problem with your server code is that you (like many others) do the
> > test for [eof] in the wrong place. You test [eof] before you actually read, rather
> > than after a read has 'failed'.

> Why is this a problem? If [read] failed, the event handler is
> called again, and this time [eof] is triggered.

What if [eof] is set when the underlying code detects the end of file,
but before the buffer is empty? I haven't looked at this code recently, so
I don't know the ordering.

Actually, the poster turns off buffering, so this may not be the cause.
However, the man page for [eof] states:
       Returns  1 if an end of file condition occurred during the
       most recent input operation on channelId (such as gets), 0
       otherwise.

Therefore it should really be called after a [read] or [gets] and not before.
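
A quick way to see what the man page wording means in practice is to query [eof] on either side of a read. This is an illustration only: `eofAroundRead` is a made-up name, and it uses an ordinary file rather than a socket, so it says nothing about the socket behaviour being reported.

```tcl
# Hypothetical demo: report [eof] before and after a read on the same channel.
# Returns a list {eofBefore eofAfter charsRead}.
proc eofAroundRead {path} {
    set chan [open $path r]
    # No input operation has happened yet, so [eof] has nothing to report.
    set before [eof $chan]
    set data [read $chan]
    # The read above ran into the end of the file.
    set after [eof $chan]
    close $chan
    return [list $before $after [string length $data]]
}
```

For an empty file this returns `0 1 0`: the channel is already at the end of the file when it is opened, but [eof] does not say so until a read has been attempted.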



Sun, 09 Mar 2003 03:00:00 GMT  

: I think that the problem with your server code is that you (like many others) do the
: test for [eof] in the wrong place. You test [eof] before you actually read, rather
: than after a read has 'failed'.

That isn't what the eof manpage says:

   Returns  1 if an end of file condition occurred during the
   most recent input operation on channelId (such as gets), 0
   otherwise.

So I should only be getting an eof if the previous read reached an EOF.
I don't see how else that can be interpreted...

Anyway we tried doing it with the extra check and the result was the same.

--
         -     ---------- = = ---------//--+
         |    /     Kristoffer Lawson      |  www.fishpool.fi|.com

             |-- Fishpool Creations Ltd - /         |
             +-------- = - - - = ---------      /~setok/



Sun, 09 Mar 2003 03:00:00 GMT  

: Therefore it should really be called after a [read] or [gets] and not before.

Yes, that ordering was tested as well. In all cases we would get an EOF
on the server end before all data is read. Note that this doesn't happen
every time, especially if the client is run without raising its
priority with the 'nice' line (mentioned in the source). If the priority
is raised on the client end it happens quite frequently.

--
         -     ---------- = = ---------//--+
         |    /     Kristoffer Lawson      |  www.fishpool.fi|.com

             |-- Fishpool Creations Ltd - /         |
             +-------- = - - - = ---------      /~setok/



Sun, 09 Mar 2003 03:00:00 GMT  
Here's the response from Brent Welch:

-----------
I can repeat this with both 8.3.1 and 8.3.2 on Linux, but only
in the loopback case where the client and server are on the same host.
It works fine on Solaris loopback, and between Solaris and Linux boxes.
However, I had to comment-out the
    fileevent $sock r {}    ;# this was in the original code but with them
                            ;# it's very hard to reproduce!
                            ;# sudo nice -n -20 for sender should do the
                            ;# job... ;)
call in the server's fileevent handler.  Without that it works - the
nice'ing doesn't have any effect.  So, it could be possible that setting
fileevents is perturbing the system somehow.
-----------

We also looked at something similar before, where we were getting
different results in loopback tests (as opposed to true peer-to-peer
sockets), with Linux falling short of the behavior that Solaris and
Windows provided.

However, if you can make the changes above, and have everything
working cross-platform, that might just be the right solution, no?

Quote:

> Tcl 8.3.1 Bug:  Generated by Ajuba's bug entry form at
> Submitted by:  Kristoffer Lawson
> OperatingSystem:  Linux
> Synopsis:  EOF sent/received too early or possible bug in http package (less likely)
        ....
> DesiredBehavior:
> In the given script all of the data should be sent and outputted to the screen before the EOF notification.

--
   Jeffrey Hobbs                     The Tcl Guy
   hobbs at ajubasolutions.com       Ajuba Solutions (ne Scriptics)


Sun, 09 Mar 2003 03:00:00 GMT  

: I can repeat this with both 8.3.1 and 8.3.2 on Linux, but only
: in the loopback case where the client and server are on the same host.
: It works fine on Solaris loopback, and between Solaris and Linux boxes.
: However, I had to comment-out the
:     fileevent $sock r {}    ;# this was in the original code but with them
:                             ;# it's very hard to reproduce!
:                             ;# sudo nice -n -20 for sender should do the
:                             ;# job... ;)
: call in the server's fileevent handler.  Without that it works - the
: nice'ing doesn't have any effect.  So, it could be possible that setting
: fileevents is perturbing the system somehow.

Well, we did actually test between two Linux boxes, and without that
fileevent. The result is still the same. We remove the fileevent there
because that's what the larger application has to do. I.e., it reads data
from a file and deals with it by entering the event loop, so we don't
want the fileevent script to be called again when new data arrives
until we've dealt with the old stuff.
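
The suspend-and-restore pattern described here can be sketched as a small wrapper. `withReadableSuspended` is a hypothetical name, and the sketch relies on `try ... finally`, so it assumes Tcl 8.6 or later, not the 8.3.x releases under discussion.

```tcl
# Hypothetical helper: run $body in the caller's scope with the readable
# handler on $chan switched off, then restore $handler afterwards.
# Uses try/finally, so it assumes Tcl 8.6 or later.
proc withReadableSuspended {chan handler body} {
    fileevent $chan readable {}
    try {
        uplevel 1 $body
    } finally {
        # The body may have closed the channel; only re-arm if it still exists.
        if {$chan in [chan names]} {
            fileevent $chan readable $handler
        }
    }
}
```

Inside a handler such as `stuff`, the processing that re-enters the event loop would go in the body, with `[list stuff $sock]` passed as the handler to restore. The `finally` clause means the handler is re-armed even if the processing raises an error.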

The http package bug actually now seems to be a separate issue from
this one. There I also have the problem that I don't always get a
complete binary file with http::geturl (e.g. with images), but IIRC
this only occurred with 8.3. When I ran it in wish8.2 I didn't
(immediately) get the same behaviour.

--
         -     ---------- = = ---------//--+
         |    /     Kristoffer Lawson      |  www.fishpool.fi|.com

             |-- Fishpool Creations Ltd - /         |
             +-------- = - - - = ---------      /~setok/



Tue, 11 Mar 2003 03:00:00 GMT  
On Wed, 20 Sep 2000 14:17:51 -0700, Jeffrey Hobbs wrote:

Quote:

>Here's the response from Brent Welch:

>-----------
>I can repeat this with both 8.3.1 and 8.3.2 on Linux, but only
>in the loopback case where the client and server are on the same host.
>It works fine on Solaris loopback, and between Solaris and Linux boxes.
>However, I had to comment-out the
>    fileevent $sock r {}    ;# this was in the original code but with them
>                            ;# it's very hard to reproduce!
>                            ;# sudo nice -n -20 for sender should do the
>                            ;# job... ;)
>call in the server's fileevent handler.  Without that it works - the
>nice'ing doesn't have any effect.  So, it could be possible that setting
>fileevents is perturbing the system somehow.

Perhaps something is loose in the Tcl socket code and it's only
showing up occasionally. These kinds of intermittent failures are really hard
to track down, and it's great when someone comes up with something
that allows you to get a handle on them.

It's more likely that setting that " fileevent $sock r {}" is leaving
something unreset in tcl than it being a system problem, and it's the
kind of thing that drives you nuts when programming. A call like this
looks so innocuous that you'd never suspect it could have side
effects.

Quote:
>We also looked at something similar before, where we were getting
>different results in loopback tests (as opposed to true peer-to-peer
>sockets), with Linux falling short of the behavior that Solaris and
>Windows provided.

I'm sure there's something loose in the Tcl socket code that differs
in its looseness between Windows NT, Windows 98, and Linux. I have had
to add some extra code in http_get that catches some of the failures
that result from this looseness.

Quote:
>However, if you can make the changes above, and have everything
>working cross-platform, that might just be the right solution, no?

No, I don't think it's a solution. A priori, "fileevent $sock r {}" here
is safe and innocuous, and if it is having a side effect, then that's
a problem.

Look at the code below, which replaces "flush $s" in http_get.
I find this code is *required* under Windows NT (4.0 SP 6),
but not under Windows 98 or Linux (lo, ppp0 or eth0).
If it's not there, I get failures under NT maybe 5% of the time
under 8.2.3; with it there I get 0.0% failures.

    if {[catch {flush $s} err]} {
        # Try again once in a half second
        after 500
        if {[catch {flush $s} err]} {
            if {[info exists state(after)]} {after cancel $state(after)}
            # was catch {close $state(sock)}
            # httpLinger seems to be required, even under 8.2.1
            # Otherwise you get repeated usage of the same sock that
            # always errors on the first flush
            httpLinger $state(sock)
            NIMVDebug "http_get-: Error flushing $state(sock)"
            error $err $errorInfo $errorCode
        }
    }
    fileevent $s readable [list httpEvent $token]
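
The retry-once idea in that snippet can be pulled out into a small helper. The following is only a sketch: `retryOnce` is a hypothetical name, it is not part of the http package, and like the code above it blocks in `after` without servicing the event loop. It leaves out the cleanup (`httpLinger`, cancelling the timer) that the real code does on the second failure.

```tcl
# Hypothetical helper: evaluate $script in the caller's scope. If it raises
# an error, wait $delayMs milliseconds and try exactly one more time.
# An error from the second attempt propagates to the caller.
proc retryOnce {script {delayMs 500}} {
    if {[catch {uplevel 1 $script} result]} {
        # Blocking delay, as in the snippet above; the event loop is not serviced.
        after $delayMs
        return [uplevel 1 $script]
    }
    return $result
}
```

With this in place the flush could be written as `retryOnce [list flush $s]`, wrapped in a `catch` if the caller wants to do its own cleanup when both attempts fail.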

We should be thankful we now have a test case that allows us to see
this intermittent (load dependent) failure, and do everything we can
to find its cause.

Mike.


