Upload of binary files 

I am having problems with uploading binary files. I have searched the
archives and found my problem several times but no answers. The problem is
that this little form:

<form action="http://in24/cgi-bin/upload.py" method="post"
enctype="multipart/form-data">
Title:&nbsp;<input type="text" name="Title"><br>
File:&nbsp;<input type="file" name="Image"><br>
<input type="submit" value="-OK-">
</form>

with my very standard Python script at the other end will only work for
ASCII files. With a binary file only a fragment is uploaded.

The Python is:

if form.has_key("Image"):
    fileinfo=form["Image"]
    tmp = open("C:/temp/temporary.dat","wb")
    lines=0
    while 1:
        line = fileinfo.file.readline()
        if not line: break
        lines += 1
        tmp.write(line)

    tmp.close()

which I have copied from various sources (and tried several variants). One
surprising thing is that a .png file uploads %PNG and no more.  This is
Python 2.2.1c2 on Windows 2000 with Apache 1.3.24 for Windows. I'm using
localhost under various names.

Any comments appreciated

Peter



Mon, 29 Aug 2005 02:35:15 GMT  
Quote:

> I am having problems with uploading binary files. I have searched the
[...]
>     while 1:
>         line = fileinfo.file.readline()

[...]

It is _absolutely not_ a good idea to use the readline() method on binary files.
Use the read() method instead, it should work.
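
A minimal sketch of that advice, written for present-day Python 3 rather than the Python 2.2 used in this thread. The helper name `copy_binary`, the 64 KB block size, and the in-memory `io.BytesIO` objects standing in for `fileinfo.file` and the output file are all assumptions made for illustration:

```python
import io

CHUNK_SIZE = 64 * 1024  # arbitrary block size chosen for this sketch


def copy_binary(src, dst, chunk_size=CHUNK_SIZE):
    """Copy src to dst in fixed-size blocks and return the byte count."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty result means end of file
            break
        dst.write(chunk)
        total += len(chunk)
    return total


# Stand-in for fileinfo.file: bytes that contain line endings and a 0x1A byte.
payload = b"\x89PNG\r\n\x1a\n" + bytes(range(256)) * 4
source = io.BytesIO(payload)
target = io.BytesIO()
copied = copy_binary(source, target)
print(copied)
```

Reading by size rather than by line means the loop never depends on where newline bytes happen to fall in the binary data.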

----
levi



Mon, 29 Aug 2005 08:53:25 GMT  
Agreed - I copied the code, mine originally used read(). But read does not
fix it. Here is an even shorter script:

#! C:/python22/python.exe
import cgi
import cgitb; cgitb.enable()

form = cgi.FieldStorage()

if form.has_key("Data"):
    fileinfo=form["Data"]
    data = fileinfo.file.read()

print "Content-type: text/plain\n\n"
print len(data)

----
Text files print their lengths. Graphic files truncate, one to 176 bytes
(actually 176, 177, 176, different each upload). Another gets 3323 bytes up.
A .png file manages only 4.  The HTML is:

<html><head><title>Upload</title></head><body>

<form action="http://in24/cgi-bin/upload.py"
method="post"
enctype="multipart/form-data">
Text File:&nbsp;<input type="file" name="Data"><br>
<input type="submit" value="-OK-">

</form></body></html>

Peter






Mon, 29 Aug 2005 15:58:28 GMT  

Quote:

> if form.has_key("Data"):
>     fileinfo=form["Data"]
>     data = fileinfo.file.read()

This won't do. It is not guaranteed that a single read() returns
all of your data. Rather, it is likely to return a chunk at a
time. So what you probably have to do is extract the Content-Length
and loop over a read() (appending the results) until you've
got a result that is the correct length.
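
The loop described here could be sketched as follows in present-day Python 3. The `read_exact` helper and the `ShortReader` class (a stream that deliberately returns at most 3 bytes per call) are hypothetical names invented for this illustration; in a CGI script the expected length would normally come from the `CONTENT_LENGTH` environment variable:

```python
import io


class ShortReader:
    """Hypothetical stream that hands back at most 3 bytes per read() call."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, size=-1):
        if size < 0 or size > 3:
            size = 3
        return self._buf.read(size)


def read_exact(stream, length):
    """Keep calling read() until length bytes have arrived or the stream ends."""
    parts = []
    remaining = length
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:  # stream ended before the expected length was reached
            break
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)


body = read_exact(ShortReader(b"0123456789"), 10)
print(body)
```

The early `break` means the caller gets a short result instead of an endless loop when the sender delivers less than it promised, so the length of the result should still be checked.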

Irmen.



Mon, 29 Aug 2005 17:33:34 GMT  
OK - I've changed the code so it is now:

#! C:/python22/python.exe
import cgi
import cgitb; cgitb.enable()

form = cgi.FieldStorage()

if form.has_key("Data"):
    fileinfo=form["Data"]
    read_in=0
    iteration=0
    while 1:
        chunk = fileinfo.file.read()
        if len(chunk) == 0:
            break
        read_in += len(chunk)
        iteration += 1

print "Content-type: text/plain\n\n"
print "Bytes read:",read_in
print "Read iterations",iteration

With my Apache httpd.conf file I get:
 Bytes read: 36175
 Read iterations 1

With a 3K gra.png file I get:
        Bytes read: 4
        Read iterations 1

With a 66K JPEG file I get:
        Bytes read: 164
        Read iterations 1

So plainly it is _not_ a question of chunked reads. The data is not being
delivered.

This experiment was with Windows XP Professional rather than W2K, on a
different machine. Python 2.3a2 behaves in the same way. Using Netscape 7
instead of IE6 makes no difference.

I tried just using standard input:

import sys

read_in=0
iteration=0
while 1:
    chunk = sys.stdin.read()
    if len(chunk) == 0:
        break
    read_in += len(chunk)
    iteration += 1

print "Content-type: text/plain\n\n"
print "Bytes read:",read_in
print "Read iterations",iteration

but the behaviour is exactly the same. The only thing that I have not tried
is using a 'real' network rather than looping back with 127.0.0.1 which is
about all I can think of.

Running out of ideas ..

Peter






Mon, 29 Aug 2005 21:20:44 GMT  
I used the cgi module under .asp, and the following code snippet works
there. (I have removed unnecessary code.):

     # iisUtils.winFieldStorage is a subclass of the FieldStorage that
     # runs under IIS but works the same as FieldStorage

     import iisUtils

     req = iisUtils.winFieldStorage(Request)
     image = req['image']
     if image.file: # is there a file uploaded?
         filename = image.filename
         f = open('c:/path/to/files/folder/%s' % filename , 'wb')
         f.write(image.file.read())
         f.close()

I hope it is of some help.

--

hilsen/regards Max M Rasmussen, Denmark

http://www.futureport.dk/
The future, science, skepticism and transhumanism



Mon, 29 Aug 2005 22:20:56 GMT  
stdin and stdout are opened in text mode, since they are normally used
for text.  On Windows you need to switch them to binary mode if you
will read or write binary data.  I think you need to switch stdin
over before 'import cgi' as that may parse input immediately.  The
following code should do the trick:

try:
    import msvcrt, os
    msvcrt.setmode(0, os.O_BINARY) # stdin = 0
    msvcrt.setmode(1, os.O_BINARY) # stdout = 1
except ImportError:
    pass
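
To see why text mode produces such short fragments, here is a rough simulation in present-day Python 3. It assumes the usual behaviour of the Microsoft C runtime in text mode: CR LF pairs are translated to LF, and a Ctrl-Z byte (0x1A) is treated as end of file. The function `simulate_windows_text_read` is a hypothetical stand-in for that behaviour, not a real library call:

```python
def simulate_windows_text_read(raw):
    """Approximate what a text-mode read on Windows returns for raw bytes."""
    eof = raw.find(b"\x1a")  # Ctrl-Z is treated as end of file
    if eof != -1:
        raw = raw[:eof]
    return raw.replace(b"\r\n", b"\n")  # CR LF pairs collapse to LF


# Every PNG file starts with this 8-byte signature, which contains 0x1A.
png_signature = b"\x89PNG\r\n\x1a\n"
seen = simulate_windows_text_read(png_signature + b"rest of the image data")
print(seen)
```

Under these assumptions the reader stops inside the PNG signature, which is consistent with the report that a .png upload delivered only the first few bytes. Other binary formats would stop wherever their first 0x1A byte happens to fall.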



Mon, 29 Aug 2005 23:25:44 GMT  

Quote:


> > if form.has_key("Data"):
> >     fileinfo=form["Data"]
> >     data = fileinfo.file.read()

> This won't do. It is not guaranteed that a single read() returns
> all of your data.

Since when?  I know with sockets you don't get a guarantee that
recv() gets all the data, but the docs on read() for a file
object say:

  read ([size])
  Read at most size bytes from the file (less if the read hits EOF before
  obtaining size bytes). If the size argument is negative or omitted,
  read all data until EOF is reached.

That reads like a guarantee to me.  Do I have to go and wrap all
our code that uses read() with a retry loop?

-Peter



Mon, 29 Aug 2005 23:28:25 GMT  

Quote:

>>This won't do. It is not guaranteed that a single read() returns
>>all of your data.

> Since when?  I know with sockets you don't get a guarantee that
> recv() gets all the data, but the docs on read() for a file
> object say:

>   read ([size])
>   Read at most size bytes from the file (less if the read hits EOF before
>   obtaining size bytes). If the size argument is negative or omitted,
>   read all data until EOF is reached.

I *may* indeed be wrong here, but consider this:
I was thinking.. "from your CGI code you're actually reading from
a socket, that's wrapped in a file-like object".
Sockets behave like I said. So I thought: the read() behaves like this too.
Even more so because there is no such thing as "EOF" on sockets.
How can the read() know when all data have been read?

Irmen

PS a read() on a *true* file would read everything, of course, like
you pointed out.



Tue, 30 Aug 2005 06:10:33 GMT  

Quote:


> >>This won't do. It is not guaranteed that a single read() returns
> >>all of your data.

> > Since when?  I know with sockets you don't get a guarantee that
> > recv() gets all the data, but the docs on read() for a file
> > object say:

> >   read ([size])
> >   Read at most size bytes from the file (less if the read hits EOF before
> >   obtaining size bytes). If the size argument is negative or omitted,
> >   read all data until EOF is reached.

> I *may* indeed be wrong here, but consider this:
> I was thinking.. "from your CGI code you're actually reading from
> a socket, that's wrapped in a file-like object".
> Sockets behave like I said. So I thought: the read() behaves like this too.
> Even more so because there is no such thing as "EOF" on sockets.
> How can the read() know when all data have been read?

As _I_ read (no pun intended) the description in the docs, it would
seem that the guts of read() will basically go back and block on the
socket, until recv() eventually returns '' which indicates the
equivalent of EOF (i.e. socket closed).  If that's so, then you
still are guaranteed that read() will get all your data, though it
might have to block to do so.

I haven't tried that though... I've never used makefile() on a
socket.
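
For anyone who wants to try it, a small experiment along these lines can be sketched in present-day Python 3 with `socket.socketpair()`. The helper name `send_in_pieces` and the piece sizes are assumptions made for the illustration:

```python
import socket
import threading


def send_in_pieces(sock, pieces):
    """Send several separate pieces, then close to signal end of data."""
    for piece in pieces:
        sock.sendall(piece)
    sock.close()  # closing is what the reader sees as EOF


writer, reader = socket.socketpair()
pieces = [b"a" * 1000, b"b" * 2000, b"c" * 3000]
sender = threading.Thread(target=send_in_pieces, args=(writer, pieces))
sender.start()

# makefile() wraps the socket in a file object; read() with no size
# keeps reading until the peer closes its end.
with reader.makefile("rb") as stream:
    data = stream.read()

sender.join()
reader.close()
print(len(data))
```

In this sketch a single `read()` on the wrapped socket returns all three pieces, because it only returns once the sending side has closed.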

Quote:
> Irmen

> PS a read() on a *true* file would read everything, of course, like
> you pointed out.

That reassures me. Thanks.  :-)

-Peter



Tue, 30 Aug 2005 07:43:46 GMT  

Quote:

> I *may* indeed be wrong here, but consider this:
> I was thinking.. "from your CGI code you're actually reading from
> a socket, that's wrapped in a file-like object".
> Sockets behave like I said. So I thought: the read() behaves like this too.
> Even more so because there is no such thing as "EOF" on sockets.
> How can the read() know when all data have been read?

The cgi module returns a FieldStorage object.

If you have uploaded a file, cgi will take care of the actual uploading,
and save the uploaded file in a file-like object.

This file is what you get from the FieldStorage object.

So ... you simply worry too much ;-) It's easier than you think.

--

hilsen/regards Max M Rasmussen, Denmark

http://www.futureport.dk/
The future, science, skepticism and transhumanism



Tue, 30 Aug 2005 20:14:49 GMT  

Quote:


>>>This won't do. It is not guaranteed that a single read() returns
>>>all of your data.

>> Since when?  I know with sockets you don't get a guarantee that
>> recv() gets all the data, but the docs on read() for a file
>> object say:

>>   read ([size])
>>   Read at most size bytes from the file (less if the read hits EOF before
>>   obtaining size bytes). If the size argument is negative or omitted,
>>   read all data until EOF is reached.

> I *may* indeed be wrong here, but consider this:
> I was thinking.. "from your CGI code you're actually reading from
> a socket, that's wrapped in a file-like object".

You are probably reading from a pipe, actually.  HTTP/1.1 allows multiple
requests to be sent on the same socket, but a CGI process should only see
one of them.  Therefore a web server that supports HTTP/1.1 will probably
send a single request to it down a pipe.

Quote:
> Sockets behave like I said. So I thought: the read() behaves like this too.
> Even more so because there is no such thing as "EOF" on sockets.

There is on stream sockets.  When your peer closes its socket, you see
that as EOF.

Quote:
> How can the read() know when all data have been read?

> Irmen

> PS a read() on a *true* file would read everything, of course, like
> you pointed out.

A read() on any file-like object should read everything.  (Having said
that, EOF on a terminal might not be for real - the reader can ignore
it and the user can keep on typing.)
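
The pipe case can be checked with a short experiment in present-day Python 3. This is only a sketch: the helper name `write_chunks` and the chunk sizes are assumptions, and a thread stands in for the web server that would normally sit at the writing end of the pipe:

```python
import os
import threading


def write_chunks(fd, chunks):
    """Write the chunks one at a time; leaving the with-block closes the pipe."""
    with os.fdopen(fd, "wb") as pipe_in:
        for chunk in chunks:
            pipe_in.write(chunk)
            pipe_in.flush()


read_fd, write_fd = os.pipe()
chunks = [bytes([i]) * 4096 for i in range(8)]
producer = threading.Thread(target=write_chunks, args=(write_fd, chunks))
producer.start()

# read() with no size argument returns only when the writer has closed.
with os.fdopen(read_fd, "rb") as pipe_out:
    data = pipe_out.read()

producer.join()
print(len(data))
```

Here one `read()` call collects everything that was written in eight separate steps, since end of file on a pipe arrives only when the writing end is closed.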


Tue, 30 Aug 2005 21:18:02 GMT  
The problem I described has been solved by Ben Hutchings' post. The
code below works as we would wish: the binary file is uploaded and the
number of bytes is returned to the client:

try:
    import msvcrt, os
    msvcrt.setmode(0, os.O_BINARY) # stdin = 0
    msvcrt.setmode(1, os.O_BINARY) # stdout = 1
except ImportError:
    pass

import cgi
import cgitb; cgitb.enable()

form = cgi.FieldStorage()

if form.has_key("Data"):
    fileinfo=form["Data"]
    chunk = fileinfo.file.read()

print "Content-type: text/plain\n\n"
print "Bytes read:",len(chunk)

The point is that without the msvcrt.setmode() calls (msvcrt provides
low-level Windows support), stdin stays in text mode, so the cgi module
treats all files as text and binary uploads are truncated. I am not a
seasoned Python person, and I don't think the above code is very
'Pythonic', but for me it is a perl none the less. Thanks Ben!

Peter





Wed, 31 Aug 2005 02:22:49 GMT  

Quote:
> The problem I described has been solved by Ben Hutchings' post. The
> code below works as we would wish: the binary file is uploaded and the
> number of bytes is returned to the client:

> try:
>     import msvcrt, os
>     msvcrt.setmode(0, os.O_BINARY) # stdin = 0
>     msvcrt.setmode(1, os.O_BINARY) # stdout = 1
> except ImportError:
>     pass

> import cgi
> import cgitb; cgitb.enable()

> form = cgi.FieldStorage()

> if form.has_key("Data"):
>     fileinfo=form["Data"]
>     chunk = fileinfo.file.read()

> print "Content-type: text/plain\n\n"
> print "Bytes read:",len(chunk)

> The point is that without the msvcrt.setmode() calls (msvcrt provides
> low-level Windows support), stdin stays in text mode, so the cgi module
> treats all files as text and binary uploads are truncated. I am not a
> seasoned Python person, and I don't think the above code is very
> 'Pythonic', but for me it is a perl none the less. Thanks Ben!

One last thing you might try if your web server honors the shebang
convention (as Xitami and Apache both do, for example, even on Windows) is
to use a first line similar to

#!C:/python22/python -u

This will cause Python to run unbuffered and, on Windows, also puts stdin
and stdout into binary mode, which should obviate the need to explicitly
use the msvcrt module.

regards
--
Steve Holden                                  http://www.holdenweb.com/
Python Web Programming                 http://pydish.holdenweb.com/pwp/
Register for PyCon now!            http://www.python.org/pycon/reg.html



Fri, 02 Sep 2005 00:51:20 GMT  

Quote:
> One last thing you might try if your web server honors the shebang
> convention (as Xitami and Apache both do, for example, even on Windows) is
> to use a first line similar to

> #!C:/python22/python -u

Well b****r me. This works too.

Peter



Sat, 03 Sep 2005 01:24:27 GMT  
 