[Python-Dev] urllib, multipart/form-data encoding and file uploads

Chris AtLee chris at atlee.ca
Mon Jun 30 19:18:17 CEST 2008


On Sat, Jun 28, 2008 at 4:14 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> I didn't see any recent discussion about this so I thought I'd ask
>> here: do you think this would make a good addition to the new urllib
>> package?
>
> Just in case that isn't clear: any such change must be delayed for
> 2.7/3.1. That is not to say that you couldn't start implementing it
> now, of course.

I like a challenge :)

As discussed previously, there are two parts to this: handling
streaming HTTP requests, and multipart/form-data encoding.

I notice that support for file objects has already been added to 2.6's
httplib.  The changes required to support iterable objects as well are
minimal:

Index: Lib/httplib.py
===================================================================
--- Lib/httplib.py	(revision 64600)
+++ Lib/httplib.py	(working copy)
@@ -688,7 +688,12 @@
         self.__state = _CS_IDLE

     def send(self, str):
-        """Send `str' to the server."""
+        """Send `str` to the server.
+
+        ``str`` can be a string object, a file-like object that supports
+        a .read() method, or an iterable object that supports a .next()
+        method.
+        """
         if self.sock is None:
             if self.auto_open:
                 self.connect()
@@ -710,6 +715,10 @@
                 while data:
                     self.sock.sendall(data)
                     data=str.read(blocksize)
+            elif hasattr(str,'next'):
+                if self.debuglevel > 0: print "sending an iterable"
+                for data in str:
+                    self.sock.sendall(data)
             else:
                 self.sock.sendall(str)
         except socket.error, v:

(A small aside: should the parameter named 'str' be renamed to
something else so that it doesn't shadow the 'str' builtin?)

All regression tests continue to pass with this change applied.
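
To make the dispatch order in the patch concrete, here is a minimal
standalone sketch of the same logic. It is written for Python 3, where
iterators expose __next__ rather than next; send_body and
RecordingSocket are hypothetical names used only for illustration and
are not part of httplib.

```python
import io


class RecordingSocket:
    """Stand-in for a real socket; records what sendall() receives."""

    def __init__(self):
        self.sent = []

    def sendall(self, data):
        self.sent.append(data)


def send_body(sock, body, blocksize=8192):
    """Mirror the branch order of the patched send()."""
    if hasattr(body, "read"):
        # File-like object: read and send one block at a time.
        data = body.read(blocksize)
        while data:
            sock.sendall(data)
            data = body.read(blocksize)
    elif hasattr(body, "__next__"):
        # Iterator (the Python 3 spelling of .next()): send each chunk.
        for data in body:
            sock.sendall(data)
    else:
        # Plain bytes: send in one call.
        sock.sendall(body)


def chunks():
    yield b"first "
    yield b"second"


sock = RecordingSocket()
send_body(sock, chunks())
send_body(sock, io.BytesIO(b"file data"))
send_body(sock, b"plain")
```

The file-like check comes first, so an object that has both read() and
an iterator interface is still read in blocks.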

If this change is not applied, then we have to jump through a couple
of hoops to support iterable HTTP request bodies:

- Provide our own httplib.HTTP(S)Connection classes that override
send() to do exactly what the patch above accomplishes

- Provide our own urllib2.HTTP(S)Handler classes that will use the new
HTTP(S)Connection classes

- Register the new HTTP(S)Handler classes with urllib2 so they take
priority over the standard handlers
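
The three steps above might look roughly like the following. This is a
sketch under assumptions, not the code from my package: it uses the
Python 3 module names (http.client and urllib.request in place of
httplib and urllib2), the class names are hypothetical, and only the
plain HTTP pair is shown; an HTTPS pair would follow the same pattern.
Later Python 3 releases of http.client accept iterable bodies on their
own, so the send() override is illustrative only.

```python
import http.client
import urllib.request


class IterableHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection whose send() also accepts an iterator of byte chunks."""

    def send(self, data):
        # Iterators (but not file-like objects) are sent chunk by chunk;
        # everything else falls through to the stock implementation.
        if hasattr(data, "__next__") and not hasattr(data, "read"):
            if self.sock is None:
                if self.auto_open:
                    self.connect()
                else:
                    raise http.client.NotConnected()
            for chunk in data:
                self.sock.sendall(chunk)
        else:
            http.client.HTTPConnection.send(self, data)


class IterableHTTPHandler(urllib.request.HTTPHandler):
    """Handler that opens http:// URLs with the connection class above."""

    def http_open(self, req):
        return self.do_open(IterableHTTPConnection, req)


# build_opener() leaves out a default handler when a subclass of it is
# passed in, so this takes the place of the standard HTTPHandler.
opener = urllib.request.build_opener(IterableHTTPHandler)
```

Calling urllib.request.install_opener(opener) would then make this the
opener used by urlopen().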

I've created the necessary sub-classes, as well as several classes and
functions to do multipart/form-data encoding of strings and files.  My
current work is available online here: http://atlee.ca/software/poster
(tarball here: http://atlee.ca/software/poster/dist/0.1/poster-0.1dev.tar.gz)
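
For readers unfamiliar with the encoding, here is a rough sketch of how
a multipart/form-data body can be produced as an iterator, which is
what makes it a natural fit for a streaming send(). This is not the
API of the package linked above: encode_multipart and its parameters
are hypothetical, it is written for Python 3, and it does no escaping
of field names or filenames.

```python
import io
import uuid


def encode_multipart(fields, files, boundary=None, blocksize=4096):
    """Return (content_type, body_iterator) for a multipart/form-data body.

    fields: sequence of (name, text) pairs
    files:  sequence of (name, filename, fileobj) triples, where
            fileobj.read() returns bytes
    """
    if boundary is None:
        boundary = uuid.uuid4().hex

    def body():
        for name, value in fields:
            yield ("--%s\r\n"
                   'Content-Disposition: form-data; name="%s"\r\n'
                   "\r\n" % (boundary, name)).encode("utf-8")
            yield value.encode("utf-8")
            yield b"\r\n"
        for name, filename, fileobj in files:
            yield ("--%s\r\n"
                   'Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
                   "Content-Type: application/octet-stream\r\n"
                   "\r\n" % (boundary, name, filename)).encode("utf-8")
            # Stream the file in blocks instead of loading it whole.
            chunk = fileobj.read(blocksize)
            while chunk:
                yield chunk
                chunk = fileobj.read(blocksize)
            yield b"\r\n"
        # Closing delimiter.
        yield ("--%s--\r\n" % boundary).encode("utf-8")

    return "multipart/form-data; boundary=%s" % boundary, body()


content_type, body = encode_multipart(
    [("title", "hello")],
    [("upload", "a.txt", io.BytesIO(b"file contents"))],
    boundary="XYZ",
)
payload = b"".join(body)
```

A real upload would also need a Content-Length header computed before
the body is sent (or chunked transfer encoding), which this sketch
leaves out.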

Cheers,
Chris

