[Python-Dev] Py3K thought: use external library for client-side HTTP

Brett Cannon brett at python.org
Fri Mar 17 23:43:24 CET 2006


On 3/17/06, A.M. Kuchling <amk at amk.ca> wrote:
> Thought: We should drop all of httplib, urllib, urllib2, and ftplib,
> and instead adopt some third-party library for HTTP/FTP/whatever,
> write a Python wrapper, and use it instead.  (The only such library I
> know of is libcurl, but doubtless there are other candidates; see
> http://curl.haxx.se/libcurl/competitors.html for a list.)
>
> Rationale:
>
> * HTTP client-side support is pretty complicated.  HTTP itself
>   has many corners (httplib.py alone is 1420 lines long, and urllib2
>   is 1300 lines).
>
> * There are many possible permutations of proxying, SSL on/off,
>   and authentication.  We probably haven't tested every permutation,
>   and probably lack the volunteer effort to test them all.
>   If you search for 'http' in the bug tracker, you find about 16 or so
>   bugs submitted for httplib/urllib/urllib2, most of them for one
>   permutation or another.
>
>   With a third-party library, the work of maintaining RFC compliance falls
>   to someone else.
>
> * A third-party library might support more features than we have time
>   to implement.
>
> A downside: these libraries would be in C, and might be the source of
> security bugs.  Python code may be buggy, but probably won't fall prey
> to buffer overflow.  We'd also have to keep in sync with the library.
>

There is also the issue that PyPy could have problems since they have
always preferred we keep pure Python versions of stuff around when
possible (I assume IronPython has .NET Internet libraries to use).

> Similar arguments could be made for a server-side solution, but here I
> have no idea what we might choose.  A server-side HTTP implementation
> + a WSGI gateway might be all that Python 3000 needs.
>
> Good idea?  Dumb idea?

Possibly good.  We have the precedent of zlib, expat, etc.  The key
is probably that the license is compatible with ours (which libcurl's
seems to be: MIT/X derivative).

I know that, having fixed urllib bugs, I sure wouldn't mind if I didn't
have to read another RFC on URL formats.  =)

But maybe this also poses a larger question of where for Py3K we want
to take the stdlib.  Ignoring its needed cleanup and nesting of the
namespace, do we want to try to use more external tools by importing
them and writing a Pythonic wrapper?  Or do we want to not do that and
try to keep more things under our control and go with the status quo?
Or do we want to really prune down the stdlib and use more dynamic
downloading a la Cheeseshop and setuptools?

I support the first, even though it creates problems for PyPy, since it
should allow us to have higher-quality code with less effort on our
part.  I also support the second since we seem to be able to pull it
off.  For the third option I would want to be very careful about what
is and is not included, since Python's "batteries included" approach is
an important part of Python and I would not want that to suffer.

-Brett

