urllib, urllib2, httplib -- Begging for consolidation?

Norbert Klamann norbert.klamann at klamann-software.de
Wed May 8 01:39:08 EDT 2002


brueckd at tbye.com wrote in message news:<mailman.1020782351.10319.python-list at python.org>...
> On Tue, 7 May 2002, A. Keyton Weissinger wrote:
> 
> > Am I the only one that thinks these need to be pulled together some? I saw a
> > PEP (268?) where there are some rumblings about adding some things to it as
> > well. Maybe a combo project?
> 
> Yes, part of the problem is that it's not obvious when you should use 
> which (e.g. urllib vs. urllib2).
> 
> BUT, if there were to occur some sort of consolidation (meaning, 
> introducing incompatibilities or a whole new module), then we should use 
> that as an opportunity to restructure/redesign that whole set of modules 
> because, IMO, they've evolved past their original design. If we can come 
> up with a good organization, the actual implementation could be handled by 
> various members of the community.
> 
> The original premise of urllib, that it helps your app open any type of 
> URL in roughly the same way, is pretty neat but now both urllib and 
> urllib2 have lots of stuff tacked on that is pretty HTTP-specific. Also, 
> I usually need to support only one protocol and I know in advance which 
> that is (usually HTTP, sometimes FTP), but the httplib docs imply that 
> httplib is more of an internal module.
> 
> So... if we were to change something, I'd like us to build a rich HTTP
> library that supports the super easy use case (gimme the data at this URL,
> optionally posting this data right here first) as well as more complicated
> cases (add in these request headers before sending the request to the
> origin). It would be in this module (or one closely tied to it) that we'd
> capture knowledge about the HTTP protocol, such as parsing and building
> HTTP 1.0 and 1.1 compliant request and response headers, handling cookies,
> basic and digest authentication, '\n' vs. '\r\n' line endings, easy-to-use
> HTTPS, etc. Supporting routines (like quote, urlencode, urlparse) can
> either be imported and exposed through the HTTP module, or kept in a
> module with better defined boundaries.
> 
> We could take the same approach with other protocols, and include modules 
> for FTP, plain files, etc. With all those in place we could still have the 
> "open any type of URL" routine built on top, but it should work only for 
> the simplest of use cases; if you need something more complex then you'd 
> go use the corresponding protocol library yourself.
> 
> I'm not suggesting that we scrap the current protocol modules (they've been 
> very, very useful); it's just that over time they've grown up and are due 
> for some redesign/refactoring (the kind that will not be backwards 
> compatible).
> 
> -Dave
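
To make Dave's two use cases concrete, here is a rough sketch of what the
"super easy" entry point and the "add in these request headers" case could
look like. It is only an illustration, not a proposed API: the names
build_request and fetch are made up for this example, and it is written
against urllib.request, the module that urllib2 was later folded into in
Python 3, so that it runs on a current interpreter.

```python
import urllib.parse
import urllib.request


def build_request(url, form=None, headers=None):
    """Prepare a request object without sending it.

    Hypothetical helper: passing form fields turns the request into a POST,
    and any extra headers are attached before it goes out.
    """
    data = None
    if form is not None:
        # urlencode builds "key=value&..." text; the body must be bytes.
        data = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(url, data=data, headers=headers or {})


def fetch(url, form=None, headers=None):
    """Easy case: return the body at url, optionally posting form first."""
    with urllib.request.urlopen(build_request(url, form, headers)) as resp:
        return resp.read()
```

The simple call is then fetch(url), and the more involved one passes form
and headers without the caller having to touch a different module.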

Somewhere in this structure, the handling of firewalls should be integrated as well.
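
As a sketch of the kind of thing I mean, and assuming the firewall in
question is crossed through an HTTP proxy, the proxy can be configured once
and every later request routed through it. The helper name
opener_via_proxy and the proxy address in the usage note are invented for
the example; ProxyHandler and build_opener are the existing pieces from
urllib2, shown here under their Python 3 home urllib.request.

```python
import urllib.request


def opener_via_proxy(proxy_url):
    """Return an opener that sends HTTP and HTTPS requests through proxy_url.

    Hypothetical helper; it only wires up the handler and opens no connection.
    """
    handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    # Supplying our own ProxyHandler replaces the default one that
    # build_opener would otherwise install.
    return urllib.request.build_opener(handler)
```

Usage would be along the lines of
opener_via_proxy("http://proxy.example.invalid:3128").open(url), with the
host and port replaced by whatever the local firewall requires.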

All the best

Norbert


