[Python-Dev] urllib2 and urllib

Moshe Zadka moshez@zadka.site.co.il
Wed, 28 Feb 2001 12:43:08 +0200 (IST)


(Full disclosure: I've been paid to hack on urllib2)

For a long time I've been feeling that urllib is a bit hackish, and
not really suited to conveniently script web sites. The classic example
is the interface to passwords, whose default behaviour is to stop
and ask the user(!).

Jeremy has had urllib2 out for about a year and a half, and now that I've
finally managed to have a look at it, I'm very impressed with the
architecture, and I think it's superior to urllib.

From the "outside" it's not that different from urllib, in that it
mainly has a "urlopen" function (no urlretrieve, which I always felt
was misplaced). Its configurability is much different, though, and
IMHO much more pleasant.
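
To make the configurability point concrete, here is a minimal sketch of
scripting a password-protected site without any prompt. It assumes the class
names found in the urllib2 shipped with released Python (the fallback import
is the same design under its Python 3 name); the code discussed in this post
may differ in detail, and the URL and credentials are made up.

```python
try:
    import urllib2                      # the module discussed in this post
except ImportError:
    import urllib.request as urllib2    # same design under its Python 3 name

# Credentials live in a password manager object instead of being
# requested from the user at the terminal.
passwords = urllib2.HTTPPasswordMgrWithDefaultRealm()
passwords.add_password(None,                                # any realm
                       "http://www.example.com/private/",   # hypothetical URL
                       "moshe", "secret")                   # hypothetical login

# An opener is assembled from handler objects; each handler covers one
# concern (here: answering HTTP basic authentication challenges).
opener = urllib2.build_opener(urllib2.HTTPBasicAuthHandler(passwords))

# opener.open(url) would now fetch pages under /private/ unattended, and
# urllib2.install_opener(opener) would make plain urlopen() use it too.
```

Nothing here touches the network; the handlers only come into play once
a URL is actually opened.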

The code, however, was a bit stale and a bit too "play-groundish".
Fortunately, I've been paid to add some features to the code, and I have
already added most of the urllib features that were missing, plus some
features that are not in urllib (for example, proxy authentication).
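
As a rough illustration of the proxy authentication case, the sketch below
combines a proxy handler with a handler that answers the proxy's own
password challenge. The class names are assumed from the urllib2 in released
Python and may not match the patch exactly; the proxy address and
credentials are hypothetical.

```python
try:
    import urllib2                      # the module discussed in this post
except ImportError:
    import urllib.request as urllib2    # same design under its Python 3 name

# Route http requests through a (hypothetical) proxy.
proxy = urllib2.ProxyHandler({"http": "http://proxy.example.com:3128"})

# Credentials for the proxy itself, kept separate from any site passwords.
proxy_passwords = urllib2.HTTPPasswordMgrWithDefaultRealm()
proxy_passwords.add_password(None, "proxy.example.com:3128",
                             "moshe", "secret")
proxy_auth = urllib2.ProxyBasicAuthHandler(proxy_passwords)

# Passing a ProxyHandler instance replaces the default one that
# build_opener would otherwise add.
opener = urllib2.build_opener(proxy, proxy_auth)
```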

It will still need some work to be an industrial-strength client library
(e.g., client-side cookie support, referer support in redirections,
support for 303 redirection), but most of these are much easier to do
based on what is currently urllib2. 

A major misfeature of urllib2 up to now was that it was not documented.
Fortunately, my client saw it as a problem too, so I have a rough sketch
of a library reference chapter, and I will write a Python HOWTO before
finishing with this project.

There are several problems with adopting urllib2 as the new
standard library module for writing client-side code:

1. Its extension interface is not backwards compatible with urllib's --
   that's a real problem, because the current interface was *designed*
   to be different.
2. The name: urllib2 is just an awful name for anything. It should
   be changed, with a compat. module named "urllib2" that does a
   "from ... import *" on the new module. I don't have any strong feelings
   about the new name, as long as there are no numbers inside (<0.9 wink>).
3. Too close to beta: that's a valid concern, and it should be possible
   to say "newurl" is still experimental in 2.1, and make it the official
   module only in 2.2.
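
The compatibility module from item 2 could be a single line. The sketch
below assumes the placeholder name "newurl" from item 3 and fakes that
module in memory so the example is self-contained; the stand-in urlopen is
purely hypothetical.

```python
import sys
import types

# Stand-in for the renamed module. "newurl" is only the placeholder name
# from item 3, and this urlopen is a dummy for illustration.
newurl = types.ModuleType("newurl")

def urlopen(url):
    return "opened %s" % url

newurl.urlopen = urlopen
sys.modules["newurl"] = newurl

# The entire body of a compatibility urllib2.py would be this one line.
COMPAT_SOURCE = "from newurl import *\n"

# Run that line in a fresh module namespace, as importing the file would.
compat = types.ModuleType("urllib2_compat")
exec(COMPAT_SOURCE, compat.__dict__)

# Old code calling the old module name reaches the new implementation.
result = compat.urlopen("http://www.python.org/")
```

Note that "import *" skips names starting with an underscore unless the new
module lists them in __all__, so the shim only covers the public interface.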

This all has to do with the libraries-voting-procedure (PEP-0002), which
Eric has been neglecting lately..<wink> <wink> <nudge> <nudge>
(patch number 404826)
-- 
"I'll be ex-DPL soon anyway so I'm        |LUKE: Is Perl better than Python?
looking for someplace else to grab power."|YODA: No...no... no. Quicker,
   -- Wichert Akkerman (on debian-private)|      easier, more seductive.
For public key, finger moshez@debian.org  |http://www.{python,debian,gnu}.org