[Python-Dev] Finally switch urllib.parse to RFC3986 semantics?

Senthil Kumaran orsenthil at gmail.com
Fri Mar 18 04:19:36 CET 2011


Nick Coghlan wrote:
> > The problem is that it is quite a lot of work to get fully general URI
> > parsing to work correctly, but the overlap with legacy URL parsing is
> > large enough that many (most?) use cases in practice work just fine
> > with the older RFC semantics.

Yes. We can have an API which strictly conforms to the latest RFC by
definition, but the problem is that there is code out there which
'expects' the parsing behavior to remain unchanged, so that it does
not break. And keeping the parsing behavior unchanged means conforming
to the older RFC parsing rules.

The solution seems to be an extra function, or a flag in the urlparse
function, which will exhibit the newer behavior.
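
A rough sketch of what the two options might look like side by side.
The names urisplit_rfc3986 and split_url, and the rfc3986 keyword, are
made up for illustration and are not part of urllib.parse; the regular
expression is the one given in RFC 3986, Appendix B. This is only an
assumption about the shape of such an API, not a proposal for the
actual implementation:

```python
import re
from collections import namedtuple
from urllib.parse import urlsplit

# Regular expression for splitting a URI reference, from RFC 3986, Appendix B.
_RFC3986_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

# Hypothetical result type; a component that is absent is None,
# which keeps "undefined" distinct from "present but empty".
SplitURI = namedtuple("SplitURI", "scheme authority path query fragment")


def urisplit_rfc3986(uri):
    """Hypothetical 'conforming' function: split per RFC 3986, Appendix B."""
    m = _RFC3986_RE.match(uri)  # every part is optional, so this always matches
    return SplitURI(m.group(2), m.group(4), m.group(5), m.group(7), m.group(9))


def split_url(url, rfc3986=False):
    """Hypothetical flag-based entry point.

    With the default flag, existing callers keep the legacy urlsplit
    result; passing rfc3986=True opts in to the newer behavior.
    """
    if rfc3986:
        return urisplit_rfc3986(url)
    return urlsplit(url)


url = "http://www.ics.uci.edu/pub/ietf/uri/#Related"
legacy = split_url(url)                 # legacy result, unchanged behavior
strict = split_url(url, rfc3986=True)   # RFC 3986 style result
```

With this sketch, legacy.query is an empty string while strict.query is
None, so existing code sees no change unless it asks for the new
behavior.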

Guido wrote:

> So would having two different API functions, one legacy and one
> conforming, be a problem? Ideally the conforming API's name would not
> be something lame like urllib2 but something timeless. :-)

:-) Should blame Jeremy for that name! But urllib2 has long been
replaced by urllib.parse, urllib.request and urllib.response.
Considering how you remember urllib2, I think its name has stood the
test of time.

But seriously, I think an additional function, or an additional flag in
the current functions/methods of the parse module, is sufficient,
rather than going for another module.

-- 
Senthil

