[XML-SIG] Ideas for web/ package

Andrew Kuchling akuchlin@mems-exchange.org
Fri, 15 Feb 2002 12:44:43 -0500


On Fri, Feb 15, 2002 at 12:31:32PM -0500, Fred L. Drake, Jr. wrote:
>Perhaps the urlparse module should be re-written in C, though.  But
>not today.  I think Skip did part of this some time ago as his urlop
>module.

As part of the RELAX NG stuff, I've discovered that urlparse() is
really lenient in its parsing.  For example, the fragment value is ''
if no fragment is supplied, so you can't distinguish between
http://www.amk.ca and http://www.amk.ca# .  Unfortunately this can't
really be fixed without changing the API of urlparse() and breaking
old code.
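
A quick illustration of the ambiguity, assuming a modern Python 3
where the old urlparse module lives in urllib.parse (the variable
names are just for this example):

```python
# Assumption: Python 3, where urlparse() is provided by urllib.parse.
from urllib.parse import urlparse

without_fragment = urlparse("http://www.amk.ca")
empty_fragment = urlparse("http://www.amk.ca#")

# Both calls report the fragment as the empty string, so the trailing
# '#' in the second URL is lost once the URL has been parsed.
print(repr(without_fragment.fragment))
print(repr(empty_fragment.fragment))
```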

So I had the idea of creating a new 'web.*' package containing updated
tools for Web-related tasks, so we can make a clean break with the old
APIs.  The two things for the web/ package that I can think of are 1)
a stricter URL parser, and 2) the skeleton of a Web client that
handles cookies and caching sensibly (so you could write
screen-scraping applications on top of it).
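
As a rough sketch of what item 1 might look like (not a proposed
API), the splitter below uses the generic URI regular expression from
RFC 3986, Appendix B, and reports a missing component as None rather
than ''.  The names StrictURL and strict_split are hypothetical, and
the code only splits a URL into parts; it does not validate them.

```python
import re
from typing import NamedTuple, Optional

# Generic URI regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


class StrictURL(NamedTuple):
    """Hypothetical result type: None means 'component not present'."""
    scheme: Optional[str]
    authority: Optional[str]
    path: str
    query: Optional[str]
    fragment: Optional[str]


def strict_split(url: str) -> StrictURL:
    """Split a URL, keeping 'absent' and 'empty' components distinct."""
    # Every part of the pattern is optional, so match() always succeeds.
    m = URI_RE.match(url)
    return StrictURL(
        scheme=m.group(2),
        authority=m.group(4),
        path=m.group(5),
        query=m.group(7),
        fragment=m.group(9),
    )


# The two URLs that urlparse() cannot tell apart now differ:
no_fragment = strict_split("http://www.amk.ca")      # fragment is None
empty_fragment = strict_split("http://www.amk.ca#")  # fragment is ''
```

Using None for "absent" is one possible design; it is exactly the
kind of change that could not be made to urlparse() itself without
breaking callers that expect strings.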

Can anyone think of other things that could be part of this package?

--amk                                                  (www.amk.ca)
You stupid stubborn thickheaded numbskull, you were supposed to die in
bed! I could have handled it...
    -- The Doctor, in "Battlefield"