[Web-SIG] Client-side support: what are we aiming for?

Bill Janssen janssen at parc.com
Thu Oct 23 16:12:14 EDT 2003


amk writes:
> What's the scope of improving client-side HTTP support?  
> 
> I suggest aiming for something you could write a web browser or web scraper
> on top of. That means storing and returning cookies from the server, writing
> them to a file, and a page cache that handles HTTP's cache expiration rules.
> HTML formatting is out of scope, but a specialized parser for extracting a
> list of form elements or for picking apart a table might not be.

My original idea was to look at something like cURL
(http://curl.haxx.se/), and make sure that anything you can do with
that tool, you can also do with Python.  That might be a bit ambitious;
here's the lead paragraph from the cURL web page:

  Curl is a command line tool for transferring files with URL syntax,
  supporting FTP, FTPS, HTTP, HTTPS, GOPHER, TELNET, DICT, FILE and
  LDAP. Curl supports HTTPS certificates, HTTP POST, HTTP PUT, FTP
  uploading, kerberos, HTTP form based upload, proxies, cookies,
  user+password authentication, file transfer resume, http proxy
  tunneling and a busload of other useful tricks.

Currently, for example, there's no way in the Python standard
libraries to do a file upload (a POST with multipart/form-data).
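
To make the gap concrete, here is a minimal sketch, written in
present-day Python 3, of what a caller has to assemble by hand for
such an upload.  The function name encode_multipart, its parameters,
and the field names in the usage note are all hypothetical; nothing
here is an existing library API, and it skips details a real
implementation would need (escaping of quotes in names, streaming
large files).

```python
import uuid


def encode_multipart(fields, files, boundary=None):
    """Build a multipart/form-data request body by hand (illustrative sketch).

    fields:   dict mapping form field name -> str value
    files:    dict mapping form field name -> (filename, bytes, content type)
    boundary: optional fixed boundary string; a random one is used if omitted

    Returns (content_type_header_value, body_bytes).
    """
    boundary = boundary or uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    lines = []

    # Ordinary form fields: one part each, no Content-Type header needed.
    for name, value in fields.items():
        lines.append(delimiter)
        lines.append(
            f'Content-Disposition: form-data; name="{name}"'.encode("utf-8")
        )
        lines.append(b"")
        lines.append(value.encode("utf-8"))

    # File fields: same shape, plus a filename and the file's content type.
    for name, (filename, content, content_type) in files.items():
        lines.append(delimiter)
        lines.append(
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"'.encode("utf-8")
        )
        lines.append(f"Content-Type: {content_type}".encode("ascii"))
        lines.append(b"")
        lines.append(content)

    # Closing delimiter, followed by a final CRLF.
    lines.append(delimiter + b"--")
    lines.append(b"")

    body = b"\r\n".join(lines)
    return f"multipart/form-data; boundary={boundary}", body
```

Assuming a form with a text field "title" and a file field "upload",
the call would be encode_multipart({"title": "notes"}, {"upload":
("a.txt", b"hello", "text/plain")}).  The returned header value goes
in the request's Content-Type header and the bytes go in the POST
body.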

Then there is the issue of handling the Web-centric formats you get
back.  There's no CSS parser, for instance, and it's hard to understand
a modern Web page without one.  And what about a JavaScript interpreter?

Bill






