[Web-SIG] Python and the web

Thu Oct 23 10:59:20 EDT 2003

Hi all,

	First a short introduction of myself.  I work for a web
development company and we have been focused on using python for
projects now for about one year.  I tend to lurk in python mailing lists
I have no business being in (such as the mod_python dev list =) ).  My
views here are to represent the end user of the python web related
tools, i.e. the web programmer.  I have personally launched about 10
sites now that are python driven; many run on an
Apache+mod_python+Albatross+MySQL engine.

	The first question I have for everyone is the scope of the group.
There are some tasks that are common to the web programmer but might be
off topic from what I'm seeing here.  Two examples are templating and
parsing SGML-based documents (HTML, XML, etc).  It would be nice if
python included a basic templating module, but I wouldn't expect it to
be very powerful.  When heavy firepower is needed, projects like
Albatross and PSP (being integrated into mod_python) are a better
solution.  However, sometimes a simple system is all that is needed,
lightweight and fast.  The module by Greg Stein, ezt.py, is a good
example of what I think would be handy in the stdlib.
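
	To make "simple" concrete, here is a rough sketch of the kind of
lightweight templating I mean.  It is not ezt.py's syntax; it assumes a
python whose stdlib ships string.Template, and the render() helper is a
name made up for illustration.

```python
from string import Template


def render(template_text, **values):
    """Fill $name placeholders in template_text from keyword arguments."""
    # safe_substitute leaves unknown placeholders in place instead of raising.
    return Template(template_text).safe_substitute(values)


page = render("<h1>$title</h1><p>Hello, $user!</p>", title="Welcome", user="Mike")
print(page)
```

	Anything beyond plain substitution (loops, conditionals) is where
the heavier tools would take over.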

	Parsing files might be too much for this focus, as it's a very
large task.  Still, more and more the web developer is faced with
reading in XML and applying a style sheet to it or otherwise formatting
the data.  Python is billed as batteries included, and granted this is a
car battery of a module, but it would still be nice to help out the web
developer here with the stdlib more.  Too often I find myself developing
a custom parsing engine for reading in some HTML or XML files.  Isn't
the point of a standard format that we can use standard tools with it?
Yes, I'm quite aware that, while XML is a standard, the term is applied
loosely =p.  This is a problem larger than python, as all languages seem
to be wrestling with this; how cool would it be for python to be the
first to have a really powerful, yet simple solution?
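
	As an illustration of the "read XML, format the data" task, here
is a minimal sketch.  It assumes a python whose stdlib includes
xml.etree.ElementTree; the sample document, its element names, and the
format_sites() helper are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Invented sample document; the element and attribute names are assumptions.
SAMPLE = """<sites>
  <site name="shop" engine="mod_python"/>
  <site name="blog" engine="Albatross"/>
</sites>"""


def format_sites(xml_text):
    """Return one 'name: engine' line per <site> element."""
    root = ET.fromstring(xml_text)
    return [f"{site.get('name')}: {site.get('engine')}" for site in root.iter("site")]


for line in format_sites(SAMPLE):
    print(line)
```

	This only covers well-formed XML; real-world HTML is exactly
where the custom parsing engines tend to creep back in.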

	For the CGI module, I can't comment - I've never used it.  Our
decision that python was ready for the big time here was based mostly on
mod_python's ability.  CGI is dead to us as a viable option; it simply
does not scale.  While you can use tools to string it along, like
FastCGI and co., working closely with the server API is going to be the
best gain for effort in the performance area.  For this same reason we
also skipped over mod_python's publisher handler (which is where the
relative URL complaints come in; it's worth noting that those concern
only publisher, not mod_python as a whole).

	For client-side HTTP in python, I've been impressed with how
clean and simple it is.  Getting a file across HTTP is no harder (in
fact easier imho) than reading a local disk file.  Dealing with the
file's contents is a different story; see the above on parsing.
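
	A small side-by-side sketch of that comparison, assuming the
urllib.request module of a current python; read_local() and read_url()
are illustrative names, not stdlib functions.

```python
from urllib.request import urlopen


def read_local(path):
    """Read a file from the local disk and return its bytes."""
    with open(path, "rb") as f:
        return f.read()


def read_url(url):
    """Fetch a URL (http://, https://, or file://) and return its bytes."""
    with urlopen(url) as response:
        return response.read()
```

	Both come down to "open it, read it"; for example,
read_url("http://www.python.org/") would return the raw bytes of that
page.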

	An HTTP server module is not a great need for me, but it would
still be good to have.  My idea would be a server class that you derive
your own server from, overriding the phases of the request you need to
work with, a la the way Apache works.  Something like:

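class MyServer(HTTPServer):
	# HTTPServer is the proposed stdlib base class, not an existing one.

	def authhandler(self, req):
		# Authentication phase.  "pass" is a reserved word, so the
		# attribute is spelled req.password here.
		return self.validate(req.user, req.password)

	def handler(self, req):
		# Content phase: send the requested file, report failure otherwise.
		page = req.uri.filename
		try:
			with open(page, 'r') as f:
				req.send(f.read())
			return True
		except IOError:
			return False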

	That's basic, but if you've worked with the Apache API in
mod_python, mod_perl, or C you get the idea.  Also it would be nice if
the default handlers provided a working server, if some options were set
like a DocumentRoot:

class MyServer(HTTPServer):
	documentroot = "/var/www/html"

	I would say that Apache's 1.3 API is the better goal, leaving
out the new features of the 2.0 API.  First is the KISS principle;
next, we should not be trying to replace Apache but rather provide a
reasonably useful web server in the stdlib.  Also, if someone needs a
feature of the 2.0 style API, they can always add that in the derived
class.

	The last thing to point out is that using a request object is
important, as others have mentioned here.  With a standard request
object, other tools like Albatross can easily tie into this new server.
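
	To show how the pieces might fit together, here is a
self-contained sketch of the phase idea with a shared request object
and a default handler driven by documentroot.  Everything in it
(Request, PhasedServer, process(), the status codes returned) is an
assumption made for illustration; it does no socket work and is not an
existing stdlib API.

```python
import os
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical request object; the field names are assumptions."""
    uri: str
    user: str = ""
    password: str = ""
    body: list = field(default_factory=list)

    def send(self, data):
        # Collect output chunks; a real server would write to the socket.
        self.body.append(data)


class PhasedServer:
    """Hypothetical base class that runs each request through phases."""
    documentroot = None

    def authhandler(self, req):
        # Default authentication phase: allow everyone.
        return True

    def handler(self, req):
        # Default content phase: serve a file from documentroot, if set.
        if self.documentroot is None:
            return False
        root = os.path.realpath(self.documentroot)
        path = os.path.realpath(os.path.join(root, req.uri.lstrip("/")))
        # Refuse paths that escape the document root (e.g. "/../etc/passwd").
        if os.path.commonpath([root, path]) != root:
            return False
        try:
            with open(path, "rb") as f:
                req.send(f.read())
            return True
        except OSError:
            return False

    def process(self, req):
        """Run the phases in order and return an HTTP-style status code."""
        if not self.authhandler(req):
            return 401
        if not self.handler(req):
            return 404
        return 200
```

	A derived class then only sets documentroot or overrides the
phase it cares about, and any tool that understands the request object
can be called from inside handler().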

Looking forward to comments and to seeing where this goes!

Mike