htmllib, formatter

Richie Hindle richie at entrian.com
Thu Aug 8 05:37:56 EDT 2002


> I'd like to make an almost verbatim copy of an HTML file. The only change 
> would be in the SRC attribute of <IMG SRC>. Is htmllib suitable for this?

I have a module called PyMeld that does exactly this.  It's complete
but as yet unpublished, and I don't have it with me right now.  If it
sounds like the thing you need, let me know and I'll forward it on
later today.

PyMeld lets Python code manipulate HTML (or XML, informally) in an
object-oriented way - in your case, you'd give your <IMG ...> tag an
id="image_id" attribute, then you could access it DOM-style like this:

   document = PyMeld.Container( myHTMLPage )
   document.image_id.src = 'http://example.com/new.gif'
   newHTMLPage = str( document )

PyMeld works by string substitution rather than by parsing and
rebuilding the document, so it only touches the pieces that you ask it
to touch.  It doesn't care about the structure of the HTML, or whether
it's 'valid' or not, except that the tags you want to modify must have
'id' attributes, and their attribute values must be quoted.
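
To illustrate the substitution idea, here is a minimal sketch.  It is
not PyMeld's actual code, only one way the approach described above
might look.  The function name set_attribute_by_id and the regular
expressions are assumptions made for this example.  It finds the
opening tag that carries a given quoted id and splices a new value
into one quoted attribute, leaving the rest of the text untouched.  It
makes no attempt to handle comments, scripts, or attribute values that
contain '<' or '>'.

```python
import re


def set_attribute_by_id(html, element_id, attribute, new_value):
    """Return html with one quoted attribute replaced on the tag whose
    id equals element_id.  Hypothetical helper, not part of PyMeld."""
    # Locate the opening tag that carries the requested (quoted) id.
    tag_pattern = re.compile(
        r'<[A-Za-z][^<>]*?\sid\s*=\s*(["\'])'
        + re.escape(element_id)
        + r'\1[^<>]*>',
        re.IGNORECASE,
    )
    tag_match = tag_pattern.search(html)
    if tag_match is None:
        raise KeyError(element_id)

    # Within that tag only, locate the quoted value of the attribute.
    attr_pattern = re.compile(
        r'\s' + re.escape(attribute) + r'\s*=\s*(["\'])(.*?)\1',
        re.IGNORECASE | re.DOTALL,
    )
    attr_match = attr_pattern.search(tag_match.group(0))
    if attr_match is None:
        raise AttributeError(attribute)

    # Splice the new value in; nothing else in the document changes.
    start = tag_match.start() + attr_match.start(2)
    end = tag_match.start() + attr_match.end(2)
    return html[:start] + new_value + html[end:]


if __name__ == '__main__':
    page = '<P>Logo: <IMG SRC="old.gif" id="image_id"></P>'
    print(set_attribute_by_id(page, 'image_id', 'src',
                              'http://example.com/new.gif'))
```

Because the text outside the matched attribute value is copied through
as-is, badly formed markup elsewhere in the page is preserved exactly,
which is the point of substituting rather than parsing and rebuilding.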

-- 
Richie Hindle
richie at entrian.com
