ANN: pullparser 0.0.2b released

John J. Lee jjl@pobox.com
23 Dec 2003 14:55:03 +0000


http://wwwsearch.sourceforge.net/pullparser

This is the first beta release (and probably the last).

Changes since 0.0.1a:

 * Renamed .tag_iter() to .tags(), and allowed it to take multiple name
   arguments.
 * .get_text() and .get_compressed_text() no longer raise
   NoMoreTagsError; they return "" instead, which is more convenient
   and makes the end case saner.
 * Made a tarball package with setup.py etc.


Requires Python 2.2.

pullparser is a simple "pull API" for HTML parsing, modelled on Perl's
HTML::TokeParser.  Many simple HTML parsing tasks are easier this way
than with the callback-based HTMLParser module.  pullparser.PullParser
is a subclass of HTMLParser.HTMLParser.

Example:

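import pullparser, sys

f = file(sys.argv[1])  # HTML file named on the command line
p = pullparser.PullParser(f)
# Advance to the <title> start tag.
if p.get_tag("title"):
    # Read the text that follows, with whitespace compressed.
    title = p.get_compressed_text()
    print "Title: %s" % title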
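
For readers who only know the callback style, the "pull" idea can be
sketched roughly as follows, in present-day Python 3.  This is not
pullparser's code.  TinyPullParser and its internals are hypothetical
names made up for the illustration, it buffers the whole document up
front, and its get_tag() returns None when nothing matches, which is a
simplification chosen for the sketch and says nothing about what
pullparser itself does.  Only the method names get_tag() and
get_compressed_text() are borrowed from the example above.

```python
from html.parser import HTMLParser


class TinyPullParser(HTMLParser):
    """Hypothetical stand-in (not pullparser): records events as tokens."""

    def __init__(self, html):
        super().__init__()
        self._tokens = []   # (kind, value) pairs in document order
        self._pos = 0       # index of the next token to hand out
        self.feed(html)
        self.close()

    # The push-style callbacks only record what they see.
    def handle_starttag(self, tag, attrs):
        self._tokens.append(("starttag", tag))

    def handle_endtag(self, tag):
        self._tokens.append(("endtag", tag))

    def handle_data(self, data):
        self._tokens.append(("data", data))

    def get_tag(self, *names):
        """Advance past the next start tag (one of names, if given).

        Returns the tag name, or None once the tokens run out
        (an assumption of this sketch).
        """
        while self._pos < len(self._tokens):
            kind, value = self._tokens[self._pos]
            self._pos += 1
            if kind == "starttag" and (not names or value in names):
                return value
        return None

    def get_compressed_text(self):
        """Return text up to the next tag with whitespace collapsed.

        Returns "" when there is no text left, mirroring the changed
        end-case behaviour listed in the changes above.
        """
        parts = []
        while self._pos < len(self._tokens):
            kind, value = self._tokens[self._pos]
            if kind != "data":
                break
            parts.append(value)
            self._pos += 1
        return " ".join("".join(parts).split())


page = "<html><head><title>  Hello\n  world </title></head></html>"
p = TinyPullParser(page)
if p.get_tag("title"):
    print("Title: %s" % p.get_compressed_text())   # Title: Hello world
```

The caller asks for the next thing it wants and the parser hands it
over, so the control flow stays in the calling code instead of being
spread across handler methods.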


John