html parsing (htmllib)

Gerrit Holl gerrit.holl at pobox.com
Fri Jan 28 07:30:51 EST 2000


Patrick Tufts wrote on 949014904:
> I want to grab the text from a set of web pages.  htmllib seems like
> what I want, except that I want to assign the parsed text to a variable
> instead of printing it to stdout or a file.
> 
> What do I need to do?

I don't know about the htmllib module, but if a function takes
a stream (a.k.a. file object), and you want to store the value
in a variable, you can use cStringIO.

>>> import cStringIO
>>> f = cStringIO.StringIO()
>>> f.write("aaa")
>>> f.write("bbb")
>>> print f
<StringO object at 80c57b0>
>>> print f.getvalue()
aaabbb
>>> f.seek(3, 0)
>>> print f.read()
bbb
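
For anyone reading this on Python 3, where cStringIO no longer exists,
here is a minimal sketch of the same idea using io.StringIO. The
write_report function is a hypothetical stand-in for any function that
expects a writable stream; it is not part of any library.

```python
import io


def write_report(out):
    # Hypothetical stand-in for any function that writes to a stream.
    out.write("aaa")
    out.write("bbb")


buf = io.StringIO()          # in-memory text stream
write_report(buf)            # output lands in buf instead of stdout
captured = buf.getvalue()    # everything written so far, as one string

buf.seek(3)                  # move the read position past "aaa"
tail = buf.read()            # read from there to the end
```

getvalue() returns the whole buffer regardless of the current position,
while read() starts at the position set by seek().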

See also:
http://www.python.org/doc/current/lib/module-StringIO.html

This is the documentation for the StringIO module. The cStringIO
module provides nearly the same interface but is written in C, so it is
considerably faster; the trade-off is that you can't subclass its
StringIO.
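
Applied to the original question, the stream-to-variable trick might
look like the sketch below. It is not htmllib: it assumes Python 3 and
uses html.parser from the standard library instead. TextToStream and
html_to_text are hypothetical names made up for this example.

```python
import io
from html.parser import HTMLParser


class TextToStream(HTMLParser):
    """Hypothetical helper: writes the text between tags to a stream."""

    def __init__(self, out):
        super().__init__()
        self.out = out

    def handle_data(self, data):
        # HTMLParser calls this for each run of text outside of tags.
        self.out.write(data)


def html_to_text(html):
    buf = io.StringIO()
    parser = TextToStream(buf)
    parser.feed(html)
    parser.close()            # flush any text still buffered in the parser
    return buf.getvalue()     # the parsed text, now in a variable


text = html_to_text("<h1>Title</h1><p>Some <b>bold</b> text.</p>")
```

The sketch does no formatting: text runs are simply concatenated, and
the contents of script or style elements would be included as well.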

regards,
Gerrit.

-- 
Please correct any bad English you encounter in my email message!



