How to parse a name out of a web page?

Rune Strand rune.strand at gmail.com
Wed Apr 5 18:19:17 EDT 2006


Haibao Tang wrote:
> with high accuracy...
>
> My temporary plan is to first recognize two or three consecutive
> initial-capitalized words, but certainly we need to do more than that?
> Does anyone have suggestions?
>
> Thanks first.

It's not easy to say without seeing the HTML. If the structure
allows it, a couple of str.split() calls are probably the easiest way,
but you always have BeautifulSoup.

http://www.crummy.com/software/BeautifulSoup/
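
As a rough sketch of the two simple approaches mentioned in this thread
(str.split() on known surrounding text, and the "two or three capitalized
words" heuristic), something like the following might do. The sample HTML,
the marker strings and the function names are all made up for illustration,
since the real page was never shown.

```python
import re

# Hypothetical page fragment; the real markup was not shown in the thread.
SAMPLE_HTML = (
    '<html><body><p class="author">Written by Haibao Tang, 2006</p>'
    "</body></html>"
)


def name_by_split(html, before, after):
    """Return the text between two known markers, or None if one is missing."""
    if before not in html:
        return None
    # Keep only what follows the first occurrence of the leading marker.
    tail = html.split(before, 1)[1]
    if after not in tail:
        return None
    # Keep only what precedes the first occurrence of the trailing marker.
    return tail.split(after, 1)[0].strip()


# Two or three consecutive initial-capitalized words (ASCII letters only).
NAME_RE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+){1,2}\b")


def name_candidates(text):
    """Return every run of two or three initial-capitalized words in text."""
    return NAME_RE.findall(text)


if __name__ == "__main__":
    print(name_by_split(SAMPLE_HTML, "Written by ", ","))
    print(name_candidates("Written by Haibao Tang, 2006"))
```

The split version only works when the text around the name is fixed and
known in advance. The regex version needs no markers, but it also matches
things like place names ("New York") and misses names with initials,
hyphens or non-ASCII letters, so its output is better treated as a list of
candidates than as an answer.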



