string stripping issues
Larry Bates
larry.bates at websafe.com
Fri Mar 3 11:08:41 EST 2006
orangeDinosaur wrote:
> Hello,
>
> I am encountering a behavior I can't think of a reason for. Sometimes,
> when I use the .strip method for strings, it takes away more than what
> I've specified. For example:
>
> >>> a = ' <TD WIDTH=175><FONT SIZE=2>Hughes. John</FONT></TD>\r\n'
>
> >>> a.strip(' <TD WIDTH=175><FONT SIZE=2>')
>
> returns:
>
> 'ughes. John</FONT></TD>\r\n'
>
> However, if I take another string, for example:
>
> >>> b = ' <TD WIDTH=175><FONT SIZE=2>Kim, Dong-Hyun</FONT></TD>\r\n'
>
> >>> b.strip(' <TD WIDTH=175><FONT SIZE=2>')
>
> returns:
>
> 'Kim, Dong-Hyun</FONT></TD>\r\n'
>
> I don't understand why in one case it eats up the 'H' but in the next
> case it leaves the 'K' alone.
>
Others have explained the exact problem, so I'll just make a suggestion.
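For anyone reading this later, the short version of that explanation:
strip() treats its argument as a set of characters to remove from both
ends, not as a literal prefix. 'H' occurs in 'WIDTH', so it is eaten;
'K' does not occur in the argument, so stripping stops there. A minimal
sketch follows (remove_prefix is a hypothetical helper written for this
illustration, not something from the original thread):

```python
a = ' <TD WIDTH=175><FONT SIZE=2>Hughes. John</FONT></TD>\r\n'
b = ' <TD WIDTH=175><FONT SIZE=2>Kim, Dong-Hyun</FONT></TD>\r\n'
prefix = ' <TD WIDTH=175><FONT SIZE=2>'

# strip() removes any leading/trailing characters found in the argument,
# regardless of their order. 'H' is in the set (from 'WIDTH'), 'K' is not.
stripped_a = a.strip(prefix)
stripped_b = b.strip(prefix)


def remove_prefix(text, prefix):
    """Hypothetical helper: drop `prefix` only if `text` starts with it."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


# Removing the literal prefix keeps the 'H' in place.
fixed_a = remove_prefix(a, prefix)

print(repr(stripped_a))
print(repr(stripped_b))
print(repr(fixed_a))
```

The trailing '\r\n' survives in every case because '\n' is not among the
characters passed to strip().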
Take a few minutes to look at BeautifulSoup. It parses HTML code
and lets you extract data from strings like this in a
very easy-to-use way. If this is a one-off thing, don't bother.
If you do this commonly, BeautifulSoup is worth a little study.
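As a rough sketch of what that might look like, assuming the third-party
bs4 package (the current BeautifulSoup distribution, which is newer than
this post) is installed and using Python's built-in "html.parser"
backend. The cell_text helper is a hypothetical name used only for this
illustration:

```python
from bs4 import BeautifulSoup

rows = [
    ' <TD WIDTH=175><FONT SIZE=2>Hughes. John</FONT></TD>\r\n',
    ' <TD WIDTH=175><FONT SIZE=2>Kim, Dong-Hyun</FONT></TD>\r\n',
]


def cell_text(fragment):
    """Hypothetical helper: return the visible text of an HTML fragment."""
    soup = BeautifulSoup(fragment, "html.parser")
    # get_text() concatenates all text nodes; the no-argument strip()
    # then removes only the surrounding whitespace.
    return soup.get_text().strip()


names = [cell_text(row) for row in rows]
print(names)
```

The parser deals with the tags, so nothing depends on the exact
attributes or on which letters happen to appear in them.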
-Larry Bates