string stripping issues

Larry Bates larry.bates at websafe.com
Fri Mar 3 11:08:41 EST 2006


orangeDinosaur wrote:
> Hello,
> 
> I am encountering a behavior I can't think of a reason for.  Sometimes,
> when I use the .strip method for strings, it takes away more than what
> I've specified.  For example:
> 
> >>> a = '    <TD WIDTH=175><FONT SIZE=2>Hughes. John</FONT></TD>\r\n'
> 
> >>> a.strip('    <TD WIDTH=175><FONT SIZE=2>')
> 
> returns:
> 
> 'ughes. John</FONT></TD>\r\n'
> 
> However, if I take another string, for example:
> 
> >>> b = '    <TD WIDTH=175><FONT SIZE=2>Kim, Dong-Hyun</FONT></TD>\r\n'
> 
> >>> b.strip('    <TD WIDTH=175><FONT SIZE=2>')
> 
> returns:
> 
> 'Kim, Dong-Hyun</FONT></TD>\r\n'
> 
> I don't understand why in one case it eats up the 'H' but in the next
> case it leaves the 'K' alone.
> 
Others have explained the exact problem, so I'll just make a suggestion.
Take a few minutes to look at BeautifulSoup.  It parses HTML and makes
extracting data from strings like this very easy.  If this is a one-off
thing, don't bother.  If you do this commonly, BeautifulSoup is worth a
little study.
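
For reference, the behavior in question comes from strip() treating its
argument as a set of characters to remove from both ends, not as a literal
prefix.  'H' occurs in 'WIDTH', so it gets eaten; 'K' is not in the set, so
it survives.  Below is a minimal sketch, assuming Python 3.  The helper names
(remove_prefix, TextExtractor, cell_text) are made up for illustration, and
the parser part uses the standard library's html.parser as a stand-in rather
than BeautifulSoup itself.

```python
from html.parser import HTMLParser

a = '    <TD WIDTH=175><FONT SIZE=2>Hughes. John</FONT></TD>\r\n'
prefix = '    <TD WIDTH=175><FONT SIZE=2>'

# strip() removes any leading/trailing characters found in its argument.
# 'H' is one of those characters (from 'WIDTH'), so 'Hughes' loses its 'H'.
stripped = a.strip(prefix)


def remove_prefix(text, prefix):
    """Hypothetical helper: drop a literal prefix if it is present."""
    return text[len(prefix):] if text.startswith(prefix) else text


without_prefix = remove_prefix(a, prefix)


class TextExtractor(HTMLParser):
    """Hypothetical helper: collect only the text between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def cell_text(markup):
    """Return the text content of a markup fragment, whitespace-trimmed."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered after the last tag
    return ''.join(parser.chunks).strip()
```

Here cell_text(a) gives just the name without depending on the exact tags or
attributes around it, which is the kind of thing a real HTML parser buys you
over slicing strings by hand.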

-Larry Bates
