RE Module

Simon Forman rogue_pedro at yahoo.com
Fri Aug 25 01:09:07 EDT 2006


Roman wrote:
> I am trying to filter a column in a list of all HTML tags.

What?

> To do that, I have set up the following statement.
>
> row[0] = re.sub(r'<.*?>', '', row[0])
>
> The results I get are sporadic.  Sometimes two tags are removed.
> Sometimes one tag is removed.  Sometimes no tags are removed.  Could
> somebody tell me where I have gone wrong here?
>
> Thanks in advance

I'm no re expert, so I won't try to advise you on your re, but it might
help those who are if you gave examples of your input and output data.
What results are you getting for which input strings?
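
Something like the following small script would make the problem easy
to reproduce.  It is only a sketch: the sample strings are made up (not
your actual data), and strip_tags and TAG_DOTALL are names I invented
for illustration.  It also tests one guess at the cause, which is that
'.' does not match a newline unless re.DOTALL is set, so a tag that
spans a line break would be skipped.  That may or may not be what is
happening with your rows.

```python
import re

# The pattern from the original post.
TAG = re.compile(r'<.*?>')
# Same pattern, but '.' is also allowed to match newlines.
TAG_DOTALL = re.compile(r'<.*?>', re.DOTALL)

# Hypothetical inputs for illustration only.
samples = [
    '<b>bold</b> text',
    '<a\nhref="x">link</a>',   # opening tag spans a line break
]

def strip_tags(text, pattern=TAG):
    """Remove everything the given pattern matches from text."""
    return pattern.sub('', text)

for s in samples:
    # Show the input, the result with the original pattern,
    # and the result with DOTALL, side by side.
    print(repr(s), '->', repr(strip_tags(s)), '|',
          repr(strip_tags(s, TAG_DOTALL)))
```

Printing with repr() makes newlines and other invisible characters show
up, which is usually what you want when posting input and output to the
list.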

Also, if you're just trying to strip HTML markup to get plain text from
a file, "w3m -dump some.html" works great.  ;-)
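
If you would rather stay inside Python, a rough equivalent can be
sketched with the standard library's HTML parser instead of a regular
expression.  This is an assumption on my part about what you need, not
a replacement for w3m: TextExtractor and html_to_text are made-up
names, and the sketch does no layout and keeps the text inside
<script> and <style> elements too.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect only the text between tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for each run of text outside of tags.
        self.parts.append(data)

def html_to_text(markup):
    """Return markup with the tags dropped and the text kept."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()   # flush any text still buffered by the parser
    return ''.join(parser.parts)

print(html_to_text('<p>Hello <b>world</b></p>'))
```

Because a real parser is tracking where tags start and end, a tag that
spans a line break is handled the same as any other.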

HTH,
~Simon
