text representation of HTML
garabik-news-2005-05 at kassiopeia.juls.savba.sk
Thu Jul 20 10:37:38 EDT 2006
Ksenia Marasanova <ksenia.marasanova at gmail.com> wrote:
> Hi,
>
> I am looking for a library that will give me very simple text
> representation of HTML.
> For example
> <div><h1>Title</h1><p>This is a <br />test</p></div>
>
> will be transformed to:
>
> Title
>
> This is a
> test
>
>
> I want to send a plain text alternative of an HTML email, and would prefer
> to generate it automatically from the HTML source.
Something like this:
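import re
text = '<div><h1>Title</h1><p>This is a <br />test</p></div>'
# collapse runs of whitespace into a single space
text = re.sub(r'[\n \t]+', ' ', text)
# turn opening <p>, <br>, <br /> and <h1>..<h6> tags (attributes allowed) into newlines
text = re.sub(r'(?i)<(?:p|br|h[1-6])\b[^>]*>', '\n', text)
# strip whatever tags remain
result = re.sub(r'<.+?>', '', text)
print(result)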
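
If regular expressions turn out to be too brittle for real-world mail (entities such as &amp;, stray spaces around line breaks), the same idea could be sketched with the standard library parser instead. This is only a rough sketch: it assumes Python 3, where the module is called html.parser, and the names TextExtractor and html_to_text are made up for illustration. It handles the same tags as the regex above (p, br, h1 to h6) and nothing more.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collect text, starting a new line at selected tags."""

    # same tags the regex version turns into newlines
    BREAK_TAGS = {"p", "br", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; for us
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # also called for self-closing tags like <br />
        if tag in self.BREAK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        # collapse runs of whitespace inside text nodes
        self.parts.append(re.sub(r"\s+", " ", data))


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # drop spaces next to line breaks, then trim the ends
    text = re.sub(r" *\n *", "\n", text)
    return text.strip()


print(html_to_text('<div><h1>Title</h1><p>This is a <br />test</p></div>'))
```

For the example from the question this prints "Title", "This is a" and "test" on three separate lines. It does not insert the blank line after the heading; that would need extra handling of closing tags.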
--
-----------------------------------------------------------
| Radovan Garabík http://kassiopeia.juls.savba.sk/~garabik/ |
| __..--^^^--..__ garabik @ kassiopeia.juls.savba.sk |
-----------------------------------------------------------
Antivirus alert: file .signature infected by signature virus.
Hi! I'm a signature virus! Copy me into your signature file to help me spread!