How to convert "&nbsp;" in a string to blank space?

wittempj@hotmail.com martin.witte at gmail.com
Mon Oct 30 13:12:39 EST 2006



On Oct 30, 6:44 pm, "一首诗" <newpt... at gmail.com> wrote:
> Oh, I didn't make myself clear.
>
> What I mean is how to convert a piece of HTML to plain text but keep as
> much of the formatting as possible.
>
> Such as converting "&nbsp;" to a blank space and <br> to "\r\n".
>

Then you can explore the HTMLParser module,
http://docs.python.org/lib/module-HTMLParser.html, for example:

#!/usr/bin/env python
from HTMLParser import HTMLParser

parsedtext = ''

class Parser(HTMLParser):
    def handle_starttag(self, tag, attrs):
        if tag == 'br':
            global parsedtext
            parsedtext += '\r\n'  # CR LF line break in place of <br>

    def handle_data(self, data):
        global parsedtext
        parsedtext += data

    def handle_entityref(self, name):
        # &nbsp; arrives here as the name 'nbsp'; emit a plain space for it
        if name == 'nbsp':
            global parsedtext
            parsedtext += ' '

x = Parser()
x.feed('An &nbsp; text<br>')
print parsedtext
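
For anyone on Python 3, here is a rough sketch of the same approach. It assumes the renamed module html.parser; the names TextExtractor and html_to_text are made up for this example, and it collects the text on the parser instance rather than in a global. Entities other than &nbsp; are simply dropped here, as in the code above, so treat it as a starting point rather than a complete converter.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text, turning <br> into CR LF and &nbsp; into a plain space."""

    def __init__(self):
        # convert_charrefs=False makes the parser hand entity references
        # such as &nbsp; to handle_entityref instead of decoding them itself.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == 'br':
            self.parts.append('\r\n')

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        if name == 'nbsp':
            self.parts.append(' ')

    def text(self):
        return ''.join(self.parts)


def html_to_text(html):
    # Hypothetical helper: feed one complete HTML string, return the text.
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

With this, html_to_text('An &nbsp; text<br>') gives 'An   text\r\n': the two literal spaces, one more for the entity, and a CR LF for the <br>.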


> Gary Herron wrote:
> > 一首诗 wrote:
> > > Is there any simple way to solve this problem?
>
> > Yes, strings have a replace method:
>
> > >>> s = "abc&nbsp;def"
> > >>> s.replace('&nbsp;', ' ')
> > 'abc def'
>
> > Also various modules that are meant to deal with web and xml and such
> > have functions to do such operations.
> 
> > Gary Herron
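
If only the entities matter and the tags can stay, one of the library functions Gary alludes to might be html.unescape from Python 3's standard library. This is a small sketch under that assumption; entities_to_text is a made-up name. Note that html.unescape turns &nbsp; into the no-break space character U+00A0, not an ordinary space, so the sketch replaces that character afterwards.

```python
import html


def entities_to_text(s):
    # html.unescape decodes named and numeric character references;
    # &nbsp; becomes U+00A0 (no-break space), which is then mapped
    # to an ordinary space.
    return html.unescape(s).replace('\xa0', ' ')
```

For example, entities_to_text('abc&nbsp;def') returns 'abc def', the same result as the replace call quoted above, while other entities such as &amp; are decoded as well.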



