[Tutor] Removing control characters
Mark Tolonen
metolone+gmane at gmail.com
Fri Feb 20 02:16:26 CET 2009
"Kent Johnson" <kent37 at tds.net> wrote in message
news:1c2a2c590902191500y71600feerff0b73a88fb49eed at mail.gmail.com...
> On Thu, Feb 19, 2009 at 5:41 PM, Dinesh B Vadhia
> <dineshbvadhia at hotmail.com> wrote:
>> Okay, here is a combination of Mark's suggestions and yours:
>
>>> # replace unwanted chars in string s with " "
>>> t = "".join([(" " if n in c else n) for n in s if n not in c])
>>> t
>> 'Product ConceptsHard candy with an innovative twist, Internet Archive:
>> Wayback Machine. [online] Mar. 25, 2004. Retrieved from the Internet
>> <URL:
>> http://www.confectionery-innovations.com>.'
>>
>> This last bit doesn't work, i.e. replacing the unwanted chars with " ",
>> e.g. 'ConceptsHard'. What's missing?
>
> The "if n not in c" at the end of the list comp rejects the unwanted
> characters from the result immediately. What you wrote is the same as
> t = "".join([n for n in s if n not in c])
>
> because "n in c" will never be true in the first conditional.
>
> BTW if you care about performance, this is the wrong approach. At
> least use a set for c; better would be to use translate().
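
To make Kent's point concrete, here is a minimal sketch in Python 3 syntax. The sample string and the contents of `c` are assumptions made up for illustration; the thread does not show the real values. It compares the comprehension with and without the trailing filter, and uses a set for `c` as suggested.

```python
# Hypothetical sample data: `s` and `c` stand in for the names in the thread.
s = "Product Concepts\x01Hard candy"
c = set("\x01\xff")  # a set makes each `n in c` membership test cheap

# With the trailing filter, unwanted chars are dropped before the
# conditional expression ever sees them, so nothing gets replaced.
dropped = "".join([(" " if n in c else n) for n in s if n not in c])

# Without the filter, every char reaches the conditional and the
# unwanted ones are turned into a space.
replaced = "".join([" " if n in c else n for n in s])

print(repr(dropped))
print(repr(replaced))
```

The first result runs the words together ('ConceptsHard'), which is the symptom reported above; the second keeps a space where the control character was.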
Sorry, I didn't catch the "replace with space" part. Kent is right:
translate() is what you want. The join is still nice for making the
translation table:
>>> import string
>>> table = ''.join(' ' if n < 32 or n > 126 else chr(n) for n in xrange(256))
>>> string.translate('here is\x01my\xffstring', table)
'here is my string'
-Mark
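
The session above is Python 2 (`xrange` and `string.translate`). As a rough sketch only, and not part of the original reply, the same idea might look like this in Python 3, where `str.translate` takes a mapping of code points and `bytes.translate` takes a 256-byte table. The names `table`, `byte_table`, `cleaned` and `cleaned_bytes` are illustrative.

```python
# Text version: map each code point outside printable ASCII (32..126)
# to a space. Code points not in the dict are left unchanged, so this
# only handles characters below 256, like the original table.
table = {n: " " for n in range(256) if n < 32 or n > 126}
cleaned = "here is\x01my\xffstring".translate(table)

# Bytes version: build a full 256-entry table, closest to the original.
byte_table = bytes(32 if n < 32 or n > 126 else n for n in range(256))
cleaned_bytes = b"here is\x01my\xffstring".translate(byte_table)

print(cleaned)
print(cleaned_bytes)
```

For text containing characters above 255, the dict-based table would need to cover those code points as well.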