Delete all not allowed characters..

Adam Donahue adam.donahue at gmail.com
Thu Oct 25 11:35:50 EDT 2007


On Oct 25, 10:52 am, Abandoned <best... at gmail.com> wrote:
> Hi..
> I want to delete all not allowed characters in my text.
> I use this function:
>
> def clear(s1=""):
>     if s1:
>         allowed =
> [u'+',u'0',u'1',u'2',u'3',u'4',u'5',u'6',u'7',u'8',u'9',u' ', u'Þ',
> u'þ', u'Ö', u'ö', u'Ü', u'ü', u'Ç', u'ç', u'Ý', u'ý', u'Ð', u'ð', 'A',
> 'C', 'B', 'E', 'D', 'G', 'F', 'I', 'H', 'K', 'J', 'M', 'L', 'O', 'N',
> 'Q', 'P', 'S', 'R', 'U', 'T', 'W', 'V', 'Y', 'X', 'Z', 'a', 'c', 'b',
> 'e', 'd', 'g', 'f', 'i', 'h', 'k', 'j', 'm', 'l', 'o', 'n', 'q', 'p',
> 's', 'r', 'u', 't', 'w', 'v', 'y', 'x', 'z']
>         s1 = "".join(ch for ch in s1 if ch in allowed)
>         return s1
>
> My problem is that this function replaces each disallowed character
> with "" but I want " " instead.
> For example:
> input: Exam%^^ple
> output: Exam   ple
> This is the output I want, but my code outputs "Example".
> How can I do this quickly? The text is very long.

Something like:

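import re

def clear(s, allowed=(), case_sensitive=True):
    # Prefix the pattern with the inline flag (?i) to ignore case.
    flags = ''
    if not case_sensitive:
        flags = '(?i)'
    # [^...] matches any character NOT listed; each match becomes a space.
    return re.sub(flags + '[^%s]' % ''.join(allowed), ' ', s)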

And call:

clear( '123abcdefgABCdefg321', [ 'a', 'b', 'c' ] )
clear( '123abcdefgABCdefg321', [ 'a', 'b', 'c' ], False )

And so forth.  Or just use re directly!

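(This implementation is imperfect: the allowed characters go into the
pattern unescaped, so characters such as ']', '^', '-' or '\' can change
the meaning of the regular expression or break it, and an empty allowed
list produces an invalid pattern. But the idea is there.)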
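
One way to close those gaps is to pass each allowed character through
re.escape() before building the character class. The following is only a
minimal sketch, assuming Python 2.7 or later (for the flags argument of
re.sub); the name clear_escaped and the handling of an empty allowed list
are hypothetical choices for illustration, not part of the function above.

```python
import re

def clear_escaped(s, allowed, case_sensitive=True):
    """Replace every character of s that is not in allowed with a space.

    Hypothetical hardened variant of clear(); allowed may be a list of
    single characters or a plain string.
    """
    if not allowed:
        # Assumption: with nothing allowed, every character becomes a space.
        # This also avoids building the invalid pattern '[^]'.
        return ' ' * len(s)
    # re.escape() makes ']', '^', '-' and '\' literal inside the class.
    char_class = '[^%s]' % ''.join(re.escape(ch) for ch in allowed)
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.sub(char_class, ' ', s, flags=flags)
```

With the input from the question, clear_escaped('Exam%^^ple', 'Eaxmple')
keeps the letters and turns '%^^' into three spaces, and characters such
as ']' or '-' can now appear in allowed without breaking the pattern.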

Adam



