Delete all not allowed characters..

Tim Chase python.list at tim.thechases.com
Thu Oct 25 12:01:24 EDT 2007


> I want to delete all not allowed characters in my text.
> I use this function:
> 
> def clear(s1=""):
>     if s1:
>         allowed =
> [u'+',u'0',u'1',u'2',u'3',u'4',u'5',u'6',u'7',u'8',u'9',u' ', u'Ş',
> u'ş', u'Ö', u'ö', u'Ü', u'ü', u'Ç', u'ç', u'İ', u'ı', u'Ğ', u'ğ', 'A',
> 'C', 'B', 'E', 'D', 'G', 'F', 'I', 'H', 'K', 'J', 'M', 'L', 'O', 'N',
> 'Q', 'P', 'S', 'R', 'U', 'T', 'W', 'V', 'Y', 'X', 'Z', 'a', 'c', 'b',
> 'e', 'd', 'g', 'f', 'i', 'h', 'k', 'j', 'm', 'l', 'o', 'n', 'q', 'p',
> 's', 'r', 'u', 't', 'w', 'v', 'y', 'x', 'z']
>         s1 = "".join(ch for ch in s1 if ch in allowed)
>         return s1
> 
> ....My problem is that this function replaces each disallowed
> character with "" but I want " " instead.
> For example:
> input: Exam%^^ple
> output: Exam   ple
> That is the output I want, but my code outputs "Example".
> How can I do this quickly? The text is very long.

Any reason your alphabet is oddly entered?

You can speed it up by using a set.  You can also tweak your join 
to choose a space if the letter isn't one of your allowed letters:

   import string
   allowed = set(
     string.letters +
     string.digits +
     ' +' +
     u'ŞşÖöÜüÇçİıĞğ')
   def clear(s):
     return "".join(
       letter in allowed and letter or " "
       for letter in s)
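
For anyone trying this on Python 3 (an assumption on my part; the code above is written for Python 2), string.letters no longer exists and the u'' prefix is unnecessary. A minimal sketch of the same set-based approach, using string.ascii_letters in its place:

```python
import string

# Assumption: Python 3, where string.letters is gone and every str is Unicode.
allowed = set(
    string.ascii_letters +
    string.digits +
    " +" +
    "ŞşÖöÜüÇçİıĞğ")

def clear(s):
    # Keep allowed characters; swap anything else for a single space.
    return "".join(
        letter in allowed and letter or " "
        for letter in s)

print(repr(clear("Exam%^^ple")))  # 'Exam   ple'
```

Each of the three disallowed characters becomes one space, so the output has the same length as the input.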

In Python 2.5, there's a ternary operator syntax something like 
the following (which I can't test, as I'm not at a PC with 2.5 
installed):

   def clear(s):
     return "".join(
       letter
         if letter in allowed
         else " "
       for letter in s)

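which some find more readable...I don't particularly care for 
either syntax.  The latter is 2.5-specific and makes more sense, 
but still isn't as readable as I would have liked; while the 
former and/or idiom works in versions of Python back to at least 
2.2, which I still have access to, and is a well-documented 
idiom/hack.  (The idiom itself, that is: the generator expression 
and the built-in set both need 2.4, so older versions would want 
a list comprehension and some other container.)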
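
The reason the and/or form counts as a hack is that it picks the wrong branch whenever the "true" value is itself falsy. That can't happen in clear(), because a character taken from a string is never empty. A small sketch of the difference (pick_and_or and pick_ternary are hypothetical helper names, used only for illustration):

```python
def pick_and_or(cond, a, b):
    # Pre-2.5 idiom: falls through to b whenever a is falsy,
    # even when cond is true.
    return cond and a or b

def pick_ternary(cond, a, b):
    # Conditional expression (Python 2.5+): depends only on cond.
    return a if cond else b

print(repr(pick_and_or(True, "x", " ")))   # 'x'
print(repr(pick_and_or(True, "", " ")))    # ' '  (not the '' one might expect)
print(repr(pick_ternary(True, "", " ")))   # ''
```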

-tkc






