Efficient, built-in way to determine if string has non-ASCII chars outside ASCII 32-127, CRLF, Tab?

Ian Kelly ian.g.kelly at gmail.com
Mon Oct 31 18:52:53 EDT 2011


On Mon, Oct 31, 2011 at 4:08 PM, Dave Angel <d at davea.name> wrote:
> I was wrong once again.  But a simple combination of  translate() and
> split() methods might do it.  Here I'm suggesting that the table replace all
> valid characters with space, so the split() can use its default behavior.

That sounds overly complicated and error-prone.  For instance, split()
with no arguments will also split on vertical tab, which is not one of
the characters the OP wanted to allow.  I would probably use a regular
expression for this.

import re
# Flag any character other than CR, LF, tab, or ASCII 32-127 (octal 040-177).
if re.search(r'[^\r\n\t\040-\177]', string_to_test):
    print("Invalid!")
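
To make the vertical tab point concrete, here is a minimal sketch that wraps the same pattern in a helper and compares it with split().  The name has_invalid_chars and the sample strings are made up for illustration; only the character class comes from the snippet above.

```python
import re

# Precompiled once; the character class is unchanged from the snippet above.
_INVALID = re.compile(r'[^\r\n\t\040-\177]')

def has_invalid_chars(s):
    """Return True if s contains any character outside the allowed set."""
    return _INVALID.search(s) is not None

# split() with no arguments treats vertical tab (\x0b) as whitespace,
# so it is consumed as a separator instead of being reported.
print("a\x0bb".split())                                # ['a', 'b']

# The regular expression flags the vertical tab explicitly.
print(has_invalid_chars("a\x0bb"))                     # True
print(has_invalid_chars("plain text\r\n\twith tab"))   # False
print(has_invalid_chars("caf\xe9"))                    # True
```

Compiling the pattern once is optional; it mainly helps when many strings are tested in a loop.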

Cheers,
Ian


