Detect string has non-ASCII chars without checking each char?

John Machin sjmachin at lexicon.net
Sun Aug 22 18:13:57 EDT 2010


On Aug 23, 1:10 am, "Michel Claveau - MVP"
<enleverLesX_XX... at XmclavXeauX.com.invalid> wrote:
> Re !
>
> > Try your code with u"abcd\xa1" ... it says it's ASCII.
>
> Ah? On my computer, it says "False"

Perhaps your computer has a problem. Mine does this with both Python
2.7 and Python 2.3 (which introduced the unicodedata.normalize
function):

  >>> import unicodedata
  >>> t1 = u"abcd\xa1"
  >>> t2 = unicodedata.normalize('NFD', t1)
  >>> t3 = t2.encode('ascii', 'replace')
  >>> [t1, t2, t3]
  [u'abcd\xa1', u'abcd\xa1', 'abcd?']
  >>> map(len, _)
  [5, 5, 5]
  >>>
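
In the session above, the 'replace' error handler turns u'\xa1' into '?', so
t3 is pure ASCII and has the same length as t1; nothing in the result shows
that the original contained a non-ASCII character. A strict encode raises
instead. What follows is only a minimal sketch of that idea, not the code
under discussion in this thread: the helper name is_ascii is hypothetical,
and it assumes the input is a unicode string.

```python
def is_ascii(text):
    """Return True if the unicode string `text` holds only ASCII characters.

    `is_ascii` is a hypothetical helper name used for illustration.
    """
    try:
        # The default error handler is 'strict', so the first character
        # outside the ASCII range raises UnicodeEncodeError.
        text.encode('ascii')
    except UnicodeEncodeError:
        return False
    return True


print(is_ascii(u"abcd"))       # prints True
print(is_ascii(u"abcd\xa1"))   # prints False
```

The scan over the characters still happens, but inside the codec rather than
in a Python-level loop. On Python 3.7 and later, str.isascii() gives the same
answer for a str without building an encoded copy.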


