Fastest way to detect a non-ASCII character in a list of strings.

Rhodri James rhodri at wildebst.demon.co.uk
Sun Oct 17 20:04:09 EDT 2010


On Sun, 17 Oct 2010 20:59:22 +0100, Dun Peal <dunpealer at gmail.com> wrote:

> `all_ascii(L)` is a function that accepts a list of strings L, and
> returns True if all of those strings contain only ASCII chars, False
> otherwise.
>
> What's the fastest way to implement `all_ascii(L)`?
>
> My ideas so far are:
>
> 1. Match against a regexp with a character range: `[ -~]`
> 2. Use s.decode('ascii')
> 3. `return all(31< ord(c) < 127 for s in L for c in s)`

Don't call it "all_ascii" when you don't mean that; all_printable
would be more accurate, and would have led you through more
interesting places in the standard library, like:

   import string

   def all_printable(L):
       return set("".join(L)) <= set(string.printable)

I've no idea whether this is faster or slower than any of
your suggestions.  You could "timeit" and see, or you could
wait a bit and not optimise prematurely.
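
For anyone who does want to measure, here is a rough sketch of what a
"timeit" comparison might look like. It assumes Python 3, and the names
by_set, by_ord and the test data are made up for illustration; they are
not from the original question. It also shows that the candidates do not
accept quite the same inputs, so the timings compare slightly different
tests.

```python
import string
import timeit

PRINTABLE = set(string.printable)

def by_set(strings):
    # Set-based test; string.printable includes whitespace such as "\t".
    return set("".join(strings)) <= PRINTABLE

def by_ord(strings):
    # Idea 3 from the question: only code points 32..126 pass.
    return all(31 < ord(c) < 127 for s in strings for c in s)

# The two tests are not equivalent: a tab is in string.printable but below 32.
print(by_set(["a\tb"]), by_ord(["a\tb"]))  # prints: True False

# Made-up test data: 1000 short strings that pass both tests.
data = ["hello world"] * 1000

for func in (by_set, by_ord):
    # Total seconds for 100 calls; the numbers depend on machine and data.
    seconds = timeit.timeit(lambda: func(data), number=100)
    print(func.__name__, seconds)
```

The result will likely depend on the input: a list where an early
string fails lets the generator version stop at once, while the set
version always joins everything first.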

-- 
Rhodri James *-* Wildebeest Herder to the Masses


