Finding non ascii characters in a set of files

Scott David Daniels scott.daniels at acm.org
Sat Feb 24 00:38:24 EST 2007


bg_ie at yahoo.com wrote:
> I'm updating my program to Python 2.5, but I keep running into
> encoding problems. I have no encodings defined at the start of any of
> my scripts. What I'd like to do is scan a directory and list all the
> files in it that contain a non-ASCII character. How would I go about
> doing this?


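def non_ascii(files):
     # Yield the name of each file containing a character above '~'.
     for file_name in files:
         f = open(file_name, 'rb')
         data = f.read()
         f.close()
         # max() of one string is its largest character; "or ' '"
         # keeps max() from failing on an empty file.
         if '~' < max(data or ' '):
             yield file_name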

if __name__ == '__main__':
     import os.path
     import glob
     import sys
     # Scan the directories named on the command line, else the current one.
     for dirname in sys.argv[1:] or ['.']:
         for name in non_ascii(glob.glob(os.path.join(dirname, '*.py')) +
                              glob.glob(os.path.join(dirname, '*.pyw'))):
             print name
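
Once you know which files are affected, you will probably want to know
where in each file the offending bytes are. Here is a rough sketch of
that, not a finished tool. The name non_ascii_lines is my own invention,
and it assumes a Python with bytearray (2.6 or later, including 3.x).
It flags bytes above 0x7F, so unlike the '~' cutoff in the code above it
does not report DEL (0x7F).

```python
def non_ascii_lines(file_name):
    """Yield (line_number, bad_bytes) for each line with bytes above 0x7F.

    Hypothetical helper for illustration; bad_bytes is a list of ints.
    """
    f = open(file_name, 'rb')
    try:
        for index, line in enumerate(f):
            # bytearray iterates as integers on 2.6+ and 3.x alike
            bad = [b for b in bytearray(line) if b > 0x7F]
            if bad:
                yield index + 1, bad
    finally:
        # Runs when the generator is exhausted or closed early
        f.close()
```

Looping over non_ascii_lines(name) for each name the scan reports and
printing the line numbers should point you at the lines that need an
encoding declaration or an escape.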


--Scott David Daniels
scott.daniels at acm.org


