Finding non ascii characters in a set of files
Marc 'BlackJack' Rintsch
bj_666 at gmx.net
Fri Feb 23 13:41:04 EST 2007
In <ern7jr$rkn$1 at foggy.unx.sas.com>, Tim Arnold wrote:
> Here's what I do (I need to know the line number).
>
> import os,sys,codecs
> def checkfile(filename):
>     f = codecs.open(filename,encoding='ascii')
>
>     lines = open(filename).readlines()
>     print 'Total lines: %d' % len(lines)
>     for i in range(0,len(lines)):
>         try:
>             l = f.readline()
>         except:
>             num = i+1
>             print 'problem: line %d' % num
>
>     f.close()
I see a `NameError` here. Where does `i` come from? And there's no need
to read the file twice. Untested:
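import codecs

def checkfile(filename):
    f = codecs.open(filename, encoding='ascii')
    num = 0  # number of lines decoded successfully so far
    try:
        for line in f:
            num += 1
    except UnicodeError:
        # Decoding failed on the line after the last good one.
        # Only the first offending line is reported.
        print('problem: line %d' % (num + 1))
    f.close()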
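
If every offending line is wanted rather than just the first one, a variation along the following lines might do. This is only a sketch, written for Python 3 and assuming the files can be read as raw bytes and split on newlines; the name `find_non_ascii_lines` is made up for illustration and is not part of the code above.

```python
def find_non_ascii_lines(filename):
    """Return the 1-based numbers of all lines containing non-ASCII bytes."""
    bad_lines = []
    # Read bytes so that one undecodable line does not stop the iteration.
    with open(filename, 'rb') as f:
        for num, raw_line in enumerate(f, 1):
            try:
                raw_line.decode('ascii')
            except UnicodeDecodeError:
                bad_lines.append(num)
    return bad_lines
```

For a set of files, the function could be called once per name, for example on each entry of `sys.argv[1:]`, printing the filename together with the returned line numbers.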
Ciao,
Marc 'BlackJack' Rintsch