way to remove all non-ascii characters from a file?

Larry Bates lbates at swamisoft.com
Fri Feb 13 15:21:26 EST 2004


Something simple like the following will work for files
that fit in memory:

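def onlyascii(char):
    # ASCII covers code points 0-127; keep those, drop everything else.
    return ord(char) < 128

f = open('filename.ext', 'r')
data = f.read()
f.close()
# filter() on a string returns a string in Python 2 and an iterator in
# Python 3; ''.join() gives a string in both cases.
filtered_data = ''.join(filter(onlyascii, data))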

For larger files you will need to loop and read
the data in chunks.
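
A minimal sketch of the chunked approach, assuming the result is written
to a second file. The function name strip_non_ascii, its parameters, and
the 64 KB default chunk size are illustrative choices, not something
from the original question. The file is handled as raw bytes, so a
multi-byte character split across two chunks is not a problem: every
byte of such a character is above 127 and gets dropped either way.

```python
def strip_non_ascii(src_path, dst_path, chunk_size=64 * 1024):
    """Copy src_path to dst_path, dropping every byte above 127.

    Hypothetical helper: reads chunk_size bytes at a time so the whole
    file never has to fit in memory.
    """
    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break  # end of file
            # bytearray yields integers in both Python 2 and Python 3
            kept = bytearray(b for b in bytearray(chunk) if b < 128)
            dst.write(bytes(kept))
```

Calling strip_non_ascii('filename.ext', 'filename.clean') would leave
the original file untouched and put the filtered copy next to it.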

-Larry Bates
----------------------------
"omission9" <rus20376 at salemstate.edu> wrote in message
news:defa238f.0402131112.436997c1 at posting.google.com...
> I have a text file which contains the occasional non-ASCII character.
> What is the best way to remove all of these in Python?




