processing a large utf-8 file

Ivan Voras ivoras at _-_fer.hr
Fri May 20 19:19:12 EDT 2005


Since the .encoding attribute of file objects is read-only, what is the 
proper way to process large UTF-8 text files?

I need "bulk" processing (i.e. in blocks, since the file is ~1 GB), but 
reading it in fixed-size blocks is bound to split multi-byte UTF-8 
characters at block boundaries.
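
To make the boundary problem concrete, here is a minimal sketch of one 
possible approach. It assumes a modern Python 3 and the standard-library 
codecs.getincrementaldecoder, which postdates this post. The helper name 
iter_decoded_blocks and its block_size parameter are hypothetical, chosen 
only for illustration.

```python
import codecs
import io


def iter_decoded_blocks(stream, block_size=1 << 20, encoding="utf-8"):
    """Yield decoded text from a binary stream read in fixed-size blocks.

    Hypothetical helper: the incremental decoder holds back any trailing
    bytes that form an incomplete character and prepends them to the
    next block, so no character is split at a block boundary.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    while True:
        raw = stream.read(block_size)
        if not raw:
            break
        text = decoder.decode(raw)
        if text:
            yield text
    # final=True raises UnicodeDecodeError if the stream ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# Tiny stand-in for the 1 GB file: 9 bytes, read 3 bytes at a time so that
# the first block ends in the middle of a two-byte character.
sample_text = "\u010d\u0107\u017e\u20ac"
sample_bytes = sample_text.encode("utf-8")
decoded = "".join(iter_decoded_blocks(io.BytesIO(sample_bytes), block_size=3))
```

For a real file, the stream would be opened in binary mode, e.g. 
open(path, "rb"), and each yielded block processed in turn instead of 
being joined in memory.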




More information about the Python-list mailing list