getting fileinput to do errors='ignore' or 'replace'?

Laura Creighton lac at openend.se
Thu Dec 3 11:46:39 EST 2015


In a message of Thu, 03 Dec 2015 15:12:15 +0000, Adam Funk writes:
>I'm having trouble with some input files that are almost all proper
>UTF-8 but with a couple of troublesome characters mixed in, which I'd
>like to ignore instead of throwing ValueError.  I've found the
>openhook for the encoding
>
>for line in fileinput.input(options.files, openhook=fileinput.hook_encoded("utf-8")):
>    do_stuff(line)
>
>which the documentation describes as "a hook which opens each file
>with codecs.open(), using the given encoding to read the file", but
>I'd like codecs.open() to also have the errors='ignore' or
>errors='replace' effect.  Is it possible to do this?
>
>Thanks.

This should be both easy to add and useful, and I happen to know that
Serhiy Storchaka is hacking on fileinput right now; he agrees that this
would be easy.  So, with his approval, I filed it in the tracker:
http://bugs.python.org/issue25788

Future versions of Python may not have this problem.
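
In the meantime, a custom openhook may serve as a workaround.  The
sketch below relies only on the documented behaviour that openhook can
be any callable taking (filename, mode) and returning an open file.
The name hook_encoded_lenient is made up for this example and is not
part of the fileinput module, and it assumes the files are opened in
the default text mode ("r").

```python
import fileinput
import io


def hook_encoded_lenient(encoding, errors="replace"):
    """Build a fileinput openhook that decodes with an error handler.

    Hypothetical helper, not part of the standard library.  Pass
    errors="ignore" to drop undecodable bytes, or errors="replace"
    to substitute U+FFFD for each of them.
    """
    def openhook(filename, mode):
        # fileinput calls the hook with the file name and the mode it
        # was given ("r" by default); io.open does the decoding.
        return io.open(filename, mode, encoding=encoding, errors=errors)

    return openhook
```

It would then slot into the original loop in place of
fileinput.hook_encoded("utf-8"), for example
openhook=hook_encoded_lenient("utf-8", errors="ignore").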

Laura



