getting fileinput to do errors='ignore' or 'replace'?

Laura Creighton lac at openend.se
Thu Dec 3 15:40:16 EST 2015


In a message of Thu, 03 Dec 2015 19:17:51 +0000, Adam Funk writes:
>On 2015-12-03, Laura Creighton wrote:
>
>> In a message of Thu, 03 Dec 2015 15:12:15 +0000, Adam Funk writes:
>>>I'm having trouble with some input files that are almost all proper
>>>UTF-8 but with a couple of troublesome characters mixed in, which I'd
>>>like to ignore instead of throwing ValueError.  I've found the
>>>openhook for the encoding
>>>
>>>for line in fileinput.input(options.files, openhook=fileinput.hook_encoded("utf-8")):
>>>    do_stuff(line)
>>>
>>>which the documentation describes as "a hook which opens each file
>>>with codecs.open(), using the given encoding to read the file", but
>>>I'd like codecs.open() to also have the errors='ignore' or
>>>errors='replace' effect.  Is it possible to do this?
>>>
>>>Thanks.
>>
>> This should be both easy to add, and useful, and I happen to know that
>> fileinput is being hacked on by Serhiy Storchaka right now, who agrees
>> that this would be easy.  So, with his approval, I stuck this into the
>> tracker.  http://bugs.python.org/issue25788  
>>
>> Future Pythons may not have the problem.
>
>Good to know, thanks.

Well, we have moved right along to 'You write the patch, Laura' so I
can pretty much guarantee that future Pythons won't have the problem. :)
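
In the meantime, a custom openhook should work around it. What follows
is only a rough sketch, not the patch itself: the name lenient_hook is
made up for illustration (it is not part of fileinput), and it assumes
fileinput calls the hook with (filename, mode) in a text mode such as 'r'.

```python
import fileinput
import io


def lenient_hook(encoding, errors="replace"):
    """Build an openhook that decodes with a chosen error handler.

    Hypothetical helper, not part of the fileinput module.
    """
    def openhook(filename, mode):
        # fileinput passes the filename and mode; assumes a text mode ('r').
        return io.open(filename, mode, encoding=encoding, errors=errors)
    return openhook


# Usage, mirroring the loop from the original question:
#
# for line in fileinput.input(options.files,
#                             openhook=lenient_hook("utf-8", errors="ignore")):
#     do_stuff(line)
```

With errors="replace" each undecodable byte becomes U+FFFD; with
errors="ignore" it is silently dropped, so pick whichever loses less
for your data.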

Laura



