PEP 249 Compliant error handling

MRAB python at mrabarnett.plus.com
Tue Oct 17 14:35:26 EDT 2017


On 2017-10-17 18:26, Israel Brewster wrote:
> I have written and maintain a PEP 249 compliant (hopefully) DB API for the 4D database, and I've run into a situation where corrupted string data from the database can cause the module to error out. Specifically, when decoding the string, I get a "UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 86-87: illegal UTF-16 surrogate" error. This makes sense, given that the string data got corrupted somehow, but the question is "what is the proper way to deal with this in the module?" Should I just throw an error on bad data? Or would it be better to set the errors parameter to something like "replace"? The former feels a bit more "proper" to me (there's an error here, so we throw an error), but leaves the end user dead in the water, with no way to retrieve *any* of the data (from that row at least, and perhaps any rows after it as well). The latter option sort of feels like sweeping the problem under the rug, but does at least leave an error character in the string to let them know there was an error, and will allow retrieval of any good data.
> 
> Of course, if this was in my own code I could decide on a case-by-case basis what the proper action is, but since this is a module that has to work in any situation, it's a bit more complicated.
> 
If a particular text field is corrupted, then raising UnicodeDecodeError 
when trying to get the contents of that field as a Unicode string seems 
reasonable to me.
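
To illustrate why raising is still useful to the caller, here is a minimal sketch. The helper name fetch_text and the sample bytes are made up for the example and are not part of the 4D module. The point is only that the exception itself carries the original bytes and the position of the bad span:

```python
# Hypothetical corrupted field: "ab", a lone high surrogate (U+D800), then "cd".
raw = "ab".encode("utf-16-le") + b"\x00\xd8" + "cd".encode("utf-16-le")


def fetch_text(raw_field: bytes) -> str:
    """Hypothetical driver helper: strict decode, so corruption is never silent."""
    return raw_field.decode("utf-16-le")  # errors="strict" is the default


caught = None
try:
    fetch_text(raw)
except UnicodeDecodeError as exc:
    caught = exc  # keep a reference; "exc" is unbound after the except block
    # exc.object holds the original bytes; exc.start and exc.end bound the bad span.
    bad = exc.object[exc.start:exc.end]
    print(f"bad bytes {bad!r} at {exc.start}-{exc.end - 1}: {exc.reason}")
```

So even with strict decoding the user is not left completely empty-handed, provided the module lets the exception propagate unchanged.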

Is there a way to get the contents as a bytestring, or to get the 
contents with a different errors parameter, so that the user has the 
means to fix it (if it's fixable)?
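
Something along these lines is what I have in mind. It is only a sketch: decode_field, its errors parameter and its as_bytes flag are hypothetical names, not anything defined by PEP 249 or by the existing module. The default stays strict, and the caller opts in to anything more lenient:

```python
from typing import Union


def decode_field(raw_field: bytes, errors: str = "strict",
                 as_bytes: bool = False) -> Union[str, bytes]:
    """Hypothetical accessor: strict by default, with two escape hatches."""
    if as_bytes:
        return raw_field  # hand back the undecoded bytes untouched
    return raw_field.decode("utf-16-le", errors)


# Hypothetical corrupted field: "ab", a lone high surrogate (U+D800), then "cd".
raw = "ab".encode("utf-16-le") + b"\x00\xd8" + "cd".encode("utf-16-le")

try:
    text = decode_field(raw)
except UnicodeDecodeError:
    # The caller, not the module, decides to accept a lossy result ...
    text = decode_field(raw, errors="replace")
    # ... and can still get at the original bytes to attempt a repair.
    original = decode_field(raw, as_bytes=True)
```

That way the module never hides the corruption by default, but the user of the module can still retrieve whatever good data is there.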

