Capturing the bad codes that raise UnicodeError exceptions during decoding

Malcolm Greene python at bdurham.com
Thu Aug 4 14:47:06 EDT 2016


I'm processing a lot of dirty CSV files and would like to track the bad
codes that are raising UnicodeErrors. I'm struggling to figure out what
the exact codes are so I can track them, then remove them, and then
repeat the decoding process for the current line until the line has been
fully decoded and I can pass it on to the CSV reader. At a high level, it
seems that I need to wrap the decoding of a line in a loop that retries
until it passes without any errors. Any suggestions appreciated.

Thank you,
Malcolm
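
One possible shape for the retry loop described above, as a minimal
sketch rather than a definitive answer. It assumes each line is read as
raw bytes (file opened in binary mode) and that the target encoding is
UTF-8; the function name decode_tracking_bad_bytes and the bad_counts
tally are hypothetical names chosen for illustration. It relies on the
start, end, and object attributes that UnicodeDecodeError carries, which
identify the exact bytes that failed to decode.

```python
from collections import Counter


def decode_tracking_bad_bytes(raw, encoding="utf-8", bad_counts=None):
    """Decode raw bytes, stripping and tallying any bytes that fail to decode.

    Hypothetical helper: returns (decoded_text, bad_counts).
    """
    if bad_counts is None:
        bad_counts = Counter()
    while True:
        try:
            # Succeeds once every offending byte sequence has been removed.
            return raw.decode(encoding), bad_counts
        except UnicodeDecodeError as exc:
            # exc.start and exc.end delimit the undecodable bytes in raw.
            bad = raw[exc.start:exc.end]
            bad_counts[bad] += 1
            # Drop the offending bytes and retry the same line.
            raw = raw[:exc.start] + raw[exc.end:]


# Example: two bytes that are invalid in UTF-8 are recorded and removed.
text, counts = decode_tracking_bad_bytes(b"caf\xe9,ok\xff\n")
```

Passing the same Counter into every call accumulates totals across a
whole file, and the cleaned text can then be handed to the csv module.
If the bad bytes should be replaced rather than dropped, a custom error
handler registered with codecs.register_error is another option worth
considering.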


