csv module and NULL data byte

John Pote johnpote at jptechnical.co.uk
Wed Feb 28 18:40:41 EST 2018


I have a csv data file that may become corrupted (already happened) 
resulting in a NULL byte appearing in the file. The NULL byte causes an 
_csv.Error exception.

I would like the csv reader to return the csv lines as best it can and 
leave it to the subsequent processing of each comma-separated field to 
deal with illegal bytes. That way as many lines as possible can be 
processed and the corrupted ones simply discarded.

Is there a way of getting the csv reader to accept all 256 possible 
byte values (with \r, \n and ',' bytes delimiting lines and fields)?

My test code is,

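     import csv

     # iso-8859-1 maps every byte value to a character, so decoding never fails
     with open( fname, 'rt', encoding='iso-8859-1' ) as csvfile:
         csvreader = csv.reader( csvfile, delimiter=',',
                                 quoting=csv.QUOTE_NONE, strict=False )
         data = list( csvreader )   # this is where _csv.Error is raised
         for ln in data:
             print( ln )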

Result

 >>python36 csvTest.py
Traceback (most recent call last):
   File "csvTest.py", line 22, in <module>
     data = list( csvreader )
_csv.Error: line contains NULL byte

strict=False or True makes no difference.
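
One possible workaround, offered only as a sketch: csv.reader accepts any iterable of strings, so the lines can be filtered before the reader sees them. The helper name skip_nul_lines and the sample data below are made up for illustration, and the sketch assumes a corrupted line can be recognised purely by the presence of a NUL character.

```python
import csv
import io


def skip_nul_lines(lines):
    """Yield only the lines that contain no NUL character.

    Hypothetical helper: it drops any line holding '\\x00' so that
    csv.reader never sees the character it rejects.
    """
    for line in lines:
        if '\x00' not in line:
            yield line


# Made-up sample standing in for the corrupted file; the second
# data line contains a NUL character.
sample = io.StringIO("a,b,c\n1,\x002,3\n4,5,6\n", newline='')

csvreader = csv.reader(skip_nul_lines(sample), delimiter=',',
                       quoting=csv.QUOTE_NONE, strict=False)
rows = list(csvreader)
for ln in rows:
    print(ln)
```

With a real file, the open file object would be passed to skip_nul_lines() in place of the StringIO. If dropping the whole line is too drastic, the generator could instead yield line.replace('\x00', '') and leave it to later processing to reject the damaged field.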

Help appreciated,

John



