csv module and NULL data byte

John Pote johnpote at jptechnical.co.uk
Wed Mar 7 17:43:34 EST 2018


On 07/03/2018 07:59, Andrew McNamara wrote:
>>> 	Last time I read the documentation, it was recommended that
>>> the file be opened in BINARY mode ("rb").
>> It recommends binary mode, but seems to largely work fine with
>> text/ascii mode or even arbitrary iterables.  I've not seen the
>> rationale behind the binary recommendation, but in 10+ years of using
>> the csv module, I've not found any issues in using text/ascii mode
>> that were solved by switching to using binary mode.
> The CSV module was originally written by Dave Cole. I subsequently
> made changes necessary to get it included as a standard part of Python.
> I also implemented the "dialect" logic, and I put a lot of time into
> making the parser and generator produce the same results as Excel.
>
> That particular recommendation is necessary because Excel has some
> particular behaviours around CR and LF and quoting. For the parser to
> produce the same result as Excel, it must see the raw bytes with no
> re-ordering or suppression of CRs.
>
> Unfortunately, I haven't had time to be involved in the module for a few
> years. I wasn't involved with the Unicode changes necessary in Python 3,
> and I have not verified that it is still compatible with recent versions
> of Excel.
Any idea why it might throw an exception on encountering a NULL byte
(0x00) in the input stream? It accepts all 255 other byte values. Was
this behaviour intended? Perhaps a comment should be added to the docs.
Thanks for your work on the module anyway.
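
For anyone hitting the same exception, one possible workaround is to
filter the NULL characters out before the reader sees them. This is only
a minimal sketch, not something the csv module provides: strip_nul and
the sample data are hypothetical names made up for illustration, and it
assumes Python 3 text input where silently dropping the NULLs is
acceptable.

```python
import csv
import io


def strip_nul(lines):
    """Yield each input line with any NUL characters removed.

    Hypothetical helper: csv.reader accepts any iterable of strings,
    so the filtering can happen before the parser sees the data.
    """
    for line in lines:
        yield line.replace("\0", "")


# Made-up sample with a NUL embedded in the second field.
# newline="" leaves the CR/LF line endings untranslated for the parser.
sample = io.StringIO("a,b\0c,d\r\n1,2,3\r\n", newline="")

rows = list(csv.reader(strip_nul(sample)))
print(rows)
```

The same wrapper can be put around a real file object opened with
newline="". If the NULLs carry meaning in the data, they would need to
be mapped to some placeholder instead of being dropped.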

John


