[New-bugs-announce] [issue37984] Unable parse csv on latin iso or binary mode
Yhojann Aguilera
report at bugs.python.org
Thu Aug 29 18:17:04 EDT 2019
New submission from Yhojann Aguilera <yhojann.aguilera at gmail.com>:
I am unable to parse a CSV file with a Latin ISO charset.
import csv

with open('./exported.csv', newline='') as csvFileHandler:
    csvHandler = csv.reader(csvFileHandler, delimiter=';', quotechar='"')
    for line in csvHandler:
        pass  # loop body not shown in the report; the error is raised while iterating
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd1 in position 1032: invalid continuation byte
I tried opening the file in binary mode with open(), but it says: binary mode doesn't take a newline argument. OK, I replaced newline with a bytes value, newline=b'', but it says: open() argument 6 must be str or None, not bytes. OK, I removed the newline argument: _csv.Error: iterator should return strings, not bytes (did you open the file in text mode?).
OK, the csv module does not support binary read mode. I tried Latin ISO instead:
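The last error concerns the type of the items the iterator yields, not the file mode itself: csv.reader accepts any iterable of str. The following is only a minimal sketch of decoding bytes lines before handing them to the reader. The helper name read_csv_bytes, the sample data, and the choice of latin-1 are assumptions for illustration and are not part of the original report.

```python
import csv
import io


def read_csv_bytes(raw_lines, encoding="latin-1", delimiter=";", quotechar='"'):
    """Hypothetical helper: decode an iterable of bytes lines, then parse as CSV."""
    # csv.reader only needs an iterable that yields str, so decode lazily.
    decoded = (line.decode(encoding) for line in raw_lines)
    return list(csv.reader(decoded, delimiter=delimiter, quotechar=quotechar))


# Made-up sample standing in for exported.csv; 0xd1 is "Ñ" in ISO-8859-1.
sample = io.BytesIO(b'name;city\r\n"MU\xd1OZ";"Santiago"\r\n')
rows = read_csv_bytes(sample)
```

With a real file, the same helper could be fed a handle opened with mode='rb', since iterating a binary file also yields bytes lines.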
with open('./exported.csv', mode='r', encoding='ISO-8859', newline='') as csvFileHandler:
UnicodeDecodeError: 'charmap' codec can't decode byte 0xd1 in position 1032: character maps to <undefined>
But the charset is Latin ISO:
$ file exported.csv
exported.csv: ISO-8859 text, with very long lines, with CRLF line terminators
OK, I changed it to ISO-8859-8:
UnicodeDecodeError: 'charmap' codec can't decode byte 0xd1 in position 1032: character maps to <undefined>
I am unable to load the file. Why not provide an option to work in binary mode? The delimiters can be represented as byte values.
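For comparison with the ISO-8859-8 attempt above: byte 0xd1 is unassigned in ISO-8859-8 (the Hebrew part of the family), while in ISO-8859-1 it maps to "Ñ". The sketch below assumes the file is actually ISO-8859-1, which the `file` output does not confirm because it only reports the family name "ISO-8859". The sample bytes and the temporary file are made up for illustration.

```python
import csv
import os
import tempfile

# Made-up sample bytes containing 0xd1, the byte named in the error message.
raw = b'id;name\r\n1;"PE\xd1A"\r\n'

# ISO-8859-8 has no character assigned to 0xd1, so decoding fails.
try:
    raw.decode("iso-8859-8")
    hebrew_ok = True
except UnicodeDecodeError:
    hebrew_ok = False

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "exported.csv")
    with open(path, "wb") as f:
        f.write(raw)
    # Assumption: the real file is ISO-8859-1 (Latin-1).
    with open(path, mode="r", encoding="iso-8859-1", newline="") as csv_file:
        rows = list(csv.reader(csv_file, delimiter=";", quotechar='"'))
```

ISO-8859-1 assigns a character to every byte value, so decoding with it never raises UnicodeDecodeError. If the file is really in another encoding, the text will be wrong without any error being reported.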
----------
components: Unicode
messages: 350836
nosy: Yhojann Aguilera, ezio.melotti, vstinner
priority: normal
severity: normal
status: open
title: Unable parse csv on latin iso or binary mode
type: behavior
versions: Python 3.7
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue37984>
_______________________________________