[issue24787] csv.Sniffer guesses "M" instead of \t or , as the delimiter
Peter Otten
report at bugs.python.org
Sat Aug 8 09:49:08 CEST 2015
Peter Otten added the comment:
Have you considered writing your own little sniffer? Getting it right for your actual data is usually easier to achieve than a general solution.
The following simplistic sniffer should work with your samples:
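import csv
import operator

def make_dialect(delimiter):
    # Build a csv.excel variant that differs only in its delimiter.
    class Dialect(csv.excel):
        pass
    Dialect.delimiter = delimiter
    return Dialect

def sniff(sample):
    # Pick the candidate delimiter that occurs most often in the sample.
    count, delimiter = max(
        ((sample.count(delim), delim) for delim in ",\t|;"),
        key=operator.itemgetter(0))
    if count == 0:
        # None of the usual candidates occur; fall back to a space.
        if " " in sample:
            delimiter = " "
        else:
            raise csv.Error("Could not determine delimiter")
    return make_dialect(delimiter)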
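
For illustration, here is a minimal sketch of how the dialect returned by sniff() might be handed to csv.reader. It assumes Python 3, the sample text is made up for this example and is not taken from the issue, and the two functions are repeated only so that the snippet runs on its own:

```python
import csv
import io
import operator

def make_dialect(delimiter):
    class Dialect(csv.excel):
        pass
    Dialect.delimiter = delimiter
    return Dialect

def sniff(sample):
    count, delimiter = max(
        ((sample.count(delim), delim) for delim in ",\t|;"),
        key=operator.itemgetter(0))
    if count == 0:
        if " " in sample:
            delimiter = " "
        else:
            raise csv.Error("Could not determine delimiter")
    return make_dialect(delimiter)

# Hypothetical tab-separated sample containing many "M" characters.
sample = "Name\tMake\tModel\nMary\tMazda\tMX-5\n"

dialect = sniff(sample)
# csv.reader accepts a Dialect subclass as well as a registered dialect name.
rows = list(csv.reader(io.StringIO(sample), dialect=dialect))
print(repr(dialect.delimiter))
print(rows)
```

Because the candidates are limited to a short fixed list, a letter such as "M" can never be chosen, however often it occurs in the data.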
Tiago, if you want to follow that path, we should take the discussion to the general Python mailing list.
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue24787>
_______________________________________