Odd csv column-name truncation with only one column
Hans Mulder
hansmu at xs4all.nl
Fri Jul 20 12:59:24 EDT 2012
On 19/07/12 23:10:04, Dennis Lee Bieber wrote:
> On Thu, 19 Jul 2012 13:01:37 -0500, Tim Chase
> <python.list at tim.thechases.com> declaimed the following in
> gmane.comp.python.general:
>
>> It just seems unfortunate that the sniffer would ever consider
>> [a-zA-Z0-9] as a valid delimiter.
+1
> I'd suspect the sniffer logic does not do any special casing
> -- any /byte value/ is a candidate for the delimiter.
The sniffer prefers [',', '\t', ';', ' ', ':'] (in that order).
If none of those is found, it goes to the other extreme and considers
all characters equally likely.
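For anyone who wants to check this locally, here is a minimal sketch. It assumes CPython's `csv.Sniffer`, where the preference list is kept in an instance attribute called `preferred`. That attribute is an implementation detail, not documented API, and the sample data is made up.

```python
import csv

sniffer = csv.Sniffer()

# Implementation detail of CPython's Sniffer: the preferred delimiters,
# in order of preference.
preferred = sniffer.preferred

# Made-up sample in which ';' occurs the same number of times on every line.
sample = "a;b;c\n1;2;3\n4;5;6\n"
dialect = sniffer.sniff(sample)

print(preferred)
print(repr(dialect.delimiter))
```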
> This would allow for usage of some old ASCII control characters --
> things like x1F (unit separator)
If the sniffer were to exclude [a-zA-Z0-9] (or all alphanumerics) as
potential delimiters, then control characters such as "\x1F" would
still be possible.
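Until then, a caller can restrict the candidates by hand through the `delimiters` argument of `Sniffer.sniff()`. The sketch below is only one way to do it: the helper name `sniff_or_default`, the candidate string and the fallback to `csv.excel` are my own choices for illustration, not anything the csv module prescribes.

```python
import csv

def sniff_or_default(sample, candidates=",\t;", default=csv.excel):
    """Sniff `sample`, considering only `candidates` as delimiters.

    Hypothetical helper: falls back to `default` when none of the
    candidates fits, e.g. for a file with a single column.
    """
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates)
    except csv.Error:
        # Raised when the sniffer cannot determine a delimiter.
        return default

# Made-up samples: one column without any candidate delimiter, and two
# columns separated by ';'.
one_column = "name\nalice\nbob\ncarol\n"
two_columns = "a;b\n1;2\n3;4\n"

print(sniff_or_default(one_column) is csv.excel)
print(repr(sniff_or_default(two_columns).delimiter))
```

To allow "\x1F" as well, add it to the candidate string.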
> {Next is to rig the sniffer to identify x1F for fields, and x1E
> for records <G>}
The sniffer will always guess '\r\n' as the line terminator.
That should not stop you from creating a dialect with '\x1E' as
the line terminator. Just don't expect the sniffer to recognize
that dialect.
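A rough sketch of such a dialect follows. The class name `UnitSeparated` and the sample rows are made up for illustration. Note that `csv.writer` honours the line terminator, while `csv.reader` is documented to ignore it and to recognise only '\r' and '\n', so the reading half splits the records by hand.

```python
import csv
import io

class UnitSeparated(csv.excel):
    """Hypothetical dialect: x1F between fields, x1E between records."""
    delimiter = "\x1f"
    lineterminator = "\x1e"

# Writing: the writer appends the dialect's line terminator to each row.
buf = io.StringIO(newline="")
writer = csv.writer(buf, dialect=UnitSeparated)
writer.writerow(["a", "b"])
writer.writerow(["1", "2"])
written = buf.getvalue()

# Reading: csv.reader ignores lineterminator, so split the records
# manually and pass them to the reader as an iterable of strings.
records = [rec for rec in written.split("\x1e") if rec]
rows = list(csv.reader(records, dialect=UnitSeparated))

print(repr(written))
print(rows)
```

Splitting by hand like this breaks if a quoted field itself contains "\x1E", so it only suits data where that cannot happen.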
-- HansM