Odd csv column-name truncation with only one column

Tim Chase python.list at tim.thechases.com
Thu Jul 19 09:04:55 EDT 2012


On 07/19/12 06:21, Tim Chase wrote:
> tim@laptop:~/tmp$ python
> Python 2.6.6 (r266:84292, Dec 26 2010, 22:31:48)
> [GCC 4.4.5] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import csv
> >>> from cStringIO import StringIO
> >>> s = StringIO('Email\nfoo@example.com\nbar@example.org\n')
> >>> s.seek(0)
> >>> d = csv.Sniffer().sniff(s.read())
> >>> s.seek(0)
> >>> r = csv.DictReader(s, dialect=d)
> >>> r.fieldnames
> ['Emai', '']

I think I may have stumbled across what is actually going on
here:

>>> import csv
>>> from cStringIO import StringIO
>>> s = StringIO('Email\nfoo@example.org\nbar@test.test\n')
>>> d = csv.Sniffer().sniff(s.read())
>>> s.seek(0)
>>> r = csv.DictReader(s, dialect=d)
>>> r.fieldnames
['Em', 'il']

It appears that the sniffer latches onto a character that shows up
once on every line [ed: Peter's & Steven's replies came in as I
finished typing this].  In my first example it was the "l", and in
the second example here it's the "a", so the csv module decides that
character is the delimiter.  The source file was generated with the
default Excel dialect:

>>> s = StringIO()
>>> w = csv.writer(s)
>>> w.writerows([["email"], ["foo@example.com"], ["bar@example.org"]])
>>> s.seek(0)
>>> d = csv.Sniffer().sniff(s.read())
>>> d.delimiter
'l'
>>> s.seek(0)
>>> r = csv.DictReader(s, dialect=d)
>>> r.fieldnames
['emai', '']
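
A rough way to check the repeated-character theory is to count how
often each character occurs per line and keep the ones whose count
never changes.  This is only a simplified sketch of the idea, not the
Sniffer's actual code (the real heuristic also weighs consistency and
has a list of preferred delimiters), and consistent_chars is a made-up
helper name:

```python
def consistent_chars(sample):
    """Return {char: count} for characters that occur the same nonzero
    number of times on every non-empty line of sample.

    Hypothetical helper; a simplification of the idea, not csv.Sniffer.
    """
    lines = [line for line in sample.splitlines() if line]
    if not lines:
        return {}
    result = {}
    # Only characters present in the first line can be present in all lines.
    for ch in set(lines[0]):
        count = lines[0].count(ch)
        if all(line.count(ch) == count for line in lines[1:]):
            result[ch] = count
    return result


first = 'Email\nfoo@example.com\nbar@example.org\n'
second = 'Email\nfoo@example.org\nbar@test.test\n'
print(consistent_chars(first))   # {'l': 1}
print(consistent_chars(second))  # {'a': 1}
```

In both samples exactly one character survives, and it is the one the
sniffer ended up treating as the delimiter.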


I guess it's then up to the Python community to decide whether the
csv module is doing the right thing in this degenerate case.  I.e.,
a single-column file doesn't round-trip: sniff what csv.writer
produced and you can't get back out what you put in.
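
One possible workaround, sketched under assumptions: restrict the
sniffer to a short list of plausible delimiters via the delimiters
argument of sniff(), and fall back to the excel dialect when it raises
csv.Error because none of them turns up.  The candidate list ',;\t|'
and the helper name sniff_or_excel are my own picks for illustration,
and this is written for Python 3 (io.StringIO instead of cStringIO):

```python
import csv
from io import StringIO


def sniff_or_excel(sample, candidates=',;\t|'):
    """Sniff a dialect using only the given candidate delimiters.

    Falls back to csv.excel when the sniffer cannot determine a
    delimiter (e.g. a single-column file containing none of them).
    Hypothetical helper; the candidate list is an assumption.
    """
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates)
    except csv.Error:
        return csv.excel


s = StringIO('Email\nfoo@example.com\nbar@example.org\n')
d = sniff_or_excel(s.read())
s.seek(0)
r = csv.DictReader(s, dialect=d)
print(r.fieldnames)  # ['Email']
```

The trade-off is that a file using a delimiter outside the candidate
list will most likely be read as a single column.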

-tkc






