Is anyone happy with csv module?

Wed Dec 12 12:01:28 EST 2007

Neil Cerutti wrote:
> On 2007-12-12, je.s.te.r at hehxduhmp.org <je.s.te.r at hehxduhmp.org> wrote:
>   
>> John Machin <sjmachin at lexicon.net> wrote:
>>     
>>> For that purpose, CSV files are the utter pox and then some.
>>> Consider using xlrd and xlwt (nee pyexcelerator) to read
>>> (resp. write) XLS files directly.
>>>       
>> FWIW, CSV is a much more generic format for spreadsheets than
>> XLS. For example, I deal almost exclusively in CSV files for
>> similar situations as the OP because I also work with software
>> that can't (or in some cases "can't easily") deal with XLS
>> files.  CSV files can be read in by basically anything.
>>     
>
> When I have a choice, I use simple tab-delimited text files.  The
> usually irrelevant limitation is the inability to embed tabs or
> newlines in fields. The relevant advantage is the simplicity.
>   

That limitation is unnecessary.  You can have your tabs and not eat them, too:

#!/usr/bin/python
"""
EXAMPLE USAGE OF PYTHON'S CSV.DICTREADER FOR PEOPLE NEW TO PYTHON AND/OR
CSV.DICTREADER

Python - Batteries Included(tm)

This file demonstrates that when you use Python's csv module, you don't
have to strip newline characters embedded in quoted fields, such as the
one between "acorp_ Ac" and "orp Foundation" in the data below.

It also demonstrates csv.DictReader, which reads each CSV record into a
dictionary keyed by field name.

It also shows basic use of lists ([]s) and dicts ({}s).

If this doesn't whet your appetite for getting ahold of a powertool
instead of sed for managing CSV data, I don't know what will.

"""

####  FIRST: CREATE A TEMPORARY CSV FILE FOR DEMONSTRATION PURPOSES
mycsvdata = """
"Category","0","acorp_ Ac
orp Foundation","","","Acorp Co","(480) 905-1906","877-462-5267 toll
free","800-367-2228","800-367-2228","info at acorp.or
g","7895 East Drive","Scottsdale","AZ","85260-6916","","","","","","Pres
Fred & Linda ","0","0","1","3","4","1"

"Category","0","acorp_ Bob and Margaret Schwartz","","","","317-321-6030
her","317-352-0844","","","","321 North Butler Ave.","In
dianapolis","IN","46219","","","","","","Refrigeration
man","0","1","2","3","4","0"

"Category","0","acorp_ Elschlager,
Bob","","","","","702-248-4556","","","TropBob at aol.com","7950 W.
Flamingo Rd. #2032","Las Vega
s","NV","89117","","","","","","guy I met","0","1","2","3","4","1"

"""

##  NOTE:  IF YOU HAVE A FIELD OR RECORD SEPARATOR WITHIN QUOTES, IT WILL
##  NOT BE TREATED LIKE A SEPARATOR!  FOR EXAMPLE, WITH "|" AS DELIMITER:
##   Beef|"P|otatos"|Dinner Roll|Ice Cream

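import os, sys
def writefile(filename, filedata, perms=0o750):
    """Write filedata to filename, then set the file's permissions."""
    f = open(filename, "w")
    f.write(filedata)
    f.close()   # close (and flush) before touching permissions
    os.chmod(filename, perms)   # perms is an octal mode, e.g. 0o750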

file2write = 'mycsvdata.txt'
writefile(file2write,mycsvdata)

# Check that the file exists
if not os.path.exists(file2write):
    print("ERROR: unable to write file: %s  Exiting now!" % file2write)
    sys.exit(1)

#   ...so everything down to this point merely creates the
# temporary CSV file for the code to test (below).

####  SECOND:  READ IN THE CSV FILE TO CREATE A LIST OF PYTHON
#  DICTIONARIES, WHERE EACH DICTIONARY CONTAINS THE DATA FROM ONE ROW.
#  THE KEYS OF THE DICTIONARY WILL BE THE FIELD NAMES AND THE VALUES OF
#  THE DICTIONARY WILL BE THE VALUES CONTAINED WITHIN THE CSV FILE'S ROW.

import csv

### NOTE: Modify this list to match the fields of the CSV file.
header_flds = [
    'cat','num','name','blank1','blank2','company','phone1','phone2',
    'phone3','phone4','email','addr1','city','state','zip','blank3',
    'blank4','blank5','blank6','blank7','title','misc1','misc2','misc3',
    'misc4','misc5','misc6']

file2open = 'mycsvdata.txt'

# Passing fieldnames tells DictReader that the file has no header row;
# blank lines between records are skipped.
reader = csv.DictReader(open(file2open), fieldnames=header_flds, delimiter=",")
data = list(reader)   # one dict per CSV record

def splitjoin(x):
    """Remove any newline characters embedded in a field
    (of course, if you want them in the field, don't use this).
    """
    return x.replace('\n', '')

####  THIRD: ITERATE OVER THE LIST OF DICTS (IN WHICH EACH DICT IS A
####  ROW/RECORD FROM THE CSV FILE)

# example of accessing all the dictionaries once they are in the list 'data':
for rec in data:   # for each CSV record
    itmz = rec.items()  # get the (key, value) pairs from the dictionary
    print("- = " * 20)
    for key,val in itmz:
        # Note: splitjoin() strips the newline characters that a field
        # may contain, so each field prints on one line
        print(key.upper() + ":  \t\t" + splitjoin(val))
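
If you want to try the DictReader part without writing a temporary file
first, here is a minimal sketch written for Python 3.  The three-field
sample record, the field names and the read_records() helper are made up
for illustration; only csv.DictReader and io.StringIO come from the
standard library.

```python
import csv
import io

# Hypothetical sample: one record whose second field contains a newline
# inside the quotes, like the "acorp_ Ac / orp Foundation" field above.
sample = '"Category","acorp_ Ac\norp Foundation","Scottsdale"\n'

def read_records(text, fieldnames, delimiter=","):
    """Parse CSV text that has no header row into a list of dicts."""
    # newline="" leaves line endings untouched so the csv module can
    # decide which newlines end a record and which belong to a field.
    source = io.StringIO(text, newline="")
    reader = csv.DictReader(source, fieldnames=fieldnames, delimiter=delimiter)
    return [dict(row) for row in reader]

records = read_records(sample, ["cat", "name", "city"])
for rec in records:
    # repr() makes the embedded newline visible as \n
    print(repr(rec["name"]), rec["city"])
```

The quoted newline stays inside the "name" field, so the two physical
lines come back as a single record.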

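The same goes for tab-delimited files: if the csv module does the writing
and the reading, embedded tabs and newlines survive.  A rough sketch for
Python 3 follows; the sample rows and the roundtrip_tsv() name are
assumptions for the example, not anything from the thread.

```python
import csv
import io

def roundtrip_tsv(rows):
    """Write rows as tab-delimited text, then parse that text back."""
    buf = io.StringIO(newline="")
    # With the default minimal quoting, the writer quotes only the fields
    # that contain the delimiter, a quote character or a line break.
    csv.writer(buf, delimiter="\t").writerows(rows)
    text = buf.getvalue()
    buf.seek(0)
    parsed = list(csv.reader(buf, delimiter="\t"))
    return text, parsed

# Hypothetical rows: one field has an embedded tab, another a newline.
rows = [["name", "note"], ["Bob", "likes\ttabs"], ["Fred", "two\nlines"]]
text, parsed = roundtrip_tsv(rows)
print(parsed == rows)
```

Files written this way are still plain tab-delimited text for any field
that doesn't need quoting, so the simplicity is not lost.
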
-- 
Shane Geiger
IT Director
National Council on Economic Education
sgeiger at ncee.net  |  402-438-8958  |  http://www.ncee.net

Leading the Campaign for Economic and Financial Literacy