RE: Help splitting CSV data

Garry ggkraemer at gmail.com
Sun Jan 20 19:41:12 EST 2013


On Sunday, January 20, 2013 3:04:39 PM UTC-7, Garry wrote:
> I'm trying to manipulate family tree data using Python.
> 
> I'm using Linux and Python 2.7.3, and my data files are saved as Linux-formatted CSV files.
> 
> The data appears in this format:
> 
> 
> 
> Marriage,Husband,Wife,Date,Place,Source,Note0x0a
> 
> Note: the Source field or the Note field can contain quoted data (same as the Place field)
> 
> 
> 
> Actual data:
> 
> [F0244],[I0690],[I0354],1916-06-08,"Neely's Landing, Cape Gir. Co, MO",,0x0a
> 
> [F0245],[I0692],[I0355],1919-09-04,"Cape Girardeau Co, MO",,0x0a
> 
> 
> 
> code snippet follows:
> 
> 
> 
> import os
> import re
> 
> # I'm using the following regex in an attempt to decode the data:
> RegExp2 = "^(\[[A-Z]\d{1,}\])\,(\[[A-Z]\d{1,}\])\,(\[[A-Z]\d{1,}\])\,(\d{,4}\-\d{,2}\-\d{,2})\,(.*|\".*\")\,(.*|\".*\")\,(.*|\".*\")"
> 
> line = "[F0244],[I0690],[I0354],1916-06-08,\"Neely's Landing, Cape Gir. Co, MO\",,"
> 
> (Marriage,Husband,Wife,Date,Place,Source,Note) = re.split(RegExp2,line)
> 
> # However, this does not decode the 7 fields.
> # The following error is displayed:
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: too many values to unpack
> 
> # When I use xx the fields apparently get unpacked.
> xx = re.split(RegExp2,line)
> 
> #
> 
> >>> print xx[0]
> 
> >>> print xx[1]
> [F0244]
> >>> print xx[5]
> "Neely's Landing, Cape Gir. Co, MO"
> >>> print xx[6]
> 
> >>> print xx[7]
> 
> >>> print xx[8]
> 
> 
> Why is there an extra NULL field before and after my record contents?
> 
> I'm stuck; comments and solutions are greatly appreciated.
> 
> 
> 
> Garry
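
On the question above about the empty first and last elements, here is a minimal sketch of what re.split() appears to be doing. When the pattern contains capturing groups, re.split() returns the text before the match, then the groups, then the text after the match. Because this pattern matches the whole line, the text on either side is an empty string, so the list has 9 items rather than 7. The pattern below is the one from the post rewritten as a raw string (my assumption is that it matches the same text), and re.match() with .groups() is shown as one way to get only the seven fields. It uses print() so it should also run on Python 3.

```python
import re

# The pattern from the post, rewritten as a raw string with the
# unneeded backslashes before commas and hyphens dropped.
pattern = re.compile(
    r'^(\[[A-Z]\d+\]),(\[[A-Z]\d+\]),(\[[A-Z]\d+\]),'
    r'(\d{,4}-\d{,2}-\d{,2}),(.*|".*"),(.*|".*"),(.*|".*")'
)

line = '[F0244],[I0690],[I0354],1916-06-08,"Neely\'s Landing, Cape Gir. Co, MO",,'

# split() keeps the text on both sides of the match plus every group.
# The match covers the whole line, so both sides are empty strings.
parts = pattern.split(line)
print(len(parts))            # 1 (before) + 7 (groups) + 1 (after)
print(repr(parts[0]), repr(parts[-1]))

# match() returns only the captured groups, which unpack into 7 names.
m = pattern.match(line)
marriage, husband, wife, date, place, source, note = m.groups()
print(place)
```

Note that the Place value still carries its surrounding double quotes here, since the regex captures them as part of the field.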

Thanks, everyone, for your comments. I'm new to Python, but I can get around in Perl and regular expressions. I sure was taking the long way trying to get the CSV data parsed.
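
For anyone finding this thread later, here is a sketch of the shorter route, assuming the standard library's csv module is a fit for this data. csv.reader understands quoted fields, so the comma inside the Place field does not split it, and the quotes are removed from the value. The two sample lines are the ones from my original post; the names sample_lines and records are only for illustration.

```python
import csv

# The two sample records from the original post (trailing newline omitted).
sample_lines = [
    '[F0244],[I0690],[I0354],1916-06-08,"Neely\'s Landing, Cape Gir. Co, MO",,',
    '[F0245],[I0692],[I0355],1919-09-04,"Cape Girardeau Co, MO",,',
]

records = []
# csv.reader accepts any iterable of lines, so an open file object
# can be passed in place of this list.
for row in csv.reader(sample_lines):
    # Each row is a list of 7 strings; empty fields come back as ''.
    marriage, husband, wife, date, place, source, note = row
    records.append(row)
    print(marriage, date, place)
```

With a real file, the list would be replaced by the file object returned from open(); on Python 2 the csv documentation recommends opening the file in binary mode.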

I sure hope to teach myself Python. Maybe I need to look into courses offered at the local junior college!

Garry
 


