ASCII delimited files

Thomas A. Bryan tbryan at python.net
Thu Nov 11 21:12:09 EST 1999


Al Christians wrote:
> 
> There's a trick with files of delimited fields.  If a string field
> contains a delimiter (e.g. a comma in a csv file), then the field
> gets enclosed in quotes.  

Ah.  That's probably what the poster wanted!  I just posted what I had 
lying around on my machine.  In my case, I had to deal with basically 
three field types: numeric values in a range, values from a list, 
and values that matched a regex (like a date).  They never contained 
the delimiter (usually a tab or the pipe '|' character).  
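
For illustration, here is a minimal sketch of those three field checks.
It is not the code that was posted earlier in the thread; the helper
names (in_range, one_of, matches, validate_line) and the column layout
are hypothetical, and it assumes the delimiter never occurs inside a
value, as described above.

```python
import re

def in_range(lo, hi):
    """Build a check for numeric text whose value lies in [lo, hi]."""
    def check(text):
        try:
            return lo <= float(text) <= hi
        except ValueError:
            return False
    return check

def one_of(*allowed):
    """Build a check for text that must be one of a fixed list of values."""
    allowed_set = frozenset(allowed)
    return lambda text: text in allowed_set

def matches(pattern):
    """Build a check for text that must match a regex in full."""
    compiled = re.compile(pattern)
    return lambda text: compiled.fullmatch(text) is not None

# Hypothetical column layout, invented for this example.
COLUMNS = [
    ("temperature", in_range(-50.0, 150.0)),
    ("status", one_of("OK", "WARN", "FAIL")),
    ("date", matches(r"\d{4}-\d{2}-\d{2}")),
]

def validate_line(line, delimiter="|"):
    """Return a list of problems found in one line; empty means valid."""
    fields = line.rstrip("\n").split(delimiter)
    if len(fields) != len(COLUMNS):
        return ["expected %d fields, got %d" % (len(COLUMNS), len(fields))]
    problems = []
    for (name, check), value in zip(COLUMNS, fields):
        if not check(value):
            problems.append("bad %s: %r" % (name, value))
    return problems
```

A garbled date such as "1999-1x-11" is then reported as a problem
instead of being loaded silently.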

I guess I should have said that it helped me, but YMMV. :)

The big problem for me was that the values came from hardware that 
sometimes garbled dates or did other nasty things.  The format of the 
file (number and ordering of the columns) wasn't set in stone either, 
so I needed the capability to detect whether a new data file given 
to me possibly deviated from the one the parser expected.  Thus, the 
ability to build a parser quickly that would validate the data was 
what I wanted.  I still like my idea since, in my opinion, one of the 
worst things is loading corrupt data into a database.  The parser code 
really ought to handle delimiter characters nested in the values, but 
it doesn't.  I don't need that, and I lack sufficient leisure time to 
do it just for fun right now.  If somebody else has the need/time, feel 
free to take the code and make something really usable out of it.
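
As a rough sketch of the two points above (noticing when the column
layout has changed, and coping with delimiters inside quoted values),
something like the following might do.  It relies on the csv module in
the current Python standard library, which is not part of the code I
posted; EXPECTED_HEADER and read_checked are made-up names, and the
sketch assumes the file starts with a header row.

```python
import csv
import io

# Hypothetical header that the parser was written against.
EXPECTED_HEADER = ["name", "status", "date"]

def read_checked(text, delimiter=","):
    """Parse delimited text, rejecting files whose layout has changed.

    csv.reader handles quoted fields, so a delimiter inside quotes
    stays part of the value instead of splitting it.
    """
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    rows = list(reader)
    if not rows or rows[0] != EXPECTED_HEADER:
        raise ValueError("unexpected header: %r" % (rows[:1],))
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != len(EXPECTED_HEADER):
            raise ValueError("record %d has %d fields, expected %d"
                             % (number, len(row), len(EXPECTED_HEADER)))
    return rows[1:]
```

For example, the record "Smith, John",OK,1999-11-11 comes back as
three fields, with the comma kept inside the first one.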



---Tom

> What I usually wind up doing to circumvent these problems with delimited
> files is loading them back into a spreadsheet, typically the same one
> that wrote them (I suppose that a database program would
> offer the same option), and then I write them out as tab-delimited.
> Tabs within fields have just about never come up in my work, so
> far.
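
The same round trip can probably be done without the spreadsheet.  A
minimal sketch, assuming the input is ordinary comma-separated text
with double-quoted fields and using the csv module in the current
Python standard library; csv_to_tsv is a made-up name, and the sketch
refuses any field that already contains a tab rather than guessing
what to do with it.

```python
import csv
import io

def csv_to_tsv(csv_text):
    """Rewrite comma-separated text as tab-delimited text."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in reader:
        for field in row:
            if "\t" in field:
                # A tab inside a value would bring back the original problem.
                raise ValueError("field contains a tab: %r" % (field,))
        writer.writerow(row)
    return out.getvalue()
```

Once the file is tab-delimited and known to be free of embedded tabs,
a plain split on the tab character is enough to read it back.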



