ASCII delimited files
Thomas A. Bryan
tbryan at python.net
Thu Nov 11 21:12:09 EST 1999
Al Christians wrote:
>
> There's a trick with files of delimited fields. If a string field
> contains a delimiter (e.g. a comma in a csv file), then the field
> gets enclosed in quotes.
Ah. That's probably what the poster wanted! I just posted what I had
lying around on my machine. In my case, I had to deal with basically
those three field types: numeric values in a range, values from a list,
and values that matched a regex (like a date). They never contained
the delimiter (usually a tab or the pipe '|' character).
I guess I should have said that it helped me, but YMMV. :)
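
For anyone who wants to see the shape of the idea, here is a minimal sketch of validating those three field types. It is not the code from the earlier post; the helper names (in_range, one_of, matches, validate_line) and the three-column layout are made up for illustration, and it assumes the delimiter never appears inside a value.

```python
import re

def in_range(lo, hi):
    """Return a check that accepts numeric text within [lo, hi]."""
    def check(text):
        try:
            return lo <= float(text) <= hi
        except ValueError:
            return False
    return check

def one_of(*allowed):
    """Return a check that accepts only the listed values."""
    allowed_set = frozenset(allowed)
    return lambda text: text in allowed_set

def matches(pattern):
    """Return a check that accepts text fully matching a regex."""
    compiled = re.compile(pattern)
    return lambda text: compiled.fullmatch(text) is not None

def validate_line(line, checks, delimiter="|"):
    """Return a list of problems found in one delimited line (empty if OK)."""
    fields = line.rstrip("\n").split(delimiter)
    if len(fields) != len(checks):
        # A wrong column count suggests the file format itself has changed.
        return ["expected %d fields, got %d" % (len(checks), len(fields))]
    return ["field %d is invalid: %r" % (i, value)
            for i, (value, check) in enumerate(zip(fields, checks))
            if not check(value)]

# Hypothetical layout: temperature | status | date
CHECKS = [in_range(-50, 150), one_of("OK", "FAIL"), matches(r"\d{4}-\d{2}-\d{2}")]
```

Calling validate_line("21.5|OK|1999-11-11", CHECKS) gives an empty list, while a garbled date or a missing column comes back as a list of messages that can be logged before anything is loaded.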
The big problem for me was that the values came from hardware that
sometimes garbled dates or did other nasty things. The format of the
file (number and ordering of the columns) wasn't set in stone either,
so I needed the capability to detect whether a new data file given
to me possibly deviated from the one the parser expected. Thus, the
ability to build a parser quickly that would validate the data was
what I wanted. I still like my idea since, in my opinion, one of the
worst things is loading corrupt data into a database. The parser code
should deal with delimiter characters nested in the values. I don't
need it, and I lack sufficient leisure time to do it just for fun right
now. If somebody else has the need/time, feel free to take the code
and make something really usable out of it.
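
As a starting point for that, here is a small sketch of splitting a line whose values may contain the delimiter inside quotes. It leans on the csv module from current Python's standard library rather than on anything in the posted parser, and the function name split_quoted is just a placeholder.

```python
import csv
import io

def split_quoted(line, delimiter=","):
    """Split one delimited line, honoring double-quoted fields.

    Returns an empty list for an empty line.
    """
    reader = csv.reader(io.StringIO(line), delimiter=delimiter)
    return next(reader, [])

# A comma inside quotes stays part of the field.
fields = split_quoted('1,"Bryan, Thomas",ok')
```

The resulting list of fields could then be handed to whatever per-field validation the parser already does, so the quoting logic stays separate from the checks.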
---Tom
> What I usually wind up doing to circumvent these problems with delimited
> files is loading them back into a spreadsheet, typically the same one
> that wrote them (I suppose that a database program would
> offer the same option), and then I write them out as tab-delimited.
> Tabs within fields are just about never in a lifetime in my work, so
> far.