Escaping commas within parens in CSV parsing?

Edvard Majakari edvard+news at majakari.net
Fri Jul 1 06:52:48 EDT 2005


felciano at gmail.com writes:

> I am trying to use the csv module to parse a column of values
> containing comma-delimited values with unusual escaping:
>
> AAA, BBB, CCC (some text, right here), DDD
>
> I want this to come back as:
>
> ["AAA", "BBB", "CCC (some text, right here)", "DDD"]

Quick and somewhat dirty: change your delimiter to a character that never
occurs in the fields (e.g. the null character '\0').

Example:

>>> s = 'AAA\0 BBB\0 CCC (some text, right here)\0 DDD'
>>> [f.strip() for f in s.split('\0')]
['AAA', 'BBB', 'CCC (some text, right here)', 'DDD']

But then you need to be certain that no field contains the null character,
by checking each one:

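colsep = '\0'

class IllegalCharException(ValueError):
    """Raised when a field contains the column separator."""

for field in inputs:  # inputs: the raw field strings
    if colsep in field:
        # %r makes the otherwise invisible null character show up as '\x00'
        raise IllegalCharException('invalid chars in field %r' % field)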

If you need to stick with a comma as the separator and the format is
relatively fixed, I'd probably use a parser module instead. Regular
expressions are nice too, but it is easy to make a mistake with those, and
for non-trivial stuff they tend to become write-only.
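
If pulling in a full parser module is more than you need, a small hand-written
scanner may be enough. The following is only a minimal sketch, not something
the csv module offers: it assumes the only escaping rule is "commas inside
parentheses do not separate fields" and that parentheses are balanced. The
name split_outside_parens is made up for this example.

```python
def split_outside_parens(line, sep=','):
    """Split line on sep, ignoring separators inside parentheses.

    Hypothetical helper: assumes balanced parentheses and no other
    quoting or escaping rules.
    """
    fields = []
    current = []
    depth = 0  # current parenthesis nesting level
    for ch in line:
        if ch == '(':
            depth += 1
        elif ch == ')':
            if depth == 0:
                raise ValueError('unbalanced ")" in %r' % line)
            depth -= 1
        if ch == sep and depth == 0:
            # A separator outside all parentheses ends the current field.
            fields.append(''.join(current).strip())
            current = []
        else:
            current.append(ch)
    if depth != 0:
        raise ValueError('unbalanced "(" in %r' % line)
    fields.append(''.join(current).strip())
    return fields


print(split_outside_parens('AAA, BBB, CCC (some text, right here), DDD'))
# ['AAA', 'BBB', 'CCC (some text, right here)', 'DDD']
```

Tracking the nesting depth rather than a simple in/out flag means nested
parentheses work too. If the real data also contains quoted fields or escaped
parentheses, this sketch will not handle them.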

--
# Edvard Majakari		Software Engineer
# PGP PUBLIC KEY available    	Soli Deo Gloria!

$_ = '456476617264204d616a616b6172692c20612043687269737469616e20'; print
join('',map{chr hex}(split/(\w{2})/)),uc substr(crypt(60281449,'es'),2,4),"\n";


