Regular expression query

Vlastimil Brom vlastimil.brom at gmail.com
Sun Mar 12 17:20:30 EDT 2017


2017-03-12 17:22 GMT+01:00  <rahulrasal at gmail.com>:
> Hi All,
>
> I have a string which looks like
>
> aaaaa,bbbbb,ccccc "4873898374", ddddd, eeeeee "3343,23,23,5,,5,45", fffff "5546,3434,345,34,34,5,34,543,7"
>
> It is a comma-separated string, but some of the fields have a double-quoted string as part of them (and that double-quoted string can contain commas).
> The above string has only 6 fields. The first is aaaaa, the second is bbbbb and the last is fffff "5546,3434,345,34,34,5,34,543,7".
> How can I split this string into its fields using a regular expression? Or, if there is any other way to do this, please let me know.
>
> Thanks in advance

Hi,
would something like the following pattern fulfill the requirements?
(It doesn't handle possible empty fields, as mentioned in other posts;
the surrounding whitespace can be removed separately.)

>>> import re
>>> re.findall(r'(?:(?:"[^"]*"|[^,]))+(?=,|$)', 'aaaaa,bbbbb,ccccc "4873898374", ddddd, eeeeee "3343,23,23,5,,5,45", fffff "5546,3434,345,34,34,5,34,543,7"')
['aaaaa', 'bbbbb', 'ccccc "4873898374"', ' ddddd', ' eeeeee "3343,23,23,5,,5,45"', ' fffff "5546,3434,345,34,34,5,34,543,7"']
>>>
>>> for field in re.findall(r'(?:(?:"[^"]*"|[^,]))+(?=,|$)', 'aaaaa,bbbbb,ccccc "4873898374", ddddd, eeeeee "3343,23,23,5,,5,45", fffff "5546,3434,345,34,34,5,34,543,7"'): print(field.strip())
...
aaaaa
bbbbb
ccccc "4873898374"
ddddd
eeeeee "3343,23,23,5,,5,45"
fffff "5546,3434,345,34,34,5,34,543,7"
>>>
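
If empty fields need to be kept as well, one possible alternative is to
split on the commas that lie outside quotes instead of matching the fields
themselves. The following is only a sketch: it assumes the double quotes
are always balanced and never escaped, and the names SPLIT_OUTSIDE_QUOTES
and split_fields are made up for this example.

```python
import re

# A comma counts as a separator only if an even number of double quotes
# follows it up to the end of the string, i.e. it is not inside a quoted part.
# Assumption: quotes are balanced and there are no escaped quotes.
SPLIT_OUTSIDE_QUOTES = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')


def split_fields(line):
    """Split line on commas outside double quotes and strip whitespace."""
    return [field.strip() for field in SPLIT_OUTSIDE_QUOTES.split(line)]


line = ('aaaaa,bbbbb,ccccc "4873898374", ddddd, '
        'eeeeee "3343,23,23,5,,5,45", '
        'fffff "5546,3434,345,34,34,5,34,543,7"')
fields = split_fields(line)
for field in fields:
    print(field)
```

Because re.split keeps the empty strings between adjacent separators, an
input such as 'a,,b' should come back as three fields with an empty one in
the middle. The lookahead rescans the rest of the string for every comma,
so this is probably not a good fit for very long lines.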

hth,
   vbr


