Quote-aware string splitting

George Sakkis gsakkis at rutgers.edu
Mon Apr 25 22:37:45 EDT 2005


> "J. W. McCall" <jmccall at houston.rr.com> writes:
> >
> > I need to split a string as per string.strip(), but with a
> > modification: I want it to recognize quoted strings and return them as
> > one list item, regardless of any whitespace within the quoted string.
> >
> > For example, given the string:
> >
> > 'spam "the life of brian" 42'
> >
> > I'd want it to return:
> >
> > ['spam', 'the life of brian', '42']
> >
> > I see no standard library function to do this, so what would be the
> > most simple way to achieve this?  This should be simple, but I must be
> > tired as I'm not currently able to think of an elegant way to do this.
> >
> > Any ideas?
>
> How about the csv module? It seems like it might be overkill, but it
> does already handle that sort of quoting
>
>   >>> import csv
>   >>> csv.reader(['spam "the life of brian" 42'], delimiter=' ').next()
>   ['spam', 'the life of brian', '42']
>
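
For anyone trying the csv suggestion on a newer interpreter, here is a
minimal sketch assuming Python 3, where reader objects no longer have a
.next() method and the built-in next() is used instead. The variable
names are only illustrative:

```python
import csv

line = 'spam "the life of brian" 42'

# csv.reader takes an iterable of lines, so wrap the single string in a list.
# With delimiter=' ' the default quotechar (") keeps the quoted part together.
fields = next(csv.reader([line], delimiter=' '))
print(fields)
```

Note that this treats every single space as a delimiter, so runs of
spaces produce empty fields, and only double quotes are recognized by
default.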


I don't know if this is as good as CSV's splitter, but it works
reasonably well for me:

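import re

# Alternatives are tried in order at each position, so a quoted substring
# wins over a plain word. The surrounding quotes are kept in each match.
regex = re.compile(r'''
                   '.*?' |  # single quoted substring
                   ".*?" |  # double quoted substring
                   \S+      # all the rest
                   ''', re.VERBOSE)

# print(...) with one parenthesized argument works on old and new Pythons
print(regex.findall('''
                    This is 'single "quoted" string'
                    followed by a "double 'quoted' string"
                    '''))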
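
One thing to be aware of: findall returns the quoted substrings with
their quotes still attached, whereas the original question wanted them
removed. A rough sketch of one way to drop them, assuming capture groups
are acceptable; quote_split and _token are hypothetical names, not
anything from the standard library:

```python
import re

# Hypothetical helper: same three alternatives as the pattern above,
# but each one is wrapped in a group so the quotes can be left out.
_token = re.compile(r'''
    '(.*?)' |   # group 1: contents of a single quoted substring
    "(.*?)" |   # group 2: contents of a double quoted substring
    (\S+)       # group 3: any other run of non-whitespace
    ''', re.VERBOSE)

def quote_split(text):
    # Exactly one group takes part in each match; lastindex says which,
    # so an empty quoted string still yields an empty item.
    return [m.group(m.lastindex) for m in _token.finditer(text)]

print(quote_split('spam "the life of brian" 42'))
```

Like the pattern above, this makes no attempt to handle escaped quotes
or quotes that open in the middle of a word.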

George



