[Tutor] str.split and quotes
Tony Meyer
tameyer at ihug.co.nz
Wed Apr 6 07:59:23 CEST 2005
> >>> s = 'Hi "Python Tutors" please help'
> >>> s.split()
> ['Hi', '"Python', 'Tutors"', 'please', 'help']
> >>>
>
> I wish it would leave the stuff in quotes intact:
>
> ['Hi', '"Python Tutors"', 'please', 'help']
You can do this with a regular expression:
>>> import re
>>> re.findall(r'\".*\"|[^ ]+', s)
['Hi', '"Python Tutors"', 'please', 'help']
The regular expression says to find patterns that are either a quote (\")
then any number of any characters (.*) then a quote (\"), or (|) one or more
of any character except a space ([^ ]+).
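One caveat: .* is greedy, so if the string holds two quoted phrases, the pattern above matches from the first quote all the way to the last one. Here is a minimal sketch of a variant that avoids this, assuming a quoted phrase never contains another quote character. The second sample string (s2) and the name "pattern" are made up for illustration.

```python
import re

# Either a quoted phrase with no quote characters inside it,
# or a run of one or more non-space characters.
pattern = r'"[^"]*"|[^ ]+'

s = 'Hi "Python Tutors" please help'
s2 = 'Hi "Python Tutors" please "help me"'  # hypothetical second example

first = re.findall(pattern, s)
second = re.findall(pattern, s2)
# The original greedy pattern, applied to s2 for comparison.
greedy = re.findall(r'\".*\"|[^ ]+', s2)

print(first)   # ['Hi', '"Python Tutors"', 'please', 'help']
print(second)  # ['Hi', '"Python Tutors"', 'please', '"help me"']
print(greedy)  # ['Hi', '"Python Tutors" please "help me"']
```

For a string with a single quoted phrase, both patterns give the same result.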
Or you can just join them back up again:
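>>> combined = []
>>> b = []
>>> for a in s.split():
...     if a.count('"') % 2:
...         # Odd number of quotes: this word opens or closes a phrase.
...         if combined:
...             combined.append(a)
...             b.append(" ".join(combined))
...             combined = []
...         else:
...             combined.append(a)
...     elif combined:
...         # Inside a quoted phrase: keep collecting words.
...         combined.append(a)
...     else:
...         b.append(a)
...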
>>> b
['Hi', '"Python Tutors"', 'please', 'help']
(There are probably tidier ways of doing that.)
Or you can do the split yourself:
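def split_no_quotes(s):
    """Split s on spaces, keeping double-quoted phrases together."""
    index_start = 0
    index_end = 0
    in_quotes = False
    result = []
    while index_end < len(s):
        if s[index_end] == '"':
            in_quotes = not in_quotes
        if s[index_end] == ' ' and not in_quotes:
            # End of a word; skip empty pieces from repeated spaces.
            if index_end > index_start:
                result.append(s[index_start:index_end])
            index_start = index_end + 1
        index_end += 1
    # Add the final word, if any (this also copes with an empty string).
    if index_start < index_end:
        result.append(s[index_start:index_end])
    return result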
>>> print(split_no_quotes(s))
['Hi', '"Python Tutors"', 'please', 'help']
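Another option, not one of the approaches above, is the standard library's shlex module. This is a minimal sketch, assuming shell-style quoting rules suit your input. With posix=False, shlex.split keeps the quote characters as part of the token, which matches the output asked for here.

```python
import shlex

s = 'Hi "Python Tutors" please help'

# posix=False keeps the surrounding quotes on the quoted token;
# the default (posix=True) strips them.
kept = shlex.split(s, posix=False)
stripped = shlex.split(s)

print(kept)      # ['Hi', '"Python Tutors"', 'please', 'help']
print(stripped)  # ['Hi', 'Python Tutors', 'please', 'help']
```

Note that shlex also treats single quotes as quote characters, so text containing an unmatched apostrophe raises ValueError.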
=Tony.Meyer