regular expression for space separated quoted string
Padraig Brady
padraig at linux.ie
Sun Sep 15 15:50:04 EDT 2002
Padraig Brady wrote:
> Padraig Brady wrote:
>
>> Eric Brunel wrote:
>>
>>> Padraig Brady wrote:
>>>
>>>
>>>> Hi, I'm trying to split a string that is separated
>>>> by spaces and also contains double quoted words which
>>>> can contain spaces:
>>>
>>>
>>
>> [snip]
>>
>>> What about:
>>>
>>>>>> p = r'[^ \t\n\v\f"]+|"[^"]*"'
>>>>>> re.findall(p, '1 2 3')
>>>>>
>>>>>
>>
>> FYI, I'm using the following:
>>
>> import fileinput, re
>> for line in fileinput.input():
>>     #split fields
>>     fields = re.findall('[^ "]+|"[^"]+"', line[:-1])
>>     #remove quotes
>>     fields = map(lambda field: field.replace('"', ''), listLine)
>>
>> thanks again,
>> Pádraig.
>>
>
> Just in case it's useful I might as well make it correct and more
> efficient:
>
> import fileinput, re
> reFieldSplitter = re.compile('[^ "]+|"[^"]+"')
> for line in fileinput.input():
>     #split fields
>     fields = reFieldSplitter.findall(line[:-1])
>     #remove quotes
>     fields = map(lambda field: field.replace('"', ''), fields)
Just to be more complete: compiling the regular expression as
above gave a 1.3% speedup, and changing the map(lambda...) to
a list comprehension gives a further 4% speedup.
import fileinput, re
reFieldSplitter = re.compile('[^ "]+|"[^"]+"')
for line in fileinput.input():
    #split fields
    fields = reFieldSplitter.findall(line[:-1])
    #remove quotes
    fields = [field.replace('"', '') for field in fields]
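
For anyone wanting to try the pattern without piping a file through
fileinput, here is a minimal sketch of the same two steps applied to
literal strings. The split_fields helper name is made up for this
illustration, and rstrip('\n') stands in for line[:-1] so that a final
line without a trailing newline is not truncated; both are assumptions,
not part of the code above.

```python
import re

# Same pattern as above: a run of non-space, non-quote characters,
# or a double-quoted run that may contain spaces.
reFieldSplitter = re.compile('[^ "]+|"[^"]+"')

def split_fields(line):
    # split_fields is a hypothetical helper name used only for this sketch.
    # Strip the trailing newline (assumption: replaces line[:-1]).
    fields = reFieldSplitter.findall(line.rstrip('\n'))
    # Remove the quotes from each field.
    return [field.replace('"', '') for field in fields]

print(split_fields('1 "two words" 3\n'))  # ['1', 'two words', '3']
print(split_fields('a "" b\n'))           # ['a', 'b']
```

Note that an empty quoted field ("") is dropped, because "[^"]+"
requires at least one character between the quotes. If empty fields
should be kept, the "[^"]*" form from Eric's suggestion would probably
be the one to use. Tabs are also treated as part of a field here, since
only the space character separates fields.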
Pádraig.