string split without consumption

robert no-spam at not-existing.invalid
Sat Feb 2 10:14:18 EST 2008


Tim Chase wrote:
>> this didn't work elegantly as expected:
>>
>>  >>> ss
>> 'owi\nweoifj\nfheu\n'
>>  >>> re.split(r'(?m)$',ss)
>> ['owi\nweoifj\nfheu\n']
> 
> Do you have a need to use a regexp?

I'd like the general case - split without consumption.
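
As a minimal sketch of what I mean by "split without consumption": the helper below (split_keep is a hypothetical name, not something in the re module) walks the matches with re.finditer and cuts after each one, so the separator stays attached to the piece before it. It assumes the pattern matches at least one character; empty matches are simply skipped.

```python
import re

def split_keep(pattern, text):
    """Split text after each non-empty match of pattern.

    The matched separator is kept at the end of the piece that
    precedes it, so ''.join(result) == text always holds.
    """
    pieces = []
    start = 0
    for m in re.finditer(pattern, text):
        if m.end() == m.start():
            continue  # skip empty matches, nothing to cut on
        pieces.append(text[start:m.end()])
        start = m.end()
    # The tail is always appended: '' if the text ended on a separator,
    # otherwise the unfinished remainder.
    pieces.append(text[start:])
    return pieces
```

Like str.split, this always yields a last element that is either empty or the unfinished remainder.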

> 
>>>> ss.splitlines(True)
> ['owi\n', 'weoifj\n', 'fheu\n']
> 

Thanks. Yet this does not fit "naturally" and consistently into my 
line-processing algorithm, specifically the further buffering. Compare 
e.g. ss.split('\n'):

 >>> 'owi\nweoifj\nfheu\n'.split('\n')
['owi', 'weoifj', 'fheu', '']
 >>> 'owi\nweoifj\nfheu\nxx'.split('\n')
['owi', 'weoifj', 'fheu', 'xx']

This is consistent in that regard: there is always a last element, 
either empty or a partial line, which can be fed directly as the start 
of the further input buffering.
With the .splitlines(True) result you need to fiddle and test the last 
element's last char... And with .splitlines(False) you fail altogether, 
because you can no longer tell whether the last line was terminated.
So I'd call this a "wrong" implementation.
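
A rough sketch of the buffering I have in mind, assuming input arrives in arbitrary chunks (feed is just an illustrative name): the last element of split('\n') is carried over as the start of the next buffer, with no special cases.

```python
def feed(buffer, chunk):
    """Append chunk to buffer and return (complete_lines, new_buffer).

    split('\n') always yields one trailing element: '' if the data
    ended on a newline, otherwise the unfinished line. That element
    becomes the new buffer.
    """
    parts = (buffer + chunk).split('\n')
    return parts[:-1], parts[-1]

buf = ''
lines1, buf = feed(buf, 'owi\nweo')      # 'weo' is held back as a partial line
lines2, buf = feed(buf, 'ifj\nfheu\n')   # the partial line is completed here
```

With splitlines() the same loop needs an extra check on the last element's final character.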


Robert


