newby question: Splitting a string - separator

James Stroud jstroud at mbi.ucla.edu
Sat Dec 10 01:26:24 EST 2005


Steven D'Aprano wrote:
> On Fri, 09 Dec 2005 18:02:02 -0800, James Stroud wrote:
> 
> 
>>Thomas Liesner wrote:
>>
>>>Hi all,
>>>
>>>I have a text file which contains a single string with names.
>>>I want to split this string into its records and put them into a list.
>>>In "normal" cases I would do something like:
>>>
>>>
>>>
>>>>#!/usr/bin/python
>>>>inp = open("file")
>>>>data = inp.read()
>>>>names = data.split()
>>>>inp.close()
>>>
>>>
>>>The problem is that the names contain spaces and the records are also
>>>just separated by spaces. The only thing I can rely on is that the
>>>record separator is always more than a single whitespace character.
>>>
>>>I thought of something like defining the separator for split() by using
>>>a regex for "more than one whitespace". The regex for whitespace is \s, but
>>>what would I use for "more than one"? \s+?
>>>
>>>TIA,
>>>Tom
>>
>>The one I like best goes like this:
>>
>>py> data = "Guido van Rossum  Tim Peters     Thomas Liesner"
>>py> names = [n for n in data.split() if n]
>>py> names
>>['Guido', 'van', 'Rossum', 'Tim', 'Peters', 'Thomas', 'Liesner']
>>
>>I think it is theoretically faster (and more pythonic) than using regexes.
> 
> 
> 
> Yes, but the correct result would be:
> 
> ['Guido van Rossum', 'Tim Peters', 'Thomas Liesner']
> 
> Your code is short and elegant, but wrong.
> 
> It could also be shorter and more elegant:
> 
> # your version
> py> data = "Guido van Rossum  Tim Peters     Thomas Liesner"
> py> [n for n in data.split() if n]
> ['Guido', 'van', 'Rossum', 'Tim', 'Peters', 'Thomas', 'Liesner']
> 
> # my version
> py> data = "Guido van Rossum  Tim Peters     Thomas Liesner"
> py> data.split()
> ['Guido', 'van', 'Rossum', 'Tim', 'Peters', 'Thomas', 'Liesner']
> 
> The "if n" in the list comp is superfluous, and without that, the whole
> list comp is unnecessary.
> 
> 
> 
See my post from one hour before this one.
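
For reference, here is a minimal sketch of the split the original question asks for. It assumes records are always separated by two or more whitespace characters and that names contain only single spaces. The helper name split_records is made up for illustration; re.split is the standard-library call, and \s{2,} means "two or more whitespace characters" (\s+ would also match a single space and so split the names apart).

```python
import re


def split_records(data):
    """Split a string into records separated by 2+ whitespace characters.

    Hypothetical helper, not taken from the thread.
    """
    # strip() first so leading/trailing whitespace (e.g. a final newline
    # read from the file) does not produce empty records at the ends.
    return re.split(r"\s{2,}", data.strip())


data = "Guido van Rossum  Tim Peters     Thomas Liesner"
names = split_records(data)
print(names)  # ['Guido van Rossum', 'Tim Peters', 'Thomas Liesner']
```

Plain data.split() with no argument splits on every run of whitespace, which is why it breaks the names into single words.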


