stupid perl question
John Machin
sjmachin at lexicon.net
Fri May 26 21:11:40 EDT 2006
On 27/05/2006 9:51 AM, BJörn Lindqvist wrote:
>> how can i split a string that contains white spaces and '_'
>>
>> any clue?
>
> If the white spaces and the '_' should be applied equivalently on the
> input and you can enumerate all white space characters, you could do
> like this:
Yes, you could write out the whitespace characters for the 8-bit
encoding of your choice, or you could find them using Python (and get
some possibly surprising answers):
>>> mkws = lambda enc, sz=256: "".join([chr(i) for i in range(sz) if
...     chr(i).decode(enc, 'ignore').isspace()])
>>> mkws('cp1252')
'\t\n\x0b\x0c\r\x1c\x1d\x1e\x1f \xa0'
>>> mkws('latin1')
'\t\n\x0b\x0c\r\x1c\x1d\x1e\x1f \x85\xa0'
>>> mkws('cp1251')
'\t\n\x0b\x0c\r\x1c\x1d\x1e\x1f \xa0'
>>> mkws('ascii', 128)
'\t\n\x0b\x0c\r\x1c\x1d\x1e\x1f '
and compare the last one with the result for the C locale:
>>> "".join([chr(i) for i in range(256) if chr(i).isspace()])
'\t\n\x0b\x0c\r '
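The mkws lambda above relies on Python 2's str.decode, which str no longer has in Python 3. A minimal sketch of the same idea for Python 3 follows, assuming you are happy to get the whitespace byte values back as a bytes object (that return type is an assumption of this sketch, not something from the session above):

```python
def mkws(enc, sz=256):
    """Return the byte values below sz that decode to whitespace in encoding enc."""
    return bytes(
        i
        for i in range(sz)
        # Bytes that are undefined in enc decode to '' under 'ignore',
        # and ''.isspace() is False, so they are skipped.
        if bytes([i]).decode(enc, "ignore").isspace()
    )


if __name__ == "__main__":
    for name, size in (("cp1252", 256), ("latin1", 256), ("ascii", 128)):
        print(name, mkws(name, size))
```

Here the test is Unicode's notion of whitespace applied after decoding, so the answers should not depend on the C locale.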
>
> def split_helper(list, delims):
>     if not delims:
>         return list
>     ch = delims[0]
>     lst = []
>     for item in list:
>         lst += split_helper(item.split(ch), delims[1:])
>     return lst
>
> def split(str, delims):
>     return split_helper([str], delims)
>
> >>> split("foo_bar eh", "_ ")
> ['foo', 'bar', 'eh']
>
> Though I bet someone will post a one-line solution in the next 30
> minutes. :)
Two one-liners, depending on what the OP really wants:
>>> import re
>>> re.split(r"[\s_]", "foo_bar      zot plugh _ xyzzy")
['foo', 'bar', '', '', '', '', '', 'zot', 'plugh', '', '', 'xyzzy']
which is what your ever-so-slightly-baroque effort does :-)
or
>>> re.split(r"[\s_]+", "foo_bar      zot plugh _ xyzzy")
['foo', 'bar', 'zot', 'plugh', 'xyzzy']
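To check the claim that the first one-liner does the same job as the recursive helper, here is a rough Python 3 sketch. The names split_on_chars, regex_split and regex_split_runs are made up for illustration, and it assumes every delimiter is a single character:

```python
import re


def split_helper(items, delims):
    """Split every item on each delimiter character in turn (recursive)."""
    if not delims:
        return items
    ch = delims[0]
    result = []
    for item in items:
        result += split_helper(item.split(ch), delims[1:])
    return result


def split_on_chars(text, delims):
    return split_helper([text], delims)


def regex_split(text, delims):
    """Split on any single delimiter character, keeping empty fields."""
    if not delims:
        return [text]
    # re.escape protects characters that are special inside a character class.
    return re.split("[" + re.escape(delims) + "]", text)


def regex_split_runs(text, delims):
    """Split on runs of delimiter characters, so no empty fields appear between words."""
    if not delims:
        return [text]
    return re.split("[" + re.escape(delims) + "]+", text)


if __name__ == "__main__":
    sample = "foo_bar      zot plugh _ xyzzy"
    print(split_on_chars(sample, "_ ") == regex_split(sample, "_ "))
    print(regex_split_runs(sample, "_ "))
```

Note that the "+" version can still give an empty string at either end if the input starts or ends with a delimiter.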
Cheers,
John