splitting words with brackets

Qiangning Hong hongqn at gmail.com
Wed Jul 26 16:04:43 EDT 2006


I've got some strings to split.  They are mainly words, but some words
are inside a pair of brackets and should be treated as one unit.  I
would prefer to use re.split, but I haven't managed to write a working
pattern after hours of work.

Example:

"a (b c) d [e f g] h i"
should be split into
["a", "(b c)", "d", "[e f g]", "h", "i"]
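
To pin down the expected behaviour, here is a rough reference sketch
using a plain loop rather than a regular expression.  The name
split_words is just a placeholder, and it assumes brackets are never
nested and are always closed:

```python
def split_words(s):
    """Split on whitespace, keeping (...) and [...] groups as single units.

    Assumption: brackets are not nested and every opener has a closer.
    """
    closers = {"(": ")", "[": "]"}
    words = []
    current = []      # characters of the word being built
    closing = None    # closing bracket we are waiting for, if any
    for ch in s:
        if closing is not None:
            # Inside a bracketed group: keep everything, including spaces.
            current.append(ch)
            if ch == closing:
                closing = None
        elif ch.isspace():
            # Whitespace outside brackets ends the current word.
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
            if ch in closers:
                closing = closers[ch]
    if current:
        words.append("".join(current))
    return words


print(split_words("a (b c) d [e f g] h i"))
# ['a', '(b c)', 'd', '[e f g]', 'h', 'i']
```

This walks the string one character at a time in pure Python, so it is
only meant to define the result, not to be fast.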

As speed is a factor to consider, it would be best if a single-line
regular expression could handle this.  I tried this, but it failed:
re.split(r"(?![\(\[].*?)\s+(?!.*?[\)\]])", s).  It works for "(a b) c"
but not for "a (b c)" :(

Any hints?



