Question about regular expression
Tim Chase
python.list at tim.thechases.com
Wed Sep 30 15:20:15 EDT 2015
On 2015-09-30 11:34, massi_srb at msn.com wrote:
> firstly the description of my problem. I have a string in the
> following form:
>
> s = "name1 name2(1) name3 name4 (1, 4) name5(2) ..."
>
> that is a string made up of groups in the form 'name' (letters
> only) plus possibly a tuple containing 1 or 2 integer values.
> Blanks can be placed between names and tuples or not, but they
> surely are placed between two groups. I would like to process this
> string in order to get a dictionary like this:
>
> d = {
> "name1":(0, 0),
> "name2":(1, 0),
> "name3":(0, 0),
> "name4":(1, 4),
> "name5":(2, 0),
> }
>
> I guess this problem can be tackled with regular expressions, b
First out of the gate, I suggest you follow Emile's advice and try
using string expressions. However, if you *want* to do it with
regular expressions, you can. It's ugly and might be fragile, but
#############################################################
import re
s = "name1 name2(1) name3 name4 (1, 4) name5(2) ..."
r = re.compile(r"""
    \b              # start at a word boundary
    (\w+)           # capture the word
    \s*             # optional whitespace
    (?:             # start an optional grouping for things in the parens
        \(          # a literal open-paren
        \s*         # optional whitespace
        (\d+)       # capture the number in those parens
        (?:         # start a second optional grouping for the stuff after a comma
            \s*     # optional whitespace
            ,       # a literal comma
            \s*     # optional whitespace
            (\d+)   # the second number
        )?          # make the comma and following number optional
        \)          # a literal close-paren
    )?              # make that stuff in parens optional
    """, re.X)
d = {}
for m in r.finditer(s):
    a, b, c = m.groups()
    d[a] = (int(b or 0), int(c or 0))
from pprint import pprint
pprint(d)
#############################################################
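To make the "fragile" caveat concrete, here is a small sketch of one input the pattern does not cope with: a blank before the closing paren. The `parse` helper and the `PATTERN` name are only illustrative packaging, and the one-line pattern is meant to be the verbose one above with its comments and whitespace stripped.

```python
import re

# The verbose pattern from above, collapsed onto one line (illustrative only).
PATTERN = re.compile(r"\b(\w+)\s*(?:\(\s*(\d+)(?:\s*,\s*(\d+))?\))?")

def parse(s):
    """Hypothetical helper: build the dict the same way as the loop above."""
    d = {}
    for m in PATTERN.finditer(s):
        a, b, c = m.groups()
        d[a] = (int(b or 0), int(c or 0))
    return d

well_formed = parse("name4 (1, 4)")
sloppy = parse("name4 (1, 4 )")  # note the blank before ")"
print(well_formed)
print(sloppy)
```

In the second call the parenthesised part no longer matches, so name4 falls back to (0, 0) and the digits get picked up as if they were names. Allowing `\s*` before the `\)` would be one way to tolerate that particular case, though other malformed input could still slip through.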
I'd stick with the commented version of the regexp if you were to use
this anywhere so that others can follow what you're doing.
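For comparison, a plain string-method approach might look roughly like the sketch below. This is not Emile's code, just one possible reading of "do it without a regexp"; `parse_groups` is a made-up name, and it assumes well-formed input where every "(" has its ")" and tuples hold only integers.

```python
def parse_groups(s):
    """Hypothetical regex-free parser for 'name name(1) name (1, 4)' strings."""
    d = {}
    # Each chunk ending at ")" holds zero or more bare names followed by
    # one name that owns the tuple; the last chunk has bare names only.
    for chunk in s.split(")"):
        head, paren, args = chunk.partition("(")
        # Skip anything that is not a plain word (e.g. the trailing "...").
        names = [n for n in head.split() if n.isalnum()]
        for name in names:
            d[name] = (0, 0)
        if paren and names:
            nums = [int(x) for x in args.split(",")]
            nums += [0] * (2 - len(nums))  # pad a single value with 0
            d[names[-1]] = tuple(nums)
    return d

s = "name1 name2(1) name3 name4 (1, 4) name5(2) ..."
result = parse_groups(s)
print(result)
```

It is shorter than the regexp, but it trusts the input more: malformed tuples raise a ValueError from int() rather than being silently skipped.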
-tkc