Regexp: unexpected splitting of string in several groups
Christos TZOTZIOY Georgiou
tzot at sil-tec.gr
Mon May 31 18:28:53 EDT 2004
On 31 May 2004 04:41:11 -0700, rumours say that pit.grinja at gmx.de (Piet)
might have written:
>vartype is a simple string (varchar, tinyint, ...) which might be
>followed by a string in curved brackets. This bracketed string is
>either composed of a single number, two numbers separated by a comma,
>or a list of strings separated by a comma. After the bracketed string,
>there might be a list of further strings (separated by blanks)
>describing some more properties of the column.
>Typical examples are:
>char(30) binary
>int(10) zerofill
>float(3,2)...
>I would like to extract the vartype, the bracketed string and the
>further properties separately and thus defined the following regular
>expression:
Does this RE work for you?
import re
tre = re.compile(r"(\w+)"                             # vartype
                 r"(?:\(([\d\w]+(?:,[\d\w]+)*)\))?"   # optional bracketed string
                 r"(\s+\w+)*")                        # trailing properties
For your examples:
>>> tre.match("char(30) binary").groups()
('char', '30', ' binary')
>>> tre.match("int(10) zerofill").groups()
('int', '10', ' zerofill')
>>> tre.match("float(3,2)").groups()
('float', '3,2', None)
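One thing to keep in mind about the third group: when a capturing group sits under a repetition such as *, Python's re module reports only the last repetition of it. A minimal sketch, assuming a column type with two trailing properties (the input "int(10) unsigned zerofill" is an example made up here, not one taken from the question):

```python
import re

# Same pattern as above, rebuilt here so the snippet runs on its own.
tre = re.compile(r"(\w+)"
                 r"(?:\(([\d\w]+(?:,[\d\w]+)*)\))?"
                 r"(\s+\w+)*")

# Hypothetical input with two properties after the bracketed string.
m = tre.match("int(10) unsigned zerofill")

# The repeated third group keeps only its last repetition.
groups = m.groups()   # ('int', '10', ' zerofill')

# The overall match still spans both properties.
whole = m.group(0)    # 'int(10) unsigned zerofill'
```

So with more than one property, " unsigned" is matched but does not show up in groups().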
PS1 If you make the RE slightly more complex, you can avoid the initial
space in the third "properties" group. I also assumed that there is no
space between the "vartype" and the left parenthesis (if there is one).
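A minimal sketch of what such a slightly more complex RE could look like. The names tre2 and parse_column_type are made up for illustration, and allowing blanks before the left parenthesis is an extra assumption of this sketch, not something the question asked for:

```python
import re

# Hypothetical variant: the properties are captured as one group that
# starts at the first word, so there is no leading blank to strip.
tre2 = re.compile(
    r"(\w+)"                          # vartype
    r"(?:\s*\((\w+(?:,\w+)*)\))?"     # optional bracketed string, blanks allowed before "("
    r"(?:\s+(\w+(?:\s+\w+)*))?"       # optional tail: all properties as one string
)

def parse_column_type(text):
    """Return (vartype, bracketed, [properties]) or None if nothing matches."""
    m = tre2.match(text)
    if m is None:
        return None
    vartype, bracketed, props = m.groups()
    # Split the tail on whitespace; no tail means an empty list.
    return vartype, bracketed, props.split() if props else []

print(parse_column_type("char(30) binary"))            # ('char', '30', ['binary'])
print(parse_column_type("int(10) unsigned zerofill"))  # ('int', '10', ['unsigned', 'zerofill'])
print(parse_column_type("float(3,2)"))                 # ('float', '3,2', [])
```

Capturing the whole tail once and splitting it afterwards sidesteps the repeated-group question altogether, at the price of one extra str.split() call.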
PS2 redemo.py, somewhere in your Python installation, is a good friend
of yours.
PS3 I have been a fan of regular expressions for years, and I often
overuse them. Perhaps somebody else can give you better advice than I can.
--
TZOTZIOY, I speak England very best,
"I have a cunning plan, m'lord" --Sean Bean as Odysseus/Ulysses