splitting a string into 2 new strings
Andrew Dalke
adalke at mindspring.com
Wed Jul 2 15:17:13 EDT 2003
trp:
> I'm assuming that these are chemical compounds, so you're not limited to
> one-character symbols.
The problem is underspecified. Usually 2-character symbols (or 3-character
ones, for some elements with high atomic number, if you're not assuming the
newer IUPAC names like "Dubnium", which was also called Unnilpentium (Unp)
or, depending on your political persuasion, Joliotium (Jl) or Hahnium (Ha))
have the first letter capitalized and the rest in lower case.
> re_pat = re.compile('([A-Z]+)(\d+)')
So this should be written ([A-Z][A-Za-z]*)(\d+), where I explicitly allow
both lower and upper case trailing letters to be more accepting. (In some
systems, "CU" is "1 carbon + 1 uranium" and in others it's an alternate way
to write "1 copper". Though I suspect it's not allowed in the OP's problem.)
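Here's a minimal sketch of how that pattern behaves. The helper name
split_symbol_count and the sample inputs are made up for illustration, and
I'm assuming each input is a single symbol followed by a count, which may
not be what the OP actually has. It uses fullmatch, so it needs Python 3.4
or later.

```python
import re

# Pattern from the post: one capital letter, then any number of letters in
# either case, then one or more digits.
re_pat = re.compile(r'([A-Z][A-Za-z]*)(\d+)')

def split_symbol_count(text):
    """Split a string like 'Cu2' into (symbol, count).

    Hypothetical helper: returns None when the whole string does not match.
    """
    m = re_pat.fullmatch(text)
    if m is None:
        return None
    return m.group(1), int(m.group(2))

print(split_symbol_count("Cu2"))   # ('Cu', 2)
print(split_symbol_count("Unp1"))  # ('Unp', 1)
print(split_symbol_count("CU3"))   # ('CU', 3), accepted on purpose
print(split_symbol_count("cu3"))   # None, first letter must be a capital
```

Note that the pattern only splits the letters from the digits. It can't tell
you whether "CU" means copper or carbon + uranium; that still depends on the
convention of whatever system produced the string.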
Andrew
dalke at dalkescientific.com