Regular Expression
Marc 'BlackJack' Rintsch
bj_666 at gmx.net
Mon Oct 22 18:56:03 EDT 2007
On Mon, 22 Oct 2007 22:29:38 +0000, patrick.waldo wrote:
> I'm trying to learn regular expressions, but I am having trouble with
> this. I want to search a document that has mixed data; however, the
> last line of every entry has something like C5H4N4O3 or CH5N3.ClH.
> All of the letters are upper case and there will always be numbers and
> possibly one .
>
> However below only gave me none.
>
> […]
>
> test = re.compile('\u+\d+\.')
There is no '\u'. 'u' doesn't have a special meaning, so the '\' is
pointless. Your expression matches one or more lower case 'u's followed by
one or more digits followed by a period. Examples are 'u1.', 'uuuuuuuu42.',
etc.
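A minimal sketch of that behaviour, assuming Python 3. The pattern is written without the backslash, since in Python 3 a non-raw '\u' in a string literal starts a Unicode escape. The helper name first_match is only for illustration and is not from the original question.

```python
import re

# The pattern as described above: one or more 'u', then one or more
# digits, then a literal period. Written without the pointless backslash.
pattern = re.compile(r'u+\d+\.')


def first_match(text):
    """Return the first matching substring, or None if nothing matches."""
    match = pattern.search(text)
    return match.group() if match else None


# Strings from the explanation above and from the question.
for candidate in ['u1.', 'uuuuuuuu42.', 'C5H4N4O3', 'CH5N3.ClH']:
    print(candidate, '->', first_match(candidate))
```

The two formulas contain no 'u' at all, which is why the search came back with None.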
An expression that matches your first example would be: r'([A-Z]|\d|\.)+'.
That's a non-empty sequence of upper case letters, digits and periods. To
limit this to just one optional period the expression gets a little
longer: r'([A-Z]|\d)+\.?([A-Z]|\d)+'
This expression does not match your second example as a whole, because
there is a lower case letter (the 'l' in 'Cl') in it.
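A short sketch to try both expressions against your examples, assuming Python 3 and re.fullmatch so that the whole string has to match. The helper matches_whole and the with_lower pattern are my own illustrations: with_lower is one possible way to also accept lower case letters, not the only one.

```python
import re

# The two expressions from the explanation above.
any_periods = re.compile(r'([A-Z]|\d|\.)+')
one_period = re.compile(r'([A-Z]|\d)+\.?([A-Z]|\d)+')

# Hypothetical variant (an assumption, not from the original question):
# same shape as one_period, but lower case letters are allowed too.
with_lower = re.compile(r'[A-Za-z\d]+\.?[A-Za-z\d]+')


def matches_whole(pattern, text):
    """Return True only if the pattern matches the entire string."""
    return pattern.fullmatch(text) is not None


for formula in ['C5H4N4O3', 'CH5N3.ClH']:
    print(formula,
          matches_whole(any_periods, formula),
          matches_whole(one_period, formula),
          matches_whole(with_lower, formula))
```

Note that match() and search() would still find the leading part 'CH5N3.C' of the second example with the one_period expression, so use fullmatch() or anchor the pattern if the whole line has to be a formula.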
Ciao,
Marc 'BlackJack' Rintsch