Regular Expression

Shawn Milochik Shawn at Milochik.com
Tue Oct 23 10:42:37 EDT 2007


On 10/22/07, patrick.waldo at gmail.com <patrick.waldo at gmail.com> wrote:
> Hi,
>
> I'm trying to learn regular expressions, but I am having trouble with
> this.  I want to search a document that has mixed data; however, the
> last line of every entry has something like C5H4N4O3 or CH5N3.ClH.
> All of the letters are upper case and there will always be numbers and
> possibly one .
>
> However, the code below only gave me None.
>
> import os, codecs, re
>
> text = 'C:\\text_samples\\sample.txt'
> text = codecs.open(text,'r','utf-8')
>
> test = re.compile('\u+\d+\.')
>
> for line in text:
>     print test.search(line)


I need a little more info. How can you know whether you're matching
the text you're going for, and not other data which looks similar? Do
you have a specific field length? Is it guaranteed to contain a digit?
Is it required to start with a letter? Does it always start with 'C'?
You need to have those kinds of rules in mind to write your regex.
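
To show what I mean, here is a rough sketch built on rules I am only
guessing at: each entry's last line holds nothing but the formula, a
formula is a run of element symbols (one upper-case letter, optionally
followed by one lower-case letter, as in the "Cl" from your own example)
with optional counts, at least one digit appears somewhere, and at most
one '.' joins two such runs. The names PART, FORMULA and is_formula are
made up for this example. It uses [A-Z] for upper-case letters because
Python's re module has no \u class for that. If your data follows
different rules, the pattern has to change with them.

```python
import re

# Assumed rules (guesses, not confirmed by the original poster):
#   - element symbol: one upper-case letter plus an optional lower-case one
#   - each symbol may be followed by a count (digits)
#   - the line contains at least one digit somewhere
#   - at most one '.' joins two runs of symbols
#   - the formula is the only thing on the line
PART = r'(?:[A-Z][a-z]?\d*)+'
FORMULA = re.compile(r'^(?=\D*\d)' + PART + r'(?:\.' + PART + r')?$')

def is_formula(line):
    """Return True if the stripped line fits the assumed formula rules."""
    return FORMULA.match(line.strip()) is not None

for sample in ['C5H4N4O3', 'CH5N3.ClH', 'Some other text', 'HELLO']:
    print('%s %s' % (sample, is_formula(sample)))
```

The lookahead (?=\D*\d) is what rejects an all-letter word like HELLO,
which would otherwise pass as a string of one-letter symbols. That is
exactly the sort of look-alike data the questions above are meant to
rule out.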

Shawn


