matching exactly a 4 digit number in python

harijay harijay at gmail.com
Fri Nov 21 18:20:56 EST 2008


Thanks, John Machin and Mark Tolonen.
So I guess the correct approach is to use the word boundary metacharacter
"\b".

So r'\b\d{4}\b' is what I need, since it reads:

a 4 digit number between word boundaries
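
A minimal sketch of how that pattern could be used to filter strings. The
helper name has_four_digit_number and the sample sentences (other than the
first one, which is from the original question) are made up for illustration.

```python
import re

# Pattern from this post: exactly four digits between word boundaries.
four_digits = re.compile(r'\b\d{4}\b')

def has_four_digit_number(text):
    """Return True if text contains a standalone 4 digit number.

    Hypothetical helper, not part of the re module.
    """
    return four_digits.search(text) is not None

samples = [
    "I have 3324234 and more",   # 7 digits: should be rejected
    "I have 3324 and more",      # 4 digits: should be kept
    "call 12345 now",            # 5 digits: should be rejected
    "room 1234.",                # 4 digits followed by punctuation
]

# Keep only the strings that contain a standalone 4 digit number.
matches = [s for s in samples if has_four_digit_number(s)]
print(matches)
```

One caveat: \b treats letters and the underscore as word characters too, so
a string like "xxx1234" has no word boundary before the digits and is not
matched by this pattern.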

Thanks a tonne. This is only my second post to comp.lang.python, and I
am always amazed at how helpful everyone on this group is.

Hari

On Nov 21, 5:12 pm, John Machin <sjmac... at lexicon.net> wrote:
> On Nov 22, 8:46 am, harijay <hari... at gmail.com> wrote:
>
> > Hi
> > I am a few months new to Python. I have used regexps before in Perl
> > and Java but am a little confused by this problem.
>
> > I want to parse a number of strings and extract only those that
> > contain a 4 digit number anywhere inside a string
>
> > However the regexp
> > p = re.compile(r'\d{4}')
>
> > matches even sentences that contain numbers longer than 4 digits,
> > for example it matches "I have 3324234 and more"
>
> No it doesn't. When used with re.search on that string it matches
> 3324, it doesn't "match" the whole sentence.
>
>
>
> > I am very confused. Shouldn't the \d{4,} match exactly four digit
> > numbers, so that a sentence with a 5 digit number is not matched?
>
> {4} does NOT mean the same as {4,}.
> {4} is the same as {4,4}
> {4,} means {4,INFINITY}
>
> Ignoring {4,}:
>
> You need to specify a regex that says "4 digits followed by (non-digit
> or end-of-string)". Have a try at that and come back here if you have
> any more problems.
>
> some test data:
> xxx1234
> xxx12345
> xxx1234xxx
> xxx12345xxx
> xxx1234xxx1235xxx
> xxx12345xxx1234xxx
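
For reference, here is one possible sketch of the regex described above,
run against that test data. It assumes "exactly 4 digits" also means no
digit immediately before the run, so it adds a lookbehind to the "followed
by non-digit or end-of-string" condition; without it, "2345" would be found
inside "xxx12345". The name exact_four is just for illustration.

```python
import re

# Assumption: a 4 digit number is a run of exactly four digits, with no
# digit immediately before it (lookbehind) or after it (lookahead).
exact_four = re.compile(r'(?<!\d)\d{4}(?!\d)')

test_data = [
    "xxx1234",
    "xxx12345",
    "xxx1234xxx",
    "xxx12345xxx",
    "xxx1234xxx1235xxx",
    "xxx12345xxx1234xxx",
]

# Map each test string to the list of 4 digit numbers found in it.
results = {s: exact_four.findall(s) for s in test_data}

for s in test_data:
    print(s, results[s])
```

Unlike r'\b\d{4}\b', this version does not depend on word boundaries, so it
also finds 4 digit runs that sit directly next to letters, as in the test
data.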



