re.compile for names

Marc 'BlackJack' Rintsch bj_666 at gmx.net
Mon May 21 09:56:55 EDT 2007


In <f2s7tu$5im$1 at solaris.cc.vt.edu>, brad wrote:

> I am developing a list of 3 character strings like this:
> 
> and
> bra
> cam
> dom
> emi
> mar
> smi
> ...
> 
> The goal of the list is to have enough strings to identify files that 
> may contain the names of people. Missing a name in a file is unacceptable.

Then simply return `True` for any file that contains at least two or three
ASCII letters in a row.  Easily written as a short re.  ;-)
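
A minimal sketch of that check, assuming Python 3 and that the file contents
have already been read into a string. The names `LETTER_RUN` and
`may_contain_name` are made up for illustration:

```python
import re

# Two or more ASCII letters in a row; change {2,} to {3,} to require three.
LETTER_RUN = re.compile(r"[A-Za-z]{2,}")


def may_contain_name(text):
    """Return True if text contains a run of at least two ASCII letters."""
    return LETTER_RUN.search(text) is not None
```

Almost any file with text in it passes this test, which is the point: a check
that must never miss a name ends up flagging nearly everything.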

> I may end up with a thousand or so of these 3 character strings. Is that 
> too much for an re.compile to handle? Also, is this a bad way to 
> approach this problem? Any ideas for improvement are welcome!
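
On the size question: a plain alternation of a thousand or so short literals
should be well within what `re.compile` copes with. A rough sketch, assuming
Python 3; the `prefixes` list here is only the sample from your post, and
`PREFIX_RE` and `has_prefix` are hypothetical names:

```python
import re

# Hypothetical sample; the real list would hold the thousand or so strings.
prefixes = ["and", "bra", "cam", "dom", "emi", "mar", "smi"]

# re.escape guards against entries that contain regex metacharacters.
PREFIX_RE = re.compile("|".join(re.escape(p) for p in prefixes), re.IGNORECASE)


def has_prefix(text):
    """Return True if any listed string occurs anywhere in text."""
    return PREFIX_RE.search(text) is not None
```

Note that this matches inside ordinary words too ("random" contains both
"and" and "dom"), so the list mostly narrows nothing.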

Unless you can come up with some restrictions on the names, just follow
the advice above or give up.  I recently saw a documentary about someone
with the name "Scary Guy" in his ID papers.  And what about names with
letters outside the ASCII range?
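
If non-ASCII letters have to count as well, one possible variant, assuming
Python 3 and text that is already decoded to `str`, is the character class
`[^\W\d_]`: word characters minus digits and the underscore. The names below
are again made up for illustration:

```python
import re

# For str patterns in Python 3, \w covers Unicode word characters;
# [^\W\d_] removes digits and the underscore, leaving letters.
UNICODE_LETTER_RUN = re.compile(r"[^\W\d_]{2,}")


def may_contain_name_unicode(text):
    """Return True if text contains a run of at least two letters."""
    return UNICODE_LETTER_RUN.search(text) is not None
```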

Ciao,
	Marc 'BlackJack' Rintsch


