re Questions

Tim Chase python.list at tim.thechases.com
Sun Jan 26 14:41:41 EST 2014


On 2014-01-26 12:15, Roy Smith wrote:
> > The set [A-z] is equivalent to
> > [ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz]  
> 
> I'm inclined to suggest the regex compiler should issue a warning
> for this.
> 
> I've never seen a character range other than A-Z, a-z, or 0-9.
> Well, I suppose A-F or a-f if you're trying to match hex digits
> (and some variations on that for octal).  But, I can't imagine any
> example where somebody wrote A-z and it wasn't an error.

I'd not object to a warning on that one literal "A-z" set, but other
unusual ranges can be perfectly legit.  I've done some work with VINs¹,
where the allowable character set is A-Z and digits, minus the letters
that are hard to distinguish visually (I/O/Q), so I've used
^[A-HJ-NPR-Z0-9]{17}$ as a first-pass filter for VINs that were entered
(often scanned, but occasionally hand-keyed).  In some environments,
I've been able to intercept I/O/Q and remap them to 1/0/0 respectively
to do the disambiguation for the user.  So I'd not want to see other
character classes touched.
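
A minimal sketch of that first-pass filter plus the I/O/Q remapping,
assuming input arrives as a plain string.  The names VIN_RE, LOOKALIKES
and normalize_vin are made up for illustration, and the upper-casing and
whitespace stripping are assumptions rather than anything described
above:

```python
import re

# First-pass filter: exactly 17 characters drawn from A-Z and 0-9,
# excluding the visually ambiguous letters I, O and Q.
VIN_RE = re.compile(r'^[A-HJ-NPR-Z0-9]{17}$')

# Hypothetical remap of look-alike letters to digits: I -> 1, O -> 0, Q -> 0.
LOOKALIKES = str.maketrans('IOQ', '100')


def normalize_vin(raw):
    """Return a cleaned-up VIN candidate, or None if it fails the filter."""
    # Assumption: surrounding whitespace and letter case are not significant.
    candidate = raw.strip().upper().translate(LOOKALIKES)
    return candidate if VIN_RE.match(candidate) else None
```

This only checks the character set and length; it makes no attempt to
validate the check digit, so it remains a first-pass filter.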

-tkc

¹ http://en.wikipedia.org/wiki/Vehicle_Identification_Number






