how to avoid leading white spaces

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sat Jun 4 01:14:56 EDT 2011


On Fri, 03 Jun 2011 22:30:59 -0400, Roy Smith wrote:

> In article <4de992d7$0$29996$c3e8da3$5496439d at news.astraweb.com>,
>  Steven D'Aprano <steve+comp.lang.python at pearwood.info> wrote:
> 
>> Of course, if you include both case-sensitive and insensitive tests in
>> the same calculation, that's a good candidate for a regex... or at
>> least it would be if regexes supported that :)
> 
> Of course they support that.
> 
> r'([A-Z]+) ([a-zA-Z]+) ([a-z]+)'
> 
> matches a word in upper case followed by a word in either (or mixed)
> case, followed by a word in lower case (for some narrow definition of
> "word").

This fails to support non-ASCII letters, and you know quite well that 
having to spell out regexes by hand in both upper and lower (or mixed) 
case is not support for case-insensitive matching. That's why Python's re 
has a case-insensitive flag.
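
To make both points concrete, here is a small sketch using only the 
standard re module under Python 3. The sample strings and variable names 
are made up for illustration.

```python
import re

# The hand-spelled pattern from the quoted message: ASCII ranges only.
ascii_pattern = re.compile(r'([A-Z]+) ([a-zA-Z]+) ([a-z]+)')

# Plain ASCII input matches as described.
ascii_result = ascii_pattern.match('HELLO World again')

# '\u00c9' is LATIN CAPITAL LETTER E WITH ACUTE; it is not in [A-Z],
# so the first group cannot match and match() returns None.
accented_result = ascii_pattern.match('\u00c9COLE World again')

# With the case-insensitive flag, one range covers both cases.
word = re.compile(r'[a-z]+', re.IGNORECASE)
mixed_result = word.match('MiXeD')

print(ascii_result is not None)   # True
print(accented_result)            # None
print(mixed_result.group())       # MiXeD
```

The flag used here applies to the whole pattern, which is why it does not 
by itself give you case-sensitive and case-insensitive tests side by side.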


> Another nice thing about regexes (as compared to string methods) is that
> they're both portable and serializable.  You can use the same regex in
> Perl, Python, Ruby, PHP, etc.

Say what?

Regexes are anything but portable. Sure, if you limit yourself to some 
subset of regex syntax, you might find that many different languages and 
engines support your regex, but general regexes are not guaranteed to run 
in multiple engines.

The POSIX standard defines two different regex syntaxes (basic and 
extended); Tcl supports three; grep supports the two POSIX syntaxes, plus 
Perl syntax; Python has two (regex and re modules); Perl 5 and Perl 6 
have completely different syntaxes. Subtle differences, such as when 
hyphens in character classes count as literals, abound. See, for example:

http://www.regular-expressions.info/refflavors.html
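
As one instance of the hyphen issue, this sketch shows how Python's re 
module treats a hyphen depending on where it sits in a character class. 
It demonstrates Python's behaviour only; the sample string is invented, 
and other engines should be checked against their own documentation 
rather than assumed to agree.

```python
import re

sample = 'a-b-c'

# Hyphen between two characters: a range, equivalent to [abc].
as_range = re.findall(r'[a-c]', sample)

# Hyphen as the last character of the class: a literal hyphen.
trailing = re.findall(r'[ac-]', sample)

# Escaped hyphen: also a literal hyphen, wherever it appears.
escaped = re.findall(r'[a\-c]', sample)

print(as_range)   # ['a', 'b', 'c']
print(trailing)   # ['a', '-', '-', 'c']
print(escaped)    # ['a', '-', '-', 'c']
```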


> You can transmit them over a network
> connection to a cooperating process.  You can store them in a database
> or a config file, or allow users to enter them on the fly.

Sure, but if those sorts of things are important to you, there's no 
reason why you can't create your own string-processing language. Apart 
from the time and effort required :)


-- 
Steven


