how to avoid leading white spaces

Octavian Rasnita orasnita at gmail.com
Mon Jun 6 04:51:46 EDT 2011


It is not so hard to decide whether using RE is a good thing or not.

When speed is important and every millisecond counts, an RE should be used
only when there is no faster alternative, because an RE is usually slower
than the other core Perl/Python functions that can do matching and
replacing.
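
A minimal sketch of how one might measure this in Python, using the
leading white space problem from this thread. The names TEXT,
strip_with_re and strip_with_method are invented for the illustration, and
the timings will depend on the machine and the Python version, so no
numbers are claimed here.

```python
import re
import timeit

TEXT = "   some text with leading white space"

# Compile the pattern once so that compilation is not timed on every call.
LEADING_WS = re.compile(r"^\s+")


def strip_with_re(s):
    """Remove leading white space using a regular expression."""
    return LEADING_WS.sub("", s)


def strip_with_method(s):
    """Remove leading white space using the built-in string method."""
    return s.lstrip()


if __name__ == "__main__":
    # The two versions must agree before comparing their speed makes sense.
    assert strip_with_re(TEXT) == strip_with_method(TEXT)
    for func in (strip_with_re, strip_with_method):
        seconds = timeit.timeit(lambda: func(TEXT), number=100_000)
        print(f"{func.__name__}: {seconds:.3f} s")
```

Running it prints one timing line per function, which is enough to decide
whether the difference matters for a given program.
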

When speed is not such a big issue, an RE should be used only if it is
easier to understand and maintain than the equivalent code using core
functions. And of course, an RE should be used when the core functions
cannot do what an RE can do.

In Python, the RE syntax is not as short and simple as in Perl, so using an
RE even for very simple things requires longer code. Other core functions
may therefore look like the better solution, because the RE version of the
code is almost never as easy to read as the code that uses core functions
(or... for very simple REs, the two are probably equally readable).

In Perl, the RE syntax is very short and simple, and in some cases code that
uses an RE is easier to understand and maintain than code that uses other
core functions.

For example, if somebody wants to check if the $var variable contains the 
letter "x", a solution without RE in Perl is:

if ( index( $var, 'x' ) >= 0 ) {
    print "ok";
}

while the solution with RE is:

if ( $var =~ /x/ ) {
    print "ok";
}

And it is obvious that the solution that uses an RE is shorter and easier to
read and maintain, while also being much more flexible.
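
For comparison, a rough sketch of the same check in Python, where the
balance goes the other way and the version without an RE is the shorter
one. The function names contains_x_plain and contains_x_re are invented for
this illustration.

```python
import re


def contains_x_plain(var):
    """Check for the letter "x" with the `in` operator (no RE)."""
    return "x" in var


def contains_x_re(var):
    """Check for the letter "x" with the re module."""
    return re.search("x", var) is not None


for var in ("example", "python"):
    if contains_x_plain(var):
        print("ok (in):", var)
    if contains_x_re(var):
        print("ok (re):", var)
```
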

Of course, sometimes an even better alternative is to use a module from CPAN
like Regexp::Common, which wraps REs in a simpler and more readable
interface for matching numbers, profanity, balanced parentheses,
programming-language comments, IP and MAC addresses, zip codes... or a
module like Email::Valid for verifying that an email address is valid,
because it can be very hard to write an RE that matches an email address.
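
The same idea carries over to Python. As a sketch only (this is not
Regexp::Common, just an analogue), the standard ipaddress module can
validate an IP address without a hand-written RE. This assumes Python 3.3
or later, where that module is available; the helper name is_ip_address is
invented for the example.

```python
import ipaddress


def is_ip_address(text):
    """Return True if text parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
    except ValueError:
        # ip_address() raises ValueError for anything it cannot parse.
        return False
    return True


for candidate in ("192.168.0.1", "::1", "256.1.1.1", "not an address"):
    print(candidate, "->", is_ip_address(candidate))
```
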

So... just like with Python, there is more than one way to do it, but
depending on the situation, some of them are better than others. :-)

--Octavian

----- Original Message ----- 
From: "Chris Torek" <nospam at torek.net>
Newsgroups: comp.lang.python
To: <python-list at python.org>
Sent: Monday, June 06, 2011 10:11 AM
Subject: Re: how to avoid leading white spaces


> In article 
> <ef48ad50-da06-47a8-978a-47d6f4271e75 at d28g2000yqf.googlegroups.com>
> rurpy at yahoo.com <rurpy at yahoo.com> wrote (in part):
> [mass snippage]
>>What I mean is that I see regexes as being an extremely small,
>>highly restricted, domain specific language targeted specifically
>>at describing text patterns.  Thus they do that job better than
>>trying to describe patterns implicitly with Python code.
>
> Indeed.
>
> Kernighan has often used / supported the idea of "little languages";
> see:
>
>    http://www.princeton.edu/~hos/frs122/precis/kernighan.htm
>
> In this case, regular expressions form a "little language" that is
> quite well suited to some lexical analysis problems.  Since the
> language is (modulo various concerns) targeted at the "right level",
> as it were, it becomes easy (modulo various concerns :-) ) to
> express the desired algorithm precisely yet concisely.
>
> On the whole, this is a good thing.
>
> The trick lies in knowing when it *is* the right level, and how to
> use the language of REs.
>
>>On 06/03/2011 08:05 PM, Steven D'Aprano wrote:
>>> If regexes were more readable, as proposed by Wall, that would go
>>> a long way to reducing my suspicion of them.
>
> "Suspicion" seems like an odd term here.
>
> Still, it is true that something (whether it be use of re.VERBOSE,
> and whitespace-and-comments, or some New and Improved Syntax) could
> help.  Dense and complex REs are quite powerful, but may also contain
> and hide programming mistakes.  The ability to describe what is
> intended -- which may differ from what is written -- is useful.
>
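
(A small sketch of the re.VERBOSE idea from the quoted paragraph above. The
pattern, the name NUMBER and the group names are invented for illustration;
it matches a simple signed decimal number with optional surrounding white
space.)

```python
import re

# With re.VERBOSE, white space inside the pattern is ignored and
# "#" starts a comment, so each part can be documented in place.
NUMBER = re.compile(r"""
    ^\s*                    # optional leading white space
    (?P<sign>[-+])?         # optional sign
    (?P<whole>\d+)          # integer part
    (?:\.(?P<frac>\d+))?    # optional fractional part
    \s*$                    # optional trailing white space
""", re.VERBOSE)

match = NUMBER.match("   -12.5")
if match:
    print(match.group("sign"), match.group("whole"), match.group("frac"))
```
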
> As an interesting aside, even without the re.VERBOSE flag, one can
> build complex, yet reasonably-understandable, REs in Python, by
> breaking them into individual parts and giving them appropriate
> names.  (This is also possible in perl, although the perl syntax
> makes it less obvious, I think.)
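
(A sketch of the "named parts" approach described in the quoted paragraph
above, without re.VERBOSE. All the names and the "key = value" line format
are assumptions made up for this example.)

```python
import re

# Each piece of the pattern gets a descriptive name of its own.
LEADING_WS = r"\s*"
KEY = r"(?P<key>[A-Za-z_]\w*)"
EQUALS = r"\s*=\s*"
VALUE = r"(?P<value>\S.*?)"
TRAILING_WS = r"\s*"

# The full pattern is assembled from the named pieces at the end.
ASSIGNMENT = re.compile(
    "^" + LEADING_WS + KEY + EQUALS + VALUE + TRAILING_WS + "$"
)

match = ASSIGNMENT.match("   name = some value  ")
if match:
    print(repr(match.group("key")), repr(match.group("value")))
```

Because every piece is an ordinary string, each one can also be tested on
its own before it is combined with the others.
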
> -- 
> In-Real-Life: Chris Torek, Wind River Systems
> Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W)  +1 801 277 2603
> email: gmail (figure it out)      http://web.torek.net/torek/index.html
>

