Return value of an assignment statement?

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Fri Feb 22 09:58:54 EST 2008


On Fri, 22 Feb 2008 00:45:59 -0800, Carl Banks wrote:

> On Feb 21, 6:52 pm, Steve Holden <st... at holdenweb.com> wrote:
>> mrstephengross wrote:
>> >> What you can't do (that I really miss) is have a tree of
>> >> assign-and-test expressions:
>> >>         import re
>> >>         pat = re.compile('some pattern')
>> >>         if m = pat.match(some_string):
>> >>             do_something(m)
>>
>> > Yep, this is exactly what I am (was) trying to do. Oh well.... Any
>> > clever ideas on this front?
>>
>> The syntax is the way it is precisely to discourage that kind of clever
>> idea.
> 
> Don't be ridiculous.  Assignment operators are maybe one of the worst
> things in existence, but this particular use case (running a sequence of
> tests like the above) is perfectly useful and good.

I don't understand your reasoning. If assignment operators are so 
terrible, why do you think the terribleness disappears in this specific 
case?

The above idiom leads to one of the most common errors in C code: writing 
= when you mean ==. "Running a sequence of tests" isn't immune to that 
problem; if anything, it's especially vulnerable to it.

Compare the suggested pseudo-Python code:

pat = re.compile('some pattern')
if m = pat.match(some_string):
    do_something(m)


with the actual Python code:

pat = re.compile('some pattern')
m = pat.match(some_string)
if m:
    do_something(m)


The difference is exactly one newline plus one extra reference to the 
name "m". And this is a problem?
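
For anyone who wants to run the second form, here is a minimal 
self-contained sketch. The pattern, the sample string and the body of 
do_something are placeholders made up for illustration; they are not from 
the original question.

```python
import re

def do_something(m):
    # Hypothetical handler: just hand back the text that matched.
    return m.group(0)

pat = re.compile(r"\d+")           # placeholder pattern (assumption)
some_string = "42 is the answer"   # placeholder input (assumption)

m = pat.match(some_string)   # the assignment is a statement on its own line
result = None
if m:                        # the test is a separate, explicit step
    result = do_something(m)
```

Because the assignment and the test sit on separate lines, there is no 
place where = could be silently typed instead of ==.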



> Some Pythonistas will swear to their grave and back that should be done
> by factoring out the tests into a list and iterating over it, and NO
> OTHER WAY WHATSOEVER, but I don't buy it.

Well, putting a sequence of tests into a list is the natural way to deal 
with a sequence of tests. What else would you do? 


> That's a lot of boilerplate 

What boilerplate are you talking about?



> --the very thing Python is normally so good at minimizing--
> when it might not be needed.  It would be the right thing for a complex,
> pluggable, customizable input filter; but is rarely a better solution
> for a simple text processing script.

Huh?


> Quick, at a glance, which code snippet will you understand faster
> (pretend you know Perl):
> 
> 
> if (/name=(.*)/) {
>     $name = chop(\1);
> } elsif (/id=(.*)/) {
>     $id = chop(\1);
> } elsif (/phone=(.*)/) {
>     $phone = chop(\1);
> }
> 
> 
> vs.
> 
> 
> def set_phone_number(m):
>     phone = m.group(1).strip()
> 
> def set_id(m):
>     id = m.group(1).strip()
> 
> def set_name(m):
>     name = m.group(1).strip()
> 
> _line_tests = [
>     (r"phone=(.*)", set_phone_number),
>     (r"name=(.*)", set_name),
>     (r"id=(.*)", set_id),
>     ]
> 
> for pattern,func in _line_tests:
>     m = re.match(pattern,line)
>     if m:
>         func(m)
> 
> 
> At this small scale, and probably at much larger scales too, the Perl
> example blows the Python example out of the water in terms of
> readability.  And that's counting Perl's inherent unreadableness.


Why would you do that test in such an overblown fashion, then try to 
pretend it is an apples-and-apples comparison with the Perl code? It 
doesn't even work: you have three functions that set a local name, then 
throw it away when they return.
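
To see the problem concretely, here is a small sketch using one of the 
quoted functions. The sample line "name= Fred " is an assumption for 
illustration only.

```python
import re

def set_name(m):
    # Binds a *local* variable called name; it is discarded on return.
    name = m.group(1).strip()

name = None                                 # module-level name
m = re.match(r"name=(.*)", "name= Fred ")   # sample input (assumption)
set_name(m)

# The module-level name was never touched by set_name().
still_unset = name is None
```

After the call, the module-level name is still None; the stripped value 
only ever existed inside the function.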

Pretending I understand Perl, here's a fairer, more direct translation of 
the Perl code:


name, id, phone = [None]*3  # Closest thing to an unset variable in Perl.
name = re.match(r"name=(.*)", line)
if name: name = name.group(1).strip()
else:
    id = re.match(r"id=(.*)", line)
    if id: id = id.group(1).strip()
    else:
        phone = re.match(r"phone=(.*)", line)
        if phone: phone = phone.group(1).strip()

Six lines for Perl against nine for Python, eight if you dump the "unset" 
line. Hardly a big difference.

The main difference is that Python's handling of regexes is a little more 
verbose, and that the indentation is compulsory. But here's a better way 
to do the same test:

tests = [ (r"name=(.*)", 'name'), 
    (r"id=(.*)", 'id'), (r"phone=(.*)", 'phone')]
for (test, key) in tests:
    m = re.match(test, line)
    if m:
        globals()[key] = m.group(1).strip()
        break

Down to seven lines, or six if I hadn't split the tests over two 
lines.



Here's an even better way:

tests = [ "name", "id", "phone"]
for t in tests:
    m = re.match(t + r"=(.*)", line)
    if m:
        globals()[t] = m.group(1).strip()
        break

Six lines for Perl, six for Python, and the Python version is far more 
readable.
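
If writing into globals() makes you uneasy, the same loop can collect its 
results in a dict instead. This is a variation for illustration, not what 
the snippets above do: the function name parse_line, the dict called 
record and the sample lines are all made up.

```python
import re

def parse_line(line, keys=("name", "id", "phone")):
    """Return (key, value) for the first key that matches, else None."""
    for key in keys:
        m = re.match(key + r"=(.*)", line)
        if m:
            return key, m.group(1).strip()
    return None

record = {}
# Sample input lines (assumption); the last one matches none of the keys.
for line in ["name= Fred ", "id=42", "colour=red"]:
    found = parse_line(line)
    if found:
        key, value = found
        record[key] = value
```

The structure is the same as before: a sequence of tests, a loop, and an 
early exit on the first match. Only the destination of the result changes.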

Perl's treatment of assignment as an operator tempts the programmer to 
write quick-and-dirty code. Python discourages that sort of behaviour, 
and encourages programmers to factor their code in a structured way. 
That's a feature, not a bug.



-- 
Steven


