2to3 refactoring [was Re: Tuple parameter unpacking in 3.x]

Sat Oct 11 03:23:41 EDT 2008

On Sun, 05 Oct 2008 17:15:27 +0200, Peter Otten wrote:

> Steven D'Aprano wrote:
> 
>> PEP 3113 offers the following recommendation for refactoring tuple
>> arguments:
>> 
>> def fxn((a, (b, c))):
>>     pass
>> 
>> will be translated into:
>> 
>> def fxn(a_b_c):
>>     (a, (b, c)) = a_b_c
>>     pass
>> 
>> and similar renaming for lambdas.
>> http://www.python.org/dev/peps/pep-3113/
>> 
>> 
>> I'd like to suggest that this naming convention clashes with a very
>> common naming convention, lower_case_with_underscores. That's easy
>> enough to see if you replace the arguments a, b, c above with something
>> more realistic:
>> 
>> def function(vocab_list, (result, flag), max_value)
>> 
>> becomes:
>> 
>> def function(vocab_list, result_flag, max_value)
>> 
>> Function annotations may help here, but not everyone is going to use
>> them in the same way, or even in a way that is useful, and the 2to3
>> tool doesn't add annotations.
>> 
>> It's probably impossible to avoid all naming convention clashes, but
>> I'd like to suggest an alternative which distinguishes between a
>> renamed tuple and an argument name with two words:
>> 
>> def function(vocab_list, (result, flag), max_value):
>>     pass
>> 
>> becomes:
>> 
>> def function(vocab_list, t__result_flag, max_value):
>>     result, flag = t__result_flag
>>     pass
>> 
>> The 't__' prefix clearly marks the tuple argument as different from the
>> others. The use of a double underscore is unusual in naming
>> conventions, and thus less likely to clash with other conventions.
>> Python users are already trained to distinguish single and double
>> underscores. And while it's three characters longer than the current
>> 2to3 behaviour, the length compares favorably with the original tuple
>> form:
>> 
>> t__result_flag
>> (result, flag)
> 
> Let's see what the conversion tool does:
> 
> $ cat tmp.py
> g = lambda (a, b): a*b + a_b
> $ 2to3 tmp.py
> RefactoringTool: Skipping implicit fixer: buffer
> RefactoringTool: Skipping implicit fixer: idioms
> RefactoringTool: Skipping implicit fixer: ws_comma
> --- tmp.py (original)
> +++ tmp.py (refactored)
> @@ -1,1 +1,1 @@
> -g = lambda (a, b): a*b + a_b
> +g = lambda a_b1: a_b1[0]*a_b1[1] + a_b
> RefactoringTool: Files that need to be modified:
> RefactoringTool: tmp.py
> 
> So the current strategy is to add a numerical suffix if a name clash
> occurs. The fixer clearly isn't in its final state, since for functions
> (as opposed to lambdas) it uses xxx_todo_changeme.
> 
>> What do people think? Is it worth taking this to the python-dev list?
> 
> I suppose that actual clashes will be rare. If there is no clash a_b is
> the best name and I prefer trying it before anything else. I don't
> particularly care about what the fallback should be except that I think
> it should stand out a bit more than the current numerical suffix.
> xxx1_a_b, xxx2_a_b,... maybe?

Possibly you have misunderstood me. I'm not concerned with a clash 
between names, as in the following:

lambda a_b, (a, b): 
maps to -> lambda a_b, a_b: 

as I too expect they will be rare, and best handled by whatever mechanism 
the fixer uses to fix any other naming clash.
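
To make that kind of clash concrete, here is a small Python 3 sketch. It 
is only an illustration: the names naive, fixed and clash_detected are my 
own, and the a_b1 suffix simply mirrors the 2to3 output Peter quoted 
rather than anything the fixer is guaranteed to produce.

```python
# The naive rename gives two parameters called a_b, which Python rejects
# at compile time as a duplicate argument.
naive = "lambda a_b, a_b: a_b"
try:
    compile(naive, "<naive>", "eval")
    clash_detected = False
except SyntaxError:
    clash_detected = True

# With a numeric suffix on the renamed tuple, the two names stay distinct.
fixed = lambda a_b, a_b1: a_b + a_b1[0] * a_b1[1]
result = fixed(10, (2, 3))
```

So a genuine name clash is at least loud: the rewritten code will not 
compile until one of the names is changed.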

I am talking about a clash between *conventions*, where there could be 
many argument names of the form a_b which are not intended to be two-item 
tuples. 

In Python 2.x, when you see the function signature

def spam(x, (a, b))

it is clear and obvious that you have to pass a two-item tuple as the 
second argument. But after rewriting it to spam(x, a_b) there is no such 
help. There is no convention in Python that says "when you see a function 
argument of the form a_b, you need to pass two items" (nor should there 
be).
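
A rough Python 3 sketch of what I mean follows. The function bodies, the 
second function eggs and its max_value argument are made up for the sake 
of the example; only the spam(x, a_b) signature comes from the rewrite 
discussed above.

```python
import inspect

def spam(x, a_b):
    # PEP 3113-style rewrite of spam(x, (a, b)): unpack inside the body.
    a, b = a_b
    return x + a + b

def eggs(x, max_value):
    # An ordinary two-word argument name; no tuple is expected here.
    return x + max_value

# Introspection sees two plain names in both cases, so nothing in the
# signature marks a_b as "pass a two-item tuple".
spam_params = list(inspect.signature(spam).parameters)
eggs_params = list(inspect.signature(eggs).parameters)

# The mistake only surfaces when the body tries to unpack a non-tuple.
try:
    spam(1, 2)
    error = None
except TypeError as exc:
    error = exc
```

Both signatures look alike to a reader and to help(), and the caller who 
guesses wrong finds out from a TypeError inside the function rather than 
from the signature.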

But given the deafening silence on this question, clearly other people 
don't care much about misleading argument names.

*wink*

-- 
Steven