Ideas for Python 3

Tue Apr 27 19:39:16 EDT 2004

Mike, thanks for a very thorough and thoughtful review of this
proposal.

On Mon, 26 Apr 2004 22:00:50 -0400, "Mike C. Fletcher"
<mcfletch at rogers.com> wrote:

>David MacQuigg wrote:
>...
>
>>All methods look like functions (which students already understand).
>>  
>>
>I think it might be more proper to say that you've made all functions 
>methods with an implicitly defined target, but I suppose the statement 
>is still technically true.

A function with no instance variables is identical to a function
defined outside a class.  In that case, the only way to tell if it is
a method or a function is to look at the surrounding code.  "Method"
is one of those mystery words that tend to put off non-CIS students.
I use it when it is necessary to distinguish a method from a function,
but otherwise I prefer the term function.  It emphasizes the
similarity to what the students already know.

>>Benefits of Proposed Syntax
>>===========================
>>-- Unification of all function forms ( bound, unbound, static, class,
>>lambda ).  All will have the same form as a normal function
>>definition.  This will make it easier to teach OOP.  Students will
>>already understand functions and modules.  OOP is a small step up.  A
>>prototype will look just like a module ( except for the instance
>>variables ).  See Parallels between Prototypes and Modules below.
>>  
>>
>This is nice.  Not "I'm going to rush out to adopt a language because of 
>it" nice, but nice enough.  I'm curious about one thing:
>
>    proto x( object ):
>        flog :( x, y ):
>           .x = x
>    a = x()
>    b = x()
>    a.flog = b.flog
>    a.flog()
>    print b.x
>
>In other words, how do I hold a reference to a bound method/function if 
>there are no such things and only the "last access" determines what the 
>implicit target is?  Just to be clear, I'm assuming you're going to have 
>storage *somewhere* so that:
>
>    a = module.do
>    a()
>
>works.

Bound and unbound functions work just like in Python.  This is where I
differ with Prothon, on the need for special binding syntax.
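
As a minimal sketch of what "just like in Python" means for the example
above, here is the same experiment in ordinary Python.  The class X and
the default argument values are stand-ins for illustration only; they
are not part of the proposal.

```python
class X:
    def flog(self, x=1, y=2):
        self.x = x

a = X()
b = X()

# b.flog is a bound method: it carries its own reference to b.
a.flog = b.flog
a.flog(10)

# The call went to b, not a, because the target was fixed when
# b.flog was looked up, not when a.flog was called.
result_b = b.x
a_has_x = hasattr(a, "x")
print(result_b, a_has_x)
```

The same storage is what makes a = module.do followed by a() work: the
reference keeps whatever it was bound to at lookup time.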

>>-- Using an explicit __self__ variable avoids the magic first
>>argument, and makes it easier to explain instance variables.  See the
>>sections below comparing a brief explanation of instance variables in
>>Python vs the simplified form.  A full presentation of OOP, like pages
>>295-390 in Learning Python, 2nd ed. will likely be 1/2 the number of
>>pages.  Not only is the basic presentation simpler, but we can
>>eliminate a lot of discussion of lambda functions, static methods,
>>etc.
>>  
>>
>This is a wash IMO, with the explicit "self" having a slight edge on 
>"Explicit is better than Implicit" grounds.  You now have to explain 
>where the magic __self__ comes from instead of how self is bound when 
>you access the instance's method.  They're both magic, the Python stuff 
>is just explicitly visible.  Still, since you're coding it deep into 
>this new language, it'll be first nature to the Whateverthon programmer.
>
>On a personal note, the moment where I "got" the concept of methods 
>(Python was my first OO language) was seeing "self" in the argument list 
>of a function and realising that it's just a parameter curried into the 
>function by doing x.method lookup.  That is, it just looked like any 
>other function, the parameter was just a parameter, nothing special, 
>nothing requiring any extra knowledge save how it got bound (and that's 
>pretty darn simple).  Coming from a structures+functions background it 
>made complete sense.

I assume no background other than what the students will know from
studying Python up to the point of introducing OOP.  At this point,
they have a good understanding of functions and global variables.

I've seen a lot of discussion on the "explicitness" of self in Python,
and I have to conclude that most of it is missing the real problem,
which is complexity from the fact that some functions have a special
first argument and others don't.  It is hard to compare alternatives
by focusing our microscope on something as small as setting a global
variable vs inserting a special first argument.

What I would do in comparing complexity is look at the length of basic
but complete "textbook explanations" of the alternatives.  I've copied
at the end of this post the explanation of instance variables from my
OOP chapter at http://ece.arizona.edu/~edatools/Python/  I've also
made my best effort to write an equivalent explanation of Python's
instance variables.  Comments are welcome.  Also, if anyone can write
a better explanation for Python's syntax, please post it.

>>-- All attributes of a prototype ( both data and functions ) will be
>>in a neat column, making it easier to find a particular attribute when
>>visually scanning a program.  Understanding the structure of a program
>>will be almost as quick as seeing a UML diagram.
>>  
>>
>Can't say I find it particularly compelling as an argument, not if 
>introducing punctuation-itis is the cost, anyway.  Most people I know 
>use syntax colouring editors, after all.

Do we want to assume syntax coloring is the norm?  This will make a
difference in the readability of :( ):  I use IDLE for my Python
editor, and I love it, so maybe I'm just taking syntax coloring for
granted.

For the function def syntax, we should consider alternative forms,
depending on how much clarity or compactness we want.  Any of these
would be acceptable:

func1 = function( x, y ):
func1 = func( x, y ):
func1 = def( x, y ):
func1 = :( x, y ):
func1 = : x, y :
:x,y:

The last form would be used where we now have lambda x,y:

It seems like the choice of symbols and keywords here is a matter of
personal preference.  The only objective criterion I have is that the
short form should be as short as possible and reasonably close to the
normal form.  Verbosity is one of the reasons I don't use lambda.

>>-- Lambda keyword will be gone.  An anonymous function using normal
>>function syntax can be extremely compact. ( :x,y:x+y )
>>  
>>
>That particular example almost screams "don't do this", doesn't it?  
>:(x,y): x+y  I can see as an improvement, but yawn, really.  Making 
>function definitions expressions rather than statements would have the 
>same effect.  By the way, how do you know when your lambda is finished?  
>I gather the ()s are required if using as an expression?

It took me a long time to realize that lambdas have only one advantage
over named functions - they can be crammed into a tight space.  Here
is an example:

L = [(lambda x: x**2), (lambda x: x**3), (lambda x: x**4), (lambda x: x**5)]

If the purpose is to save space, wouldn't this be better as:

L = [:x:x**2, :x:x**3, :x:x**4, :x:x**5]

I'm assuming the parens are optional.  Is there a parsing problem I'm
not seeing?  I would add parens simply because I like the appearance
of func1 = :( x, y ):

I would also be happy with just deprecating lambdas entirely.
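
For comparison, here is a small sketch of what the list looks like in
today's Python, with lambdas and with named functions.  The names
power2 through power5 are made up for this illustration.

```python
# Today's Python: four anonymous functions packed into one list.
L = [(lambda x: x**2), (lambda x: x**3), (lambda x: x**4), (lambda x: x**5)]

# The same list built from named functions (names are illustrative).
def power2(x): return x**2
def power3(x): return x**3
def power4(x): return x**4
def power5(x): return x**5

named = [power2, power3, power4, power5]

# Both lists hold ordinary callables and give the same results.
from_lambdas = [f(2) for f in L]
from_named = [f(2) for f in named]
print(from_lambdas)
print(from_named)
```

The named version costs four extra lines, which is the whole of the
space advantage under discussion.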

>>-- Method definitions will be less cluttered and less typing with
>>__self__ as a hidden variable.
>>  
>>
>I personally prefer explicit to implicit, but I know there's lots of 
>people who are big into saving a few keystrokes.

See discussion above on explicitness.  I'm not seeing any advantage in
the explicitness of self.something over .something -- *provided* that
the leading dot is not used for any other purpose than an abbreviation
for __self__.

Keystrokes are not a big issue for me either, but in this case, where
we have such frequent use of the syntax, I can see where it would be
significant.

>>-- Changing numerous attributes of an instance will be more
>>convenient. ( need use case )
>>  
>>
>That's nice, but honestly, if you're doing a lot of this in cases 
>trivial enough to warrant the addition you should likely be refactoring 
>with a domain-modelling system anyway.  Still, if you modify the with to 
>work something like this:
>
>    with x:
>       .this = 32
>       .that = 43
>       temp = 'this'*repeat
>       .something = temp[:55]
>
>i.e. to just alter the implicit target of the block, not force all 
>variables to be assigned to the object, it seems a nice enough feature.

I'm not sure I understand your use of the leading dots above.  Do I
assume that .something gets attached to x, and temp is discarded when
the with block is finished?  This will conflict with the use of
leading dots as an abbreviation for __self__.  Why do we care about
temp variables here?  If it really matters, wouldn't it be easier to
just del x.temp when we are done?

>>Pro2:  Replace lambdas with standard function syntax.
>>
>>Con2:  ???
>>  
>>
>Fine, but no need to redefine the spelling for that save to make the 
>definition itself an expression that returns the function as a value and 
>allows one to drop the name.  i.e. a = def ( y,z ): y+z  would work just 
>as well if you could assign the result to a variable and figured out how 
>you wanted to handle the indentation-continuation thing to know when the 
>function ended.

The tradeoff is compactness vs preference for a keyword over a symbol.
I don't see any objective criteria, except that the lambda syntax
should be similar to the normal syntax.

>>Explicit __self__
>>
>>Pro1:  Allows the unification of methods and functions.
>>
>>Con1:  ???
>>  
>>
>Is hidden (implicit) magic that requires the user to learn rules as to 
>what the target is when treating functions/methods as first-class 
>objects.  Not a big deal, really.
>
>>Pro2:  Explanation of instance variables is simpler.
>>
>>Con2:  Using __self__ instead of a special first argument is less
>>explicit.
>>  
>Um, can't say I see this as a huge pedagogical win.  A function either 
>takes an argument self and can set attributes of the object, or a 
>function has access to a magical "global" __self__ on which it can set 
>attributes.  I'll agree that it's nice having the same concept for 
>module and class variables, but seeing that as a huge win assumes, I 
>think, that those being taught are coming from a "globals and functions" 
>background rather than a structures and functions background.  One type 
>is accustomed to altering their execution environment, the other to 
>altering solely those things which are passed into the function as 
>parameters.

I am assuming no background at all other than Python up to the point
where we introduce OOP.  At that point, students will understand both
global variables and functions.  I measure simplicity by how much text
it takes to provide a basic explanation. See the samples at the end of
this post.

>>Pro3:  Less typing and less clutter in method definitions.
>>
>>Con3:  Can use "s" or "_" instead of "self" to minimize typing and
>>clutter.
>>  
>>
>That's a counter, not a con.  Similarly "Explicit is better than 
>Implicit" is only a counter, not a con.  A con would be: "presence of 
>variable of implicit origin" or "too much punctuation".  Don't think 
>either is a huge concern.
>
>>"Assignment" Syntax for Function Definitions
>>
>>Pro1:  See all the variables at a glance in one column.
>>
>>Con1:  ???
>>  
>>
>Doesn't seem a particularly strong pro.  IOW seems pretty minimal in 
>benefit.  As for a con, the eye, particularly in a syntax-colouring 
>editor picks out keywords very well, while punctuation tends to blur 
>into other punctuation.
>
>>Pro2:  Emphasize the similarity between data and functions as
>>attributes of an object.
>>
>>Con2:  ???
>>  
>>
>I see the pro, seems approx. the same to me.
>
>>With Block
>>
>>Pro:  Saves typing the object name on each line.
>>
>>Con:  Making it too easy to modify prototypes after they have been
>>created will lead to more undisciplined programming.
>>  
>>
>As specified, makes it only useful for trivial assignments.  If you're 
>going to all the trouble of introducing .x notation to save keystrokes, 
>why not simply have with alter __self__ for the block so you can still 
>distinguish between temporary and instance variables?

I'm using __self__ exclusively for the bind object in a method call.  

My biggest concern is not wanting to make leading dots the norm on
every variable assignment in a prototype definition.  If we are going
to highlight the instance variables, and say to students "This is the
key difference between what you already know (modules) and what you
are going to learn next (prototypes)", then I don't want every other
variable in the prototype definition to look just like the instance
variables.

I've included the with blocks in my proposal to please the Prothon
folks, but unless someone can come up with a use case, they are not
worth the confusion they are causing.

>In the final analysis, this really seems like about 3 separate proposals:
>
>    * I like the .x notation's universal applicability, it does seem
>      simple and elegant from a certain point of view
>          o I don't like the implicit __self__, but that's an integral
>            part of the proposal, so a wash

If we can come up with an alternative that doesn't require multiple
function forms, I would like to consider it.

>          o I'd want clarification of how to store a reference to
>            another object's (bound) method (which is *extremely* common
>            in Python code for storing, e.g. callbacks)

bf = cat1.func  # where cat1 is an instance not a prototype.

>    * I really dislike the :( ): function definition notation,
>      "Readability Counts".  Why clutter the proposal with that?

It looks good to me, but I'm probably not the best judge of
aesthetics.  I'll collect some other opinions on this. If enough
people prefer  def ( ):  ( or def : for the lambda form), I'll change
the proposal.

Does it make a difference in your preference that the parens are
optional?  I would use them to enhance readability on normal
functions, but leave them out on lambdas.

>    * I'm neutral on the with: stuff, I'd much prefer a real block
>      mechanism similar to Ruby with (if we're using implicit targets),
>      the ability to specify the .x target for the block

I've never understood the advantage of Ruby code blocks over Python
functions, but that is a separate discussion.

>So, the .x notation seems like it would be nice enough, but nothing else 
>really makes me jump up and down for it...
>
>That said, I'd probably be willing to use a language that was running on 
>the PythonVM with a parser/compiler that supported the syntax.  I'd be 
>totally uninterested in automated translation of Python code to the new 
>form.  That's the kind of thing that can be handled by running on the 
>same VM just as easily as anything else and you then avoid lots of 
>migration headaches.

I don't understand.  If you need to use modules written in Python 2,
you would need at least some kind of wrapper to make the calls look
like Python 3.  It seems like any changes that are not "backward
compatible" with Python 2 will need to be at least "migratable" from
earlier versions, using some automatic translator.  That is the major
constraint I have assumed in thinking about new syntax.  Is this not a
vital requirement?

>So, just as a marketing data-point; I'm not convinced that this is 
>markedly superior, but I'd be willing to try a language that differed 
>from Python in just the .x aspects to see whether it was worthwhile. 

Thanks again for your time and effort.

-- Dave

Explanation of Instance Variables in Python
===========================================
"""  Some of the variables inside the functions in a class have a
self. prefix.  This is to distinguish local variables in the function
from "instance variables".  These instance variables will be found
when the function is called, by searching the instance which called
the function.  The way this works is that calling the function from an
instance causes that instance to be passed as the first argument to
the function call.  So if you call cat1.talk(), that is equivalent to
Cat.talk(cat1).  If you call cat1.set_vars("Garfield", "Meow"), that is
equivalent to Cat.set_vars(cat1, "Garfield", "Meow").

The "current instance" argument is auto-magically inserted as the
first argument, ahead of any other arguments that you may provide in
calling a method that is "bound" to an instance.  Note: The
distinction between instances and classes is important here.  If you
call a function from a class, that function is not bound to any
instance, and you have to supply the instance explicitly in the first
argument ( Cat.talk(cat1) ).

The variable name self is just a convention.  As long as you put the
same name in the first argument as in the body of the definition, it
can be self or s or even _.  The single underscore is handy if you
want to maximally suppress clutter.  """
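
The explanation above can be checked with a short example.  The Cat
class here is a minimal sketch; the method bodies and the message
format are assumptions made for illustration.

```python
class Cat:
    def set_vars(self, name, sound):
        self.name = name
        self.sound = sound

    # The first argument may have any name; "_" works as well as "self".
    def talk(_):
        return "My name is %s ... %s" % (_.name, _.sound)

cat1 = Cat()

# Called from the instance: cat1 is inserted as the first argument.
cat1.set_vars("Garfield", "Meow")
bound_result = cat1.talk()

# Called from the class: the instance must be supplied explicitly.
unbound_result = Cat.talk(cat1)

print(bound_result)
print(unbound_result)
```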

Explanation of Simplified Instance Variables
============================================
""" Some of the variables inside the functions in a prototype have a
leading dot.  This is to distinguish local variables in the function
from "instance variables".  When a function is called from an instance
( cat1.talk() ), a special global variable __self__ is automatically
assigned to that instance ( __self__ = cat1 ).  Then, when the function
needs an instance variable ( .sound ), it uses __self__ just as if you
had typed it in front of the dot ( __self__.sound ).  The leading dot
is just an abbreviation to avoid typing __self__ everywhere.  """
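
The proposed syntax does not exist in any interpreter, so the mechanism
can only be roughly emulated in ordinary Python.  In this sketch a
module-level variable current_self stands in for the proposed __self__,
and a helper named call plays the part of the interpreter.  Both names,
and the bare Instance class, are hypothetical and used only for
illustration.

```python
# Rough emulation only: current_self stands in for the proposed
# global __self__, and call() does what the interpreter would do.
current_self = None

class Instance:
    """Bare object that just holds attributes."""

def talk():
    # In the proposed syntax this body would read:  return .sound
    return current_self.sound

def call(instance, func, *args):
    """Point the stand-in __self__ at instance, then call func."""
    global current_self
    previous = current_self
    current_self = instance
    try:
        return func(*args)
    finally:
        # Restore the earlier target so nested calls do not leak.
        current_self = previous

cat1 = Instance()
cat1.sound = "Meow"

said = call(cat1, talk)   # plays the role of cat1.talk()
print(said)
```

Note that talk is defined exactly like a function outside a class; only
the caller decides which instance it operates on.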