More random python observations from a perl programmer

Neil Schemenauer nascheme at ucalgary.ca
Sat Aug 21 01:41:47 EDT 1999


As no one seems to have written a reply to all of Tom's points, I
will make an effort.  Some of this information has already been
presented in other posts.  Hopefully this will be an accurate
summary.

Tom Christiansen <tchrist at mox.perl.com> wrote:
>GOTCHA: (medium)
>    The whole mutability versus immutability thing causes many 
>    confusions.  For example:
>	t = ([1,2],3)
>	t[0][0] = "fred"	# legal
>	t[1]    = "fred"	# ILLEGAL
>    I don't understand why tuples weren't just lists that weren't
>    somehow marked "read-only" or "constant".   That way you
>    could do the same with dictionaries or any other data types.

IMHO, this is one of the most important ideas to get straight
about Python.  All variables in Python are references.  For
objects that are immutable the semantics are the same as if
variables hold values.  This is good as no one wants the number
one modified when they aren't looking. :)

You have to be careful when modifying mutable objects.  Sometimes
you need to make a copy.  The copy module can do this for you.
If you are copying a list the "list[:]" idiom works just fine.  I
find I never use the copy module.

It is not too hard to remember what is mutable and what is not.
Basically everything is mutable except numbers, strings, and
tuples.  Tuples are immutable so they can be used as keys in
dictionaries.  Numbers and strings are immutable for performance
reasons.  Check the FAQ for more details.
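
As a small sketch of the difference between sharing and copying a
mutable object (the variable names here are made up for
illustration):

```python
import copy

a = [1, 2, 3]
alias = a              # a second name for the same list object
sliced = a[:]          # a new list holding the same elements
copied = copy.copy(a)  # same effect, via the copy module

a[0] = "fred"          # mutate the list through the first name

# The alias sees the change; the two copies do not.
alias_changed = (alias[0] == "fred")
copies_unchanged = (sliced[0] == 1 and copied[0] == 1)
```

Only the plain assignment shares the list; both the slice and
copy.copy() produce an independent (shallow) copy.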

>GOTCHA: (low)
>    You can't use "in" on dicts.  Instead, you must use 
>	d = { "fred":"wilma", "barney":"betty" }
>	if d.has_key("fred"):	# legal
>	if "fred" in d:	        # ILLEGAL
>    I don't understand why this was done.

In the Python philosophy, every expression should have an obvious
meaning.  "in" in this case does not.  Perl has a different
philosophy here.

>GOTCHA: (low)
>    I don't understand why we have so many C operators, such as "&" and
>    "|" and "~", but we don't have "&&" or "||" or "!"  We also have "<<"
>    and ">>".  It's bizarre that the relational and "?:" are missing.
>    Python uses so much from C to make people see what they expect to see,
>    but this is missing.  Why?

I guess Python is not trying to look as much like C as possible
but instead be as readable as possible.  It does, however, borrow
C syntax.  I believe that Modula 3 has had a large impact on the
design of Python.

>GOTCHA: (high)
>    Local variables are never declared.  They just *happen*.  If a
>    variable is assigned to, then it belongs to that "scope".  If it's
>    only looked at, it may be from the current global scope.

The rule for determining if something is local is quite simple.
If a variable is bound anywhere in a function then it is local
unless declared global by the "global" statement.  Referencing a
local variable before it was bound used to give a strange
exception.  This has been fixed in the CVS version of Python.

>GOTCHA: (medium)
>    If you have a function that expects two arguments, such as:
>	def fn(x,y):
>	    return x + y
>    and you have a tuple of the two elements:
>	t = (1,2)
>    You can't just call that function:
>	print fn(t)
>    Or you get a "TypeError: not enough arguments; expected 2, got 1"
>    error.  You have to use this apply() function to get around this
>    issue:
>	print apply(fn,t)
>	3

I think most people feel this is a feature.  Python does not
generally do things implicitly.  A different philosophy than Perl.

>GOTCHA: (medium) 
>    Assignment is a statement, not an operator.  That means that
>    people are constantly writing loops that aren't really testing
>    the exit condition, as in:
>	while 1:
>	    line = os.readline()
>	    if not line: break
>	    .....
>    Because they can't write:
>	while (line = os.readline()):	# ILLEGAL
>    I feel that this hides the real test condition in a place
>    it doesn't belong.

On the up side, because assignment is a statement you can bind
multiple names at once:

	x = y = z = 0.0

>GOTCHA: (low) 
>    There are no compound assignment operators, like +=.

Again, the semantics of such statements are not immediately
obvious, so Python avoids them.  They may eventually be added to
the language, though.

>GOTCHA: (low)
>    There are no loop labels, and therefore "break" and "continue" are
>    only through the next level.  This encourages the proliferation of
>    spurious boolean condition variables.  It was annoying when Pascal
>    made you do the same thing.  There is no "goto", which is how C
>    works around it.  As with many things in Python, here you force the
>    user to be tangled up with exceptions just to do very simple things.
>    That's not as clear a solution.

I have to agree with you on this one.  Using exceptions in this
case is not a Python-like solution.  In my experience I haven't
wished for this too often.

>GOTCHA: (high)
>    I had hoped that having *only* reference semantics would spare the beginner
>    from having to keep in mind the difference between a reference and its
>    referent as we have in C or Perl.  Alas, it is not to be.
>	 x = [1,2,3]
>	 y = x
>	 x[1] = "fred"
>	 print y
>	[1,"fred",3]

As I mentioned earlier, everything is a reference.  You just have
to know what is mutable and what is not.  Lists, obviously from
your example, are.


>COOLNESS:
>    You can "foreach" across multiple items.
>	a = [ [1,2], [3,4], [5,6] ]
>	for (i,j) in a:
>	    print i+j
>      3
>      7
>      11

This is called tuple unpacking.  It is definitely cool.  In the
newer versions of Python unpacking works on other sequences as
well.  For example:

	x, y, z = (1, 2, 3)
	a, b, c = [1, 2, 3]

This is especially useful when returning multiple values from
functions.
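
For instance, a function can return a tuple and the caller can
unpack it in one step.  The function below is a made-up example,
not something from the standard library:

```python
def min_and_max(values):
    # Return two results at once by packing them into a tuple.
    lowest = highest = values[0]
    for v in values[1:]:
        if v < lowest:
            lowest = v
        if v > highest:
            highest = v
    return lowest, highest

# The caller unpacks the tuple straight into two names.
lo, hi = min_and_max([4, 1, 7, 3])
```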

>COOLNESS:
>    Namespaces seem to be *the* thing in this language.  You can always
>    get a listing of anything's names/attributes, even all the built-ins
>    (which you can't do in perl) or all the names in the local scope
>    (which you also can't do in perl).  

Also, because everything is looked up dynamically, by rebinding
variables you can do really cool (or stupid) things.

>DISSIMILARITY:
>    There are no private variables in a module or a class.
>    They can always be fully-qualified and accessed from
>    without.  Perl can do this with file-scoped lexicals,
>    but still can't do so with data members (modulo obscure
>    tricks such as using a non-hash for an object).

In the words of Guido, we're all consenting adults. To make a
variable private precede its name with one or two underscores.  The
two underscore thing is a new addition to the language.  If
someone monkeys with your private parts they get what they
deserve. :)
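
A minimal sketch of what the two conventions do in practice.  The
class and attribute names are invented for the example:

```python
class Account:
    def __init__(self):
        self._hint = "internal by convention only"
        self.__balance = 100     # stored as _Account__balance

    def balance(self):
        # Inside the class the name is mangled the same way.
        return self.__balance

acct = Account()
visible = acct._hint                # nothing stops this
mangled = acct._Account__balance    # still reachable if you insist
has_plain_name = hasattr(acct, "__balance")
```

One underscore is purely a convention.  Two leading underscores
make the compiler rewrite the name to include the class name,
which avoids accidental clashes but does not hide anything.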

>GOTCHA: (medium)
>    Perl's hex() function is the opposite of Python's.  Perl's hex
>    function converts 22 into 37.  Python's hex function converts 37 into
>    0x22.  Oh my!  Like hex, Python has oct() completely backwards from
>    Perl.  Python's oct(44) returns 054, but Perl's oct(44) return 36.

Hmm, I'm not sure what Perl is doing here.  Python seems more
sensible to me.  The 44 in the above Python example is an integer
literal.  Written as it is the Python parser treats it as a base
10 number.  If it was written as 054 then it would be treated as
a base 8 number.  oct() and hex() are functions that return a
number as a string in a certain base.  Maybe you want
string.atoi():

	>>> string.atoi('44', 8)
	36
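
The two directions can be put side by side.  This sketch assumes
a Python in which int() accepts an optional base argument, which
is an assumption about versions newer than the one discussed
here; string.atoi() plays that role otherwise:

```python
# Parsing: string in a given base -> number
from_octal = int("44", 8)     # 4*8 + 4
from_hex = int("22", 16)      # 2*16 + 2

# Formatting: number -> string in a given base
as_hex = hex(from_hex)

# Parsing the formatted string gets the number back.
round_trip = int(as_hex, 16)
```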

>GOTCHA: (low)
>    Often Python's error messages leave something to be desired.
>    I don't know whether 

I don't find them too bad.  I guess I don't use Perl much so I
don't know what I am missing.  At least they are better than
SEGV. :)

>GOTCHA: (medium)
>    All ranges are up to but *not including* that point.  So range(3)
>    is the list [0,1,2].  This is also true in slices, which is the
>    real gotcha part.  A slide t[2:5] does not include t[5] in it.

This is explained quite well in the tutorial.  It makes a lot of
sense once you get used to it.
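
A short sketch of why the half-open convention is convenient (the
list() call is only there so the example does not depend on what
range() returns):

```python
t = list(range(10))      # 0 through 9; the stop value 10 is excluded

middle = t[2:5]          # the items at indexes 2, 3 and 4
length = len(middle)     # always stop - start, here 5 - 2

# Splitting at any index and rejoining gives back the original,
# with no element duplicated or lost.
rejoined = t[:4] + t[4:]
```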

>DISSIMILARITY:
>    Python doesn't convert between strings and numbers unless you tell
>    it to.  And if you don't tell it to, you get an exception.  Therefore
>    you have no idea what these 
>	x = y + z
>	x = y - z
>	x = y * z
>    are really doing.  The first would concat strings or lists, the last would
>    repeat them.  Personally, I don't care for this, because I always
>    wonder what subtracting one string or list from another would be.

IMHO, + should not have been overloaded to mean concat.
Something like ++ would have been better.

>GOTCHA: (high)
>    This is a big surprise: strings are not atomic (although they are
>    immutable).  They are instead sequences of characters.  This comes
>    up in strange places.  For example:	
>	>>> for c in ("weird", "bits"):
>	...      print c
>	... 
>	weird
>	bits
>	>>> for c in ("weird"):
>	...      print c
>	... 
>	w
>	e
>	i
>	r
>	d
>    The second case autosplit the characters!

I think you are a bit confused by tuples.  In the first example
you are looping over a tuple of length two.  In the second case
you are looping over a string.  A length one tuple must end with
a comma.  This is weird but necessary for the parser to tell the
difference between a tuple and a parenthesized expression.

A for statement loops over a sequence.  A string is a sequence of
length one strings (i.e. characters).  If it wasn't for the tuple confusion I
don't think you would have found this too surprising.
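
The three spellings side by side, as a small sketch:

```python
pair = ("weird", "bits")     # a tuple of two strings
not_a_tuple = ("weird")      # just the string "weird" in parentheses
one_tuple = ("weird",)       # the trailing comma makes it a tuple

items_in_string = []
for c in not_a_tuple:        # loops over the five characters
    items_in_string.append(c)

items_in_tuple = []
for c in one_tuple:          # loops over the single element
    items_in_tuple.append(c)
```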

>  Here's another:
>	>>> print map(None, "stuff")
>	['s', 't', 'u', 'f', 'f']
>	>>> print map(None, "stuff", "here")
>	[('s', 'h'), ('t', 'e'), ('u', 'r'), ('f', 'e'), ('f', None)]
>    (Python's map None is like Perl's map $_.) 

This is something a little different.  The map function accepts
more than one argument.  None behaves as the identity function
when passed to map.  I believe this is called zip() in functional
languages.

>GOTCHA: (high)
>    Because everything is a reference, and there's no way to dereference
>    that reference, it turns out that there is no trivial way to copy
>    a list!  This fails:
>	x = [1,2]
>	y = x
>    Because you don't get a new list there, just a copy to an
>    old one.  Suggested work-arounds include
>	y = x[0:]
>	y = x[:]
>	y = x + []
>	y = x * 1
>    This forces people to think about references, again.    
>    So much for lists being first class citizens!  Compare 
>    this with Perl's
>	@x = (1,2);
>	@y = @x;
>    Or even with references:
>	$x = [1,2];
>	$y = [ @$x ]; 
>    or 
>	@y = @$x;

Again with the references. :)  Don't think references.
Everything is a reference.  Forget about that.  Think
mutable/immutable.  Lists are mutable.  If you are modifying them
you better be careful about sharing them.  x[:] is an easy way to
make a copy.  For other stuff use the copy module.  You will use
it rarely.

Does Perl really make a copy of a list every time you assign it
to another variable?  How about other things like instances?
Using references always like Python does seems more natural to
me.  I usually want a reference, not a copy.  Also, what exactly
is a copy?
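
To make that last question concrete, here is a sketch of the two
usual answers, a shallow copy and a deep copy, applied to a
nested list:

```python
import copy

original = [[1, 2], 3]

shallow = original[:]             # new outer list, same inner list
deep = copy.deepcopy(original)    # new outer list and new inner list

original[0][0] = "fred"           # mutate the inner list

shallow_sees_change = (shallow[0][0] == "fred")
deep_sees_change = (deep[0][0] == "fred")
```

The shallow copy still shares the inner list with the original,
so it sees the change; the deep copy does not.  Which one counts
as "a copy" depends on what you are doing.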

>GOTCHA: (medium)
>    Slices in Python must be contiguous ranges of a sequence.
>    In Perl, there's no such restriction.  

Use a dictionary if you want this.

>GOTCHA: (medium)
>    You can't slice dictionaries at all in Python.  In Perl, it's
>    easy to slice a hash, and just as sensible.

The semantics of this are not immediately clear.

>GOTCHA: (high)
>    As we saw with lists, because everything is a reference, and there's
>    no way to dereference that reference, this means that again there
>    is also no built-in, intuitive way to copy a dictionary.

There is a copy method in Python 1.5.2.  Also there is the copy
module.  Personally, I have never felt the need for this.  Maybe
my programming style is different than yours.

>GOTCHA: (high)
>    Lists don't autoallocate the way dictionaries do.

This catches a lot of range errors for me.  I like it.

>GOTCHA: (medium)
>    There's no way to set up a permitted exports list.  The caller may have
>    anything they ask for.  

Use underscores (and see above).

>COOLNESS:
>    DBM files seem (?) to automatically know about nested datatypes.   

You probably are talking about the pickle module.  Yes, it is
cool.

>GOTCHA: (medium)
>    Importing a variable only gives you a local copy.  In Perl, it makes
>    two names for the same object.

References, all of them.  Copying does not happen.

>GOTCHA: (low)
>    Because you can't declare a local variable, you can't create variables
>    local to something other than a module (ie. file), class, or function.
>    There are no loop-local variables; they would persist through the end
>    of the function.  You can't have a scope you can't name (well, almost;
>    see lambdas)

Python doesn't have arbitrarily nested scopes.  It would be a
nice feature.  In practice I do not find it to be a problem.
Functions and methods should be short.

>GOTCHA: (high)
>    Scopes don't nest in Python, but they they do in Pascal, Perl, or C.
>    This is supposedly "simpler", but it's very suprising in many ways:
>	x = 1			# global
>	def fn1(): 
>	    x = 10		# implicit local
>	    def fn2():
>		print x		# whose?
>	    fn2()
>	fn1()
>    The answer is that it prints 1, because fn2 has no x in its local
>    scope, so gets the global.  The suggested work-around is
>	    def fn2(x=x):
>    using default parameters.

See above (also the part about determining if a variable is
local).

>GOTCHA: (medium)
>    You can't cut and paste these examples because of the issue
>    of white space significance. :-(

I don't find this a problem.  In Vim >> and << work fine for me.

>GOTCHA: (low)
>    List objects have built-in methods, like 
>	l.append(x)
>    But string objects don't.  You have to import
>    from the string module to get them, and then they're
>    functions, not methods.

This is being fixed.  See the FAQ for more details.

>GOTCHA: (low)
>    You have insert and append methods for lists, but only
>    a del function.  Why is it a function not a method, and
>    why isn't it spelled out?  Apparently the fashionable way to do this is
>    now
>	a[2:3] = []
>    Which of course, deletes only one element.  ARG.

del is a statement.  It has to be, so that it can remove variable
bindings.  It can also do what you want as in:

	del a[2]
	del a[1:10]

Also see the remove method.  Python is not as orthogonal as it
should be regarding methods on objects.  That is being repaired.
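
A slightly longer sketch of the different ways to delete, with
the list contents after each step saved under made-up names:

```python
a = list(range(10))

del a[2]           # remove the item at index 2
after_index = a[:]

del a[1:4]         # remove the items at indexes 1, 2 and 3
after_slice = a[:]

a.remove(9)        # remove the first item equal to 9
after_remove = a[:]

name = "temporary"
del name           # remove the binding itself, not an item
```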

>GOTCHA: (high)
>    There doesn't seem to be away other than using low-level hand-rolling
>    of posix functions to supply things like os.popen and os.system
>    a list of shell-proof arguments.  Looks like it always goes through the
>    shell.  This has security ramifications.  

I think there is a module somewhere that does this.  It is not in
the standard distribution however.

>GOTCHA: (high)
>    Because you can't use readline() to get a number, people seem to enjoy
>    calling eval() just to get a string turned into a number:
>	import sys
>	str = sys.stdin.readline()
>	num = eval(x)
>    This is scary.  Even scarier is the propensity for calling input(),
>    which auto-eval()s its input to make sure it's the right "type".
>    (Funny that a soi-disant "typeless language" should be so danged
>    picky about this.)  That means you see people write:
>	num = input("Pick a number? ")
>    But the person can supply as an answer 0x23 or 2+9 or pow(2,4)
>    or os.system("rm -rf *").  I am stunned.

You want int() or float() instead of eval().  For older versions
of Python use the string.ato* functions.  Things are not as bad
as you make them out.
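
Something along these lines avoids eval() entirely.  The helper
name parse_number is hypothetical, and returning None for bad
input is just one possible choice:

```python
def parse_number(text):
    # Convert user-supplied text to an int or a float without
    # eval().  Returns None when the text is not a plain number.
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return None

whole = parse_number("42\n")      # surrounding whitespace is fine
fraction = parse_number("2.5")
expression = parse_number("2+9")  # not evaluated, just rejected
rejected = parse_number('os.system("rm -rf *")')
```

Nothing the user types is ever executed; anything that is not a
number simply fails to convert.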

>GOTCHA: (low)
>    Regexes default to emacs style, not egrep style!  Gads!  This is
>    surprising to anyone except an emacs weenie.  Sigh.  And just *try*
>    to read the manpage on the differences based on passed in arguments
>    establishing the style.  There aren't any manpages!  Have a nice day.

Use the re module for Perl style regexes.

>GOTCHA: (low)
>    An out-of-bounds list reference raises an "IndexError: list index out
>    of range" exception, but not if you use a slice to get at it!
>	t = range(5)
>	print t[2:17]
>      [2, 3, 4]

This sounds funny coming from a Perl guy.  Anyhow, look at the
protocol for slicing: s[2:] actually means s[2:sys.maxint].
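
In other words, slice bounds are clamped to the length of the
sequence while a plain index is not.  A short sketch:

```python
t = list(range(5))

clamped = t[2:17]        # the stop value is clamped to len(t)
empty = t[10:20]         # entirely past the end: an empty list

try:
    t[17]                # a plain index is not clamped
    index_failed = 0
except IndexError:
    index_failed = 1
```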

>COOLNESS:
>    Relationals stack:
>	x < y < z
>    means 
>	x < y and y < z

Assignments too.

>GOTCHA: (high)
>    Python's lambda's aren't really lambdas, because they are only 
>    expressions, not full functions, and because they cannot see
>    their enclosing scope.  

Guido feels that lambda was a mistake.  I half agree.  We must
live with it now (or at least until 2.0).

>GOTCHA: (medium)
>    This is a tuple of two elements
>	(1,2)
>    But this is a tuple of one element:
>	(1,)
>    Whereas this is merely the expression 1:
>	(1)
>    Yes, the trailing comma counts.

You can always use the trailing comma if you like consistency.

>GOTCHA: (low)
>    Normally, a print statement in python adds a newline.  If you
>    don't want one, you need a trailing comma!

Use sys.stdout.write instead.

>GOTCHA: (low)
>    And I'd jolly well like to know why I wasn't allowed to use
>	print "int of %f is %.0f" % (2.5) * 2
>    or if needed, 
>	print "int of %f is %.0f" % ( (2.5) * 2 )

You've been had by the "tuple comma" gotcha.

	print "int of %f is %.0f" % ((2.5,) * 2)

works just fine.  The extra parentheses are needed because % and *
have the same precedence and group left to right, so without them
the % would be applied first.
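
Spelled out step by step, as a sketch:

```python
args = (2.5,) * 2                # repeat the one-tuple: (2.5, 2.5)
ok = "int of %f is %.0f" % args

# Without the inner parentheses the % is applied first, and it
# fails because the format has two fields but the tuple holds
# only one value.
try:
    "int of %f is %.0f" % (2.5,) * 2
    failed = 0
except TypeError:
    failed = 1
```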

>GOTCHA: (medium)
>    Anything that python doesn't like, it raises an exception about.
>    There is no fail-soft.  Even non-exception issues raise exceptions.
>    It's pervasive.  K&P curse languages that force people who want
>    to open() a file to wrap everything in exception code, saying that
>    "failing to open a file is hardly exceptional".  

This is a big feature to me.  I am not perfect.  perl -w would
probably help me but Perl didn't have that flag when I started
with Python.

>GOTCHA: (low)
>    sort and reverse are in-place.  This leads to verbose code, such as:
>	old = [1,2,3]
>	new = old
>	new.reverse()
>    Likewise for sort.  

This also leads to efficiency if you don't want a copy.  BTW, your
code is incorrect.  You want:

	new = old[:] # make a copy
	new.reverse()

Not so bad.

>GOTCHA: (low)
>    You have to compiled the definition for a  function before you may
>    compile the call it.
...
>    ADDENDA: It turns out that def() happens *AT RUN TIME* in python.
>    That's the root of this problem.  In Perl, a sub{} is a compile-time
>    thing.

Python does a lot of stuff at runtime.  It is a source of power
and of slowness. :)

>GOTCHA: (low)
>    When you need to make an empty copy of the same type, you write
>	x = y[:0]
>    So much for being typeless. Sigh.

I don't follow you here.  All Python objects have a type.  Can
you give me an example of where you would use this code?


I seem to have missed a few of Tom's points.  Hopefully other
people have addressed these.  Regarding the flames your posts
seem to attract, sorry about that.  Your tone seems to set some
people off, especially those not accustomed to your style.  Try
to be less confrontational, especially when new to a group.
Everyone else, try harder to get along.


    Neil



