.0 in name

Chris Angelico rosuav at gmail.com
Sat May 28 17:15:41 EDT 2022


On Sun, 29 May 2022 at 06:41, Ralf M. <Ralf_M at t-online.de> wrote:
>
> Am 13.05.2022 um 23:23 schrieb Paul Bryan:
> > On Sat, 2022-05-14 at 00:47 +0800, bryangan41 wrote:
> >
> >> May I know (1) why can the name start with a number?
> >
> > The name of an attribute must be an identifier. An identifier cannot
> > begin with a decimal number.
>
> I'm not sure about the first statement. Feeding
>
> [print("locals:", locals()) or c for c in "ab"]
>
> to the REPL, the result is
>
> locals: {'.0': <str_iterator object at 0x0000000002D2B160>, 'c': 'a'}
> locals: {'.0': <str_iterator object at 0x0000000002D2B160>, 'c': 'b'}
> ['a', 'b']
>
> i.e. there is a variable of name .0 in the local namespace within the
> list comprehension, and .0 is definitely not an identifier.
>
> I came across this while investigating another problem with list
> comprehensions, and I think the original post was about list comprehensions.
>

There are a few quirks with comprehensions, and to understand that
".0", you have to first understand two very important aspects of
scoping with regard to comprehensions.

(Note: For simplicity, I'm going to refer in general to
"comprehensions", and I am not going to count Python 2. My example
will be a list comp, but a generator expression also behaves like
this, as do other comprehensions.)

Consider this function:

def spam():
    ham = "initial"
    ham = [locals() for x in "q"]
    return ham

The disassembly module can be very helpful here. The precise output
will vary with Python version, but the points I'm making should be
valid for all current versions. Here's how it looks in a December
build of Python 3.11 (yeah, my Python's getting a bit old now, I
should update at some point):

>>> import dis
>>> dis.dis(spam)
  2           0 LOAD_CONST               1 ('initial')
              2 STORE_FAST               0 (ham)

  3           4 LOAD_CONST               2 (<code object <listcomp> at 0x7fb6a0cfa6b0, file "<stdin>", line 3>)
              6 MAKE_FUNCTION            0
              8 LOAD_CONST               3 ('q')
             10 GET_ITER
             12 CALL_FUNCTION            1
             14 STORE_FAST               0 (ham)

  4          16 LOAD_FAST                0 (ham)
             18 RETURN_VALUE

Disassembly of <code object <listcomp> at 0x7fb6a0cfa6b0, file "<stdin>", line 3>:
  3           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                 5 (to 16)
              6 STORE_FAST               1 (x)
              8 LOAD_GLOBAL              0 (locals)
             10 CALL_FUNCTION            0
             12 LIST_APPEND              2
             14 JUMP_ABSOLUTE            2 (to 4)
        >>   16 RETURN_VALUE
>>>

Okay, that's a lot of raw data, but let's pull out a few useful things from it.

Line 2 initializes ham in an unsurprising way. Grab a constant, store
it in a local. Easy.

Line 3. We grab the code object for the list comp, and make a
function (that's necessary for closures). Then, *still in the context
of the spam function*, we grab the constant "q", and get an iterator
from it. Leaving that on the top of the stack, we call the list
comprehension's function, and store the result into 'ham'.
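
As a rough hand-written sketch of that sequence (not what the compiler
literally emits): the names spam_by_hand, listcomp_body and it are made
up for illustration, the real parameter is the unnameable ".0", and the
element expression is simplified to plain x.

```python
def spam_by_hand():
    ham = "initial"

    # Stand-in for the hidden <listcomp> function. Its single parameter
    # ("it" here, ".0" in the real thing) receives a ready-made iterator.
    def listcomp_body(it):
        result = []
        for x in it:
            result.append(x)
        return result

    # iter("q") is evaluated here, in spam_by_hand's own scope, and only
    # then is the hidden function called with it.
    ham = listcomp_body(iter("q"))
    return ham


print(spam_by_hand())
```

The point of writing it this way is that the outermost iterable belongs
to the enclosing function; the hidden function only ever sees an
iterator.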

The comprehension itself loads the fast local from slot zero (name
".0") and iterates over it. Slot zero is the first argument, so
that's the string iterator that we left there for the function.
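
You can check the "first argument" part without reading bytecode by
poking at the code object. This is only a sketch: it uses a generator
expression rather than a list comp, since how list comprehensions are
compiled can differ between CPython versions, and the name gen_spam is
made up for the example.

```python
def gen_spam():
    # Generator expression: it keeps its own hidden code object, which
    # we can reach through the generator's gi_code attribute.
    return (x for x in "q")


code = gen_spam().gi_code

print(code.co_name)      # name of the hidden function
print(code.co_argcount)  # how many positional parameters it takes
print(code.co_varnames)  # parameters first, then the other locals
```

On the CPython versions I'm aware of, this reports one parameter, and
the first entry of co_varnames is '.0'.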

So why IS this? There are a few reasons, but the main one is generator
expressions. Replacing the list comp with a genexp gives this result:

>>> spam()
<generator object spam.<locals>.<genexpr> at 0x7fb6a0780890>

The actual iteration (the FOR_ITER loop, which sits at offset 4 in the
above disassembly of <listcomp>) doesn't happen until you iterate over
this value. But it
would be extremely confusing if, in that situation, errors didn't show
up until much later. What if, instead of iterating over a string, you
tried to iterate over a number? Where should the traceback come from?
Or what if you're iterating over a variable, and you change what's in
that variable?

def wat():
    stuff = "hello"
    ucase = (l.upper() for l in stuff)
    stuff = "goodbye"
    return "".join(ucase)

Does this return "HELLO" or "GOODBYE"? Since stuff gets evaluated
immediately, it returns HELLO, and that's consistent for list comps
and genexps.
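
To see where the traceback would come from, here is a small sketch
(function names are invented for the example) contrasting an error in
the outermost iterable with an error in the element expression:

```python
def bad_iterable():
    try:
        # iter(42) is attempted right here, while building the genexp.
        (n for n in 42)
    except TypeError:
        return "raised at creation"
    return "no error yet"


def bad_body():
    # Creating the genexp is fine: iter([0]) succeeds immediately.
    gen = (1 / n for n in [0])
    try:
        # The element expression only runs once we start iterating.
        next(gen)
    except ZeroDivisionError:
        return "raised at first next()"
    return "no error"


print(bad_iterable())
print(bad_body())
```

So only the outermost iterable is evaluated eagerly; everything inside
the comprehension's own function waits until iteration.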

But because of that, there needs to be a parameter to carry that
iterator through, and every parameter needs a name. If the generated
name collided with any identifier that you actually wanted, it would
be extremely confusing; so to keep everything safe, the interpreter
generates a name you couldn't possibly want - same as for the function
itself, which is named "<listcomp>" or "<genexpr>", angle brackets
included.
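
A quick sanity check that these generated names really are out of
reach of ordinary code (a sketch; the helper can_be_parameter is
hypothetical and exists only for this demonstration):

```python
def can_be_parameter(name):
    # Try to compile a function definition using `name` as a parameter.
    try:
        compile("def f(" + name + "): pass", "<demo>", "exec")
    except SyntaxError:
        return False
    return True


for name in [".0", "<listcomp>", "<genexpr>", "ham"]:
    print(repr(name), name.isidentifier(), can_be_parameter(name))

# The hidden function of a generator expression carries such a name.
gen = (c for c in "ab")
print(gen.__name__)
```

Only "ham" passes both checks, so nothing you can write as a normal
variable or parameter will ever clash with the generated names.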

That's a fairly long-winded way to put it, but that's why you can have
variables with bizarre names :)

ChrisA

