Am I the only one who would love these extensions? - Python 3.0 proposals (long)

Mon Nov 10 18:51:13 EST 2003

Georgy Pruss:
> 1) Underscores in numbers. It will help to read long numbers.
> E.g.
>    12_345_678
>    3.14159_26535_89793_23846

Perl has this.  How often do long numbers occur in your code?

When is the workaround of
  int("12_345_678".replace("_", ""))
  float("3.14159_26535_89793_23846".replace("_",""))
inappropriate?  Note also that this allows the more readable
(to some)
  float("3.14159 26535 89793 23846".replace(" ",""))
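
If the replace() calls feel noisy, they can be wrapped once.  A minimal
sketch; parse_int and parse_float are names made up for this example,
not anything in the standard library:

```python
def _strip_separators(text):
    # Remove the two digit separators discussed above: "_" and " "
    return text.replace("_", "").replace(" ", "")

def parse_int(text):
    # Hypothetical helper: int() that tolerates digit separators
    return int(_strip_separators(text))

def parse_float(text):
    # Hypothetical helper: float() that tolerates digit separators
    return float(_strip_separators(text))

million = parse_int("12_345_678")
pi = parse_float("3.14159 26535 89793 23846")
```

The separators only ever appear inside string literals, so no change to
the language is needed.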

> 2) Binary constants. Not in great demand, just nice to have,
> half an hour to implement.
> E.g.
>    0b01110011
>    0b1110_0101_1100_0111

Why is it nice enough to make it be a syntax addition,
as compared to

>>> int("01110011", 2)
115
>>> int("1110_0101_1100_0111".replace("_", ""), 2)
58823
>>>

?

> 3) Hex strings. Very useful when you want to initialize long
> binary data, like inline pictures.
> E.g.
>    x'48656C6C6F 21 0D0A'
>    ux'0021 000D 000A'
> They can be combined with other strings: 'Hello!' x'0d0a'

> Now you can use hexadecimal values, but with two times
> longer sequences like '\x..\x..\x..\x..', or do the translation
> during run-time using '....'.decode('hex').

I very rarely include encoded binary data in my data files.
Images should usually be external resources since it's hard
to point an image viewer/editor at a chunk of Python code.

A counter example is some of the PyQt examples, which
have pbm (I think) encoded images, which are easy to see
in ASCII.  In that case, there's a deliberate choice to use
a highly uncompressed format (one byte per pixel).

The run-time conversion you don't like is only done once.
In addition, another solution is to have the Python spec
require that a few encodings (like 'hex') cannot be changed,
and allow the implementation to preprocess those cases.
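
The one-time cost is easy to see if the conversion sits at module level.
A sketch, using binascii.unhexlify as a stand-in for
'....'.decode('hex'); the constant and function names are invented for
the example:

```python
import binascii

# Written with spaces for readability, as in the quoted proposal
_HELLO_HEX = "48656C6C6F 21 0D0A"

# Runs once, when the module is first imported
HELLO = binascii.unhexlify(_HELLO_HEX.replace(" ", ""))

def greeting():
    # Later calls reuse the already-decoded bytes
    return HELLO
```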

So why is it useful enough to warrant a new special-case
syntax?

> 4) Keywords 'none', 'false', 'true'. They should be keywords,
> and they should be lowercase, like all the rest keywords.
> True, False and None can stay for some time as predefined
> identifiers, exactly like they are now.

Why *should* they be lower case?  There was a big discussion
when True/False came out, which resulted in that "casing".

You argue consistency with other keywords.  What other
keywords refer to objects?

>>> import keyword
>>> keyword.kwlist
['and', 'assert', 'break', 'class', 'continue', 'def', 'del', 'elif',
'else', 'except', 'exec', 'finally', 'for', 'from', 'global', 'if',
'import', 'in', 'is', 'lambda', 'not', 'or', 'pass', 'print', 'raise',
'return', 'try', 'while', 'yield']
>>>

None that I can see.

> 5) Enum type.

There are approximations to this, as in
  http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/67107

>    # defines constants AXIS.X=0, AXIS.Y=1,  AXIS.Z=2
>    enum AXIS: X, Y, Z

For that case, I would prefer the following

class AXIS:
  X, Y, Z = range(3)

If there are more terms then here's an alternative.  I've not
seen it before, so I think it's a novel one:

class Keywords:
  AND, ASSERT, BREAK, CLASS, CONTINUE, DEF,  \
    DEL, ELIF, YIELD, ...  = itertools.count()

That is, the ellipsis at the end of a sequence assignment means
to ignore RHS terms at that point and beyond.
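
Until something like that exists, the RHS has to be cut to length by
hand.  A rough equivalent with today's syntax, using itertools.islice;
the 9 has to be kept in sync with the number of names, which is exactly
the bookkeeping the ellipsis would avoid:

```python
import itertools

class Keywords:
    # Nine names on the left, so take exactly nine values from the counter
    AND, ASSERT, BREAK, CLASS, CONTINUE, DEF, \
        DEL, ELIF, YIELD = itertools.islice(itertools.count(), 9)
```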

>    # defines color.red, color.green, color.blue
>    enum color
>       red = '#FF0000', green = '#00FF00', blue = '#0000FF'

This can be done already as

class color:
  red = '#FF0000'
  green = '#00FF00'
  blue = '#0000FF'

>    # defines consts A=0, B=1, C=2, D=10, E=11, F=12
>    enum
>      A, B, C
>      D = 10, E
>      F

While I know C allows this, I've found it more confusing
than useful.  What about

class consts:
  A, B, C, ... = itertools.count()
  D, E, F, ... = itertools.count(10)
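
With syntax that works today, the same six constants can be spelled
with range, at the cost of writing the bounds out by hand:

```python
class consts:
    A, B, C = range(3)        # 0, 1, 2
    D, E, F = range(10, 13)   # 10, 11, 12
```

Each line states its own starting value, which I find easier to read
than a counter that silently jumps partway through.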

> 6) The colon is optional after for,if,try,enum etc. if it is
> followed by a new line. Although it's mandatory if there are
> statements on the same line. So it's like ';' -- you MUST
> use it when it's needed. You CAN still use it if you like it,
> like now you can put a semicolon after each statement
> as well.
>
>    def abs(x)
>       if x < 0
>          return -x
>       else
>          return x

The reason for the ":" is to make the block more visible.  This
came out during human factors testing in ABC (as I recall).  It
also simplifies tool development since software can assume
that if the previous line ends in a : then there should be an indent.
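
As a sketch of what I mean by simpler tools, an editor's auto-indent
check can be about this small.  The function name is invented, and the
comment stripping is deliberately naive (it does not handle a '#' inside
a string literal):

```python
def expects_indent(line):
    # Drop a trailing comment, then any trailing whitespace
    code = line.split("#", 1)[0].rstrip()
    # A block-opening statement is one that ends in a colon
    return code.endswith(":")

opens_block = expects_indent("if x < 0:")
plain_statement = expects_indent("return -x")
with_comment = expects_indent("else:  # fallback case")
```

With the colon optional, the same tool would need to parse the
statement to decide whether a block follows.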

BTW, what advantage is there in having an optional syntax
for this?

> 7) Built-in regex'es. It would be nice to have built-in regular
> expressions. Probably, some RE functionality can be shared
> with the built-in string class, like presently we can write
> 'x'.encode('y'). For example $str can produce a regex object
> and then s==re can return true if s matches re. This would be
> very good for the switch construction (see below).
> E.g.
>    id = $ "[A-Za-z_][A-Za-z0-9_]*"
>    if token == id: return token

Regexps play much less of a role in Python than in languages like
Perl and Ruby.  Why should this be a syntax-level feature for Python?
E.g., your example can already be done as

import re
id = re.compile("[A-Za-z_][A-Za-z0-9_]*")
if id.match(token): return token

Note that you likely want the id pattern to end in a $, so that
the whole token has to match and not just a prefix of it.

A downside of the == is that it isn't obvious that ==
maps to 'match' as compared to 'search'.
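
The difference matters in practice.  A short sketch with the re module
as it is now (the variable names are just for illustration):

```python
import re

ident = re.compile("[A-Za-z_][A-Za-z0-9_]*")
ident_whole = re.compile("[A-Za-z_][A-Za-z0-9_]*$")

# match() anchors only at the start, so a valid prefix is enough
prefix_only = ident.match("spam!")

# With the trailing $ the whole token has to be an identifier
whole_token = ident_whole.match("spam!")

# search() scans forward and finds an identifier anywhere in the string
found_later = ident.search("42 spam")
not_at_start = ident.match("42 spam")
```

Here prefix_only and found_later are match objects while whole_token
and not_at_start are None, so an == operator would have to pick one of
these behaviors and hide the choice.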

Suppose in 5 years we decide that Larry Wall is right
and we should be using Perl6 regexp language.  (Just
like we did years ago with the transition from the regex
module to the re module.)  What's the viable migration
path?

Does your new regexp builtin allow addition

  $"[A-Za-z_]" + $"[A-Za-z0-9_]*"

or string joining, like

  $"[A-Za-z_]"  $"[A-Za-z0-9_]*"

or addition with strings, as in

  $"[A-Za-z_]"  + "[A-Za-z0-9_]*"

What about string interpolation?

  $"[%s]" % "ABCDEF"
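
With patterns kept as ordinary strings, each of these already has an
obvious answer, because the pieces are combined before re.compile ever
sees them.  A small sketch:

```python
import re

head = "[A-Za-z_]"
tail = "[A-Za-z0-9_]*"

# Addition of two pattern strings
added = re.compile(head + tail)

# Adjacent string literals are joined by the compiler
joined = re.compile("[A-Za-z_]" "[A-Za-z0-9_]*")

# String interpolation into a character class
hexdigit = re.compile("[%s]" % "ABCDEF")
```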

> 8) Slices. They can have two external forms and they are
> used in three contexts: a) in [index], b) in the 'case' clause,
> c) in the 'for-as' statement. The have these forms:
>    a:b   means  a <= x < b     (a:b:c -- with step c)
>    a..b  means  a <= x <= b   (a..b:c -- with step c)
> E.g.
>    1:100 == 1,2,3,...,99
>    1..100 == 1,2,3,...,100

See http://python.org/peps/pep-0204.html .  Proposed and
rejected.

> 9) For-as loop.

See also http://python.org/peps/pep-0284.html which is for
integer loops.  Since your 8) is rejected your 9) syntax
won't be valid.

> 10) Until loop -- repeat the loop body at least one time
> until the condition is true.
>
>    until <postcond>
>       <stmts>
>
> It's the same as:
>
>    <stmts>
>    while not <postcond>
>        <stmts>

Actually, it's the same as

  while True:
    <stmts>
    if <postcond>:
      break
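
A small runnable version of that shape, with made-up names, showing
that the body runs once even when the condition is true from the start:

```python
def repeat_until(body, postcond):
    # body and postcond are hypothetical zero-argument callables
    runs = 0
    while True:
        body()
        runs += 1
        if postcond():
            break
    return runs

log = []
runs = repeat_until(lambda: log.append("tick"), lambda: len(log) >= 3)

# The condition is already true, but the body still runs one time
once = repeat_until(lambda: None, lambda: True)
```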

Why is this new loop construct of yours useful enough
to warrant a new keyword?

> 11) Unconditional loop. Yes, I'm from the camp that
> finds 'while 1' ugly. Besides, sometimes we don't need
> a loop variable, just a repetition.
>
>    loop [<expr_times_to_repeat>]
>       <stmts>
>
> E.g.
>    loop 10
>       print_the_form()
>
>    loop
>       line = file.readline()
>       if not line
>          break
>       process_line( line )

Do you find 'while True' to be ugly?

How much code will you break which uses the
word "loop" as a variable?  I know you expect it
for Python 3 which can be backwards incompatible,
but is this really worthwhile?  How many times do
you loop where you don't need the loop variable?
Is this enough to warrant a special case construct?

I don't think so.

Note that the case you gave is no longer appropriate.
The modern form is

for line in file:
  process_line(line)
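
To check that the two spellings really do the same thing, here is a
sketch that runs both over an in-memory file.  io.StringIO stands in
for a real file, and the function names are invented:

```python
import io

def read_with_readline(stream):
    # The explicit loop from the quoted example
    lines = []
    while True:
        line = stream.readline()
        if not line:
            break
        lines.append(line)
    return lines

def read_with_for(stream):
    # The modern form: iterate over the file object directly
    lines = []
    for line in stream:
        lines.append(line)
    return lines

text = u"first\nsecond\nthird\n"
old_style = read_with_readline(io.StringIO(text))
new_style = read_with_for(io.StringIO(text))
```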

> 12) Selection statement. The sequence (it can also
> occur in the for-as statement, see above) is composed
> of one or more expressions or slices.
>
>    switch <expr>
>    case <sequence>
>       <stmts>
>    case <sequence>
>       <stmts>
>    else
>       <stmts>
>
> The sequence can contain RE-patterns.
>
>    case $pattern,$pattern2 # if expr matches patterns
>       ...

See http://python.org/peps/pep-0275.html which has
not yet been decided.

> 13) One line if-else.
>
>    if <cond>: <stmt>; else: <stmt>

Why?  What's the advantage to cramming things on a line
compared to using two lines?

if a: print "Andrew"
else: print "Dalke"

Even this is almost always too compact for readability.

> 14) Conditional expression. Yes, I'd love it.
>
>    cond ? yes : no

Rejected.  See
http://python.org/peps/pep-0308.html

> 15) Better formatting for repr()
>
>    repr(1.1) == '1.1'
>
> If the parser recognizes 1.1 as a certain number, I can't see
> any reason why it should print it as 1.1000000000000001 for
> the next parser run.

Then use 'str', or "%2.1f" % 1.1

What you want (a parser level change) makes
a = 1.1
repr(a)

different from
repr(1.1)

That's not good.

Note that 1.1 cannot be represented exactly in IEEE 754
math.  Following IEEE 754 is a good thing.  What is your
alternative math proposal?
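
A sketch of the point, using explicit formatting so that it does not
depend on what repr() happens to print.  The 17-digit form is the one
in your example:

```python
a = 1.1

# 17 significant digits show the value that is actually stored
from_name = "%.17g" % a
from_literal = "%.17g" % 1.1

# The shorter forms suggested above
short = "%2.1f" % 1.1
as_str = str(1.1)

# The long form converts back to exactly the same float
round_trip = float(from_literal) == 1.1
```

The name and the literal give the same string because both refer to the
same stored value; the parser has no "1.1" left to remember.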

> 16) Depreciated/obsolete items:
>
> -- No `...` as short form of repr();

I agree.  I believe this is one of the things to be removed
for Python 3.

> -- No '\' for continuation lines -- you can always use parenthesis;

I agree.  I believe this is one of the things to be removed
for Python 3.

> -- No else for while,for,try etc -- they sound very unnatural;

I've found 'else' very useful for a few cases which once required
ugly flags.
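
For example, a search loop.  Both functions below are illustrative
sketches with invented names; the first uses the flag, the second uses
'else':

```python
def find_with_flag(items, wanted):
    found = False
    for item in items:
        if item == wanted:
            found = True
            break
    if not found:
        return "missing"
    return "present"

def find_with_else(items, wanted):
    for item in items:
        if item == wanted:
            break
    else:
        # Runs only when the loop finished without hitting break
        return "missing"
    return "present"
```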

> -- Ellipsis -- it looks like an alien in the language.

As you can see, I proposed a new way to use ellipsis.  In real
life I almost never use the ....

> I do believe that these changes follow the spirit of the language
> and help Python to become an even better language.

Given how many new looping constructs you want, I disagree
with your understanding of the spirit of the language.

                    Andrew
                    dalke at dalkescientific.com