Octal notation: severe deprecation

John Machin sjmachin at lexicon.net
Tue Jan 11 17:16:07 EST 2005


Some poster wrote (in connexion with another topic):
> ... unicode("\347", "iso-8859-1") ...

Well, I haven't had a good rant for quite a while, so here goes:

I'm a bit of a retro specimen, being able (inter alia) to recall octal
opcodes from the ICT 1900 series (070=call, 072=exit, 074=branch, ...)
but nowadays I regard continued usage of octal as a pox and a
pestilence.

1. Octal notation is of use to systems programmers on computers where
the number of bits in a word is a multiple of 3. Are there any still in
production use? AFAIK word sizes were 12, 24, 36, 48, and 60 bits --
all multiples of 4, so hexadecimal could be used.
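
For illustration only, here is a minimal sketch of the digit-width point: one octal digit covers 3 bits and one hexadecimal digit covers 4, so a 24-bit word (one of the sizes listed above) is written exactly in either base. The names WORD_BITS, all_ones, octal_text and hex_text are made up for this example.

```python
# Illustration only: a 24-bit word with every bit set.
WORD_BITS = 24                      # assumed word size, taken from the list above
all_ones = (1 << WORD_BITS) - 1

octal_text = format(all_ones, "o")  # one octal digit per 3 bits
hex_text = format(all_ones, "x")    # one hex digit per 4 bits

print(octal_text, len(octal_text))  # 24 / 3 = 8 digits
print(hex_text, len(hex_text))      # 24 / 4 = 6 digits
```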

2. Consider the effect on the newbie who's never even heard of "octal":

>>> import datetime
>>> datetime.date(2005,01,01)
datetime.date(2005, 1, 1)
>>> datetime.date(2005,09,09)
  File "<stdin>", line 1
    datetime.date(2005,09,09)
                        ^
SyntaxError: invalid token

[straight out of the "BOFH Manual of Po-faced Error Messages"]
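
The same trap can be shown without typing at the prompt. This is a rough sketch that assumes nothing beyond the compile(), eval() and int() built-ins; try_literal is a hypothetical helper name invented for the example. As far as I know, "09" is a syntax error both on interpreters that read a leading zero as octal and on later ones that reject leading zeros in decimal literals altogether.

```python
def try_literal(text):
    """Evaluate text as a Python expression, or report a syntax error."""
    try:
        return eval(compile(text, "<example>", "eval"))
    except SyntaxError:
        return "SyntaxError"

print(try_literal("9"))    # an ordinary decimal literal
print(try_literal("09"))   # the leading zero makes this invalid source code

# Converting text (e.g. from a date string) sidesteps the literal syntax.
print(int("09"))           # decimal by default, leading zero ignored
print(int("11", 8))        # octal only when the base is stated explicitly
```

Stating the base explicitly, as in the last line, is the kind of "something explicit" a newbie could at least look up.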

3. Consider this extract from the docs for the re module:
"""
\number
Matches the contents of the group of the same number. Groups are
numbered starting from 1. For example, (.+) \1 matches 'the the' or '55
55', but not 'the end' (note the space after the group). This special
sequence can only be used to match one of the first 99 groups. If the
first digit of number is 0, or number is 3 octal digits long, it will
not be interpreted as a group match, but as the character with octal
value number. Inside the "[" and "]" of a character class, all numeric
escapes are treated as characters.
"""

I helped to straighten out this description a few years ago, but I fear
it's still not 100% accurate. Worse, take a peek at the code necessary
to implement this.
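
A small sketch of the rules quoted above, using only re.match from the standard library. The variable names are invented for the example, and the cases were chosen to follow the quoted text rather than to probe every corner of it.

```python
import re

# Backreference: \1 must repeat whatever group 1 matched.
repeat_ok = re.match(r"(.+) \1", "the the") is not None
repeat_bad = re.match(r"(.+) \1", "the end") is not None

# Three octal digits: \101 is the character with octal value 101 ("A"),
# not a reference to group 101.
three_digits = re.match(r"\101", "A") is not None

# Leading zero: \07 is the character with octal value 7, not a group.
leading_zero = re.match(r"\07", "\x07") is not None

# Inside [ and ], a numeric escape is always a character: here chr(1).
in_class = re.match(r"[\1]", "\x01") is not None

print(repeat_ok, repeat_bad, three_digits, leading_zero, in_class)
```

Four different readings of a backslash followed by digits, depending on the count, the first digit and the surrounding brackets.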

===

We (un-Pythonically) implicitly take a leading zero (or even just
\[0-7]) as meaning octal, instead of requiring something explicit as
with hexadecimal. The variable-length escape in string literals doesn't
help: an octal escape swallows one to three octal digits and stops at
the first non-octal one, so "\18", "\128" and "\1238" are all strings
of length 2.

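A quick check of those three strings, as a sketch with made-up variable names. Each literal is one octal-escaped character followed by a literal "8", because "8" is not an octal digit and ends the escape.

```python
samples = ["\18", "\128", "\1238"]
lengths = [len(s) for s in samples]

for s in samples:
    # repr() shows where the escape ended and the plain "8" began
    print(repr(s), len(s))

# By contrast a hex escape is explicit and always exactly two digits.
hex_sample = "\x418"     # "\x41" is "A", then a literal "8"
print(repr(hex_sample), len(hex_sample))
```
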
I don't see any mention of octal in GvR's "Python Regrets" or AMK's
"PEP 3000". Why not? Is it not regretted?



