Annoying octal notation

Derek Martin code at pizzashack.org
Mon Aug 24 09:42:31 EDT 2009


On Sun, Aug 23, 2009 at 06:13:31AM +0000, Steven D'Aprano wrote:
> On Sat, 22 Aug 2009 22:19:01 -0500, Derek Martin wrote:
> > On Sat, Aug 22, 2009 at 02:55:51AM +0000, Steven D'Aprano wrote:
> >> And the great thing is that now you get to teach yourself to stop
> >> writing octal numbers implicitly and be write them explicitly with a
> >> leading 0o instead :)
> > 
> > Sorry, I don't write them implicitly.  A leading zero explicitly states
> > that the numeric constant that follows is octal.
> 
> That is incorrect.

No, it simply isn't.  It is a stated specification in most popular
programming languages that an integer preceded by a leading zero is an
octal number.  That makes it explicit, when used by a programmer to
write an octal literal.  By definition.  End of discussion.

> (Explicitness isn't a binary state

Of course it is.  Something can be either stated or implied... there
are no shades in between.  Perhaps you mean "obvious and intutitive"
where you are using the word "explicit" above (and that would be a
matter of subjective opinion).  The leading zero, however, is
undoubtedly explicit.  It is an explicitly written token which, in
that context, has the meaning that the digits that follow are an octal
number.  One simply needs to be aware of that aspect of the
specification of the programming language, and one will clearly know
that it is octal.

My point in mentioning that many other programming languages, by the
way, was NOT to suggest that, "See, look here, all these other folks
do it that way too, so it must be right."  It was to refute the notion that 
the leading zero as octal was in some way unusual.  It is, in fact,
ubiquitous in computing, taught roughly in the first week of any
beginning computing course using nearly any modern popular programming
language, and discussed within the first full page of text in the
section on numerical literals in _Learning Python_ (and undoubtedly
many other books on Python).  It may be a surprise the first time you
run into it, but you typically won't forget that detail after you run
into it the first time.

> However, octal numbers are defined implicitly: 012 is a legal base 10 
> number, or base 3, or base 9, or base 16. 

Not in any programming language I use today, it's not.  In all of
those, 012 is an octal integer literal, per the language spec.

> There's nothing about a leading zero that says "base 8" apart from
> familiarity. 

That is simply untrue.  What says base 8 about a leading zero is the
formal specification of the programming language.  

The only way using octal could be implicit in the code is if you
wrote something like:

  x = 12 

in your code, and then had to pass a flag to your compiler or
interpreter to tell it that you meant to use octal integer literals
instead of decimal ones.  That, of course, would be insane.  But
specifying a leading zero to represent an octal number zero is very
much explicit, by definition.

> We can see the difference between leading 0x and leading 0 if you
> repeat it: repeating an explicit 0x, as in 0x0xFF, is a syntax
> error, while repeating an implicit 0 silently does nothing
> different:

No, we can't.  Just as you can type 0012, you can also type 0x0FF.
It's not different AT ALL.  In both cases, the marker designated by
the programming language as the base indicator can be followed by an
arbitrary number of zeros which do not impact the magnitude of the
integer so specified.  Identical behavior.  The key is simply to
understand that the first 0 is not a digit -- it's a syntactic marker,
an operator if you will (though Python may not technically think of it
that way).  The definition of '0' is overloaded, just as other
language tokens often are.  This, too, is hardly unusual.

> There are a bunch of languages, pretty much all heavily influenced
> by C, which treat integer literals with leading 0s as oct: C++,
> Javascript, Python 2.x, Ruby, Perl, Java. As so often is the case,
> C's design mistakes become common practice. Sigh.

That it is a design mistake is a matter of opinion.  Obviously the
people who designed it didn't think it was a mistake, and neither do
I.  If you search the web for this topic (I did), you will find no
shortage of people who think the recent movement to irradicate the
leading zero to be frustrating, annoying, and/or stupid.  And by
the way, this representation predates C.  It was at least present in
B.

> FORTRAN 90 uses a leading O (uppercase o) for octal

That clearly IS a design mistake, because O is virtually
indistinguishable from 0, especially considering varying fonts and
people's variable eye sight quality.

> Algol uses an explicit base: 8r12 to indicate octal 10.

This is far better than 0o01.  I maintain that 0o1 is only marginally
better than O01 (from your fortran example) or 0O1, allowed by Python.
The string 8r12 has the nicety that it can easily be used to represent
integers of any radix in a consistent fashion.  But I maintain that
the leading zero is just fine, and changing it after 20 years of
Python seems more than a little arbitrary and brain-damaged to me.

> > Computer languages are not write-only, excepting maybe Perl. ;-) Writing
> > 0o12 presents no hardship; but I assert, with at least some support from
> > others here, that *reading* it does.
> 
> No more so than 0x or 0b literals. If anything, 0o12 stands out as "not 
> twelve" far more than 012 does.

Obviously, I don't agree.

Moving on... I've wasted enough time arguing about this.

-- 
Derek D. Martin
http://www.pizzashack.org/
GPG Key ID: 0x81CFE75D

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 196 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-list/attachments/20090824/ace1cb91/attachment-0001.sig>


More information about the Python-list mailing list