Integers with leading zeroes

Chris Angelico rosuav at gmail.com
Wed Jul 22 00:16:17 EDT 2015


On Wed, Jul 22, 2015 at 12:14 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Wed, 22 Jul 2015 11:10 am, Chris Angelico wrote:
>
>> On Wed, Jul 22, 2015 at 10:55 AM, Steven D'Aprano <steve at pearwood.info>
>> wrote:
>>>> Sometimes these numbers represent codeblocks of a fixed
>>>> number of digits. Always writing those numbers with this
>>>> number of digits helps being aware of this. It is also
>>>> easier for when you need to know how many leading zeros
>>>> such a number has.
>>>
>>> I'm not sure what you mean here. Python ints don't have a fixed number of
>>> digits.
>>
>> Sometimes your numbers carry specific payloads or structures. A few
>> examples:
>>
>> Date: 20150722 [decimal]
>> Unix permissions: 10777 [octal]
>> MAC address: 0014a466fba9 [hex]
>
> I don't see the relevance of any of those examples. Only the date is
> kinda-sorta in decimal, the others are in octal and hex and so need to be
> written as octal or hex numbers:
>
> perm = 0o10777  # not 0o25031, which is what decimal 10777 would give
> addr = 0x0014a466fba9  # the above will give a syntax error

Right, I'm just giving examples of structured numbers. I don't have a
good example of a decimal structured number, but there are good
examples in other bases, and the possibility is there for someone to
have one that makes sense in decimal.
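
To make "structured number" a bit more concrete, here is a minimal
sketch using the octal and hex examples from above. The variable names
are made up for illustration. The literal prefix picks the base, and
the fixed width only comes back when the number is formatted:

```python
# Illustrative values only; the names are hypothetical.
perm = 0o10777          # octal literal: Unix-style mode bits
addr = 0x0014a466fba9   # hex literal: a 48-bit MAC address

# The int itself has no width; padding is restored when formatting.
perm_text = format(perm, "06o")    # six octal digits, zero-padded
addr_text = format(addr, "012x")   # twelve hex digits, zero-padded

print(perm_text)   # 010777
print(addr_text)   # 0014a466fba9
```

The leading zeros in the hex literal are accepted by the parser but
carry no information; only the format width preserves them on output.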

> The date example should be a string, not an integer.
>
> today = 20151231
> tomorrow = today + 1
> assert tomorrow == 20160101  # fails

All that proves is that there are certain operations that don't work
on date-stored-as-integer. The same operations equally won't work on
date-stored-as-string. If you want date arithmetic, you MUST use a
proper date/time library; but if all you want is simple and efficient
comparisons, integers work fine. So do strings, but integers are
right-justified. Imagine a situation in which it's not dates with
four-digit years, but some other starting figure: maybe it's the year
in some arbitrary calendar on which today is the 6th of Cuspis in the
year 411 of the Common Reckoning. Those dates can go back before year
100, so the date numbers would have one digit fewer than today's
4110206. Hence it's useful to be able to right-justify them.

Dates aren't a great example (because good date/time libraries do
exist), but they're more universally understood than domain-specific
examples.
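
A rough sketch of the distinction, assuming dates packed as YYYYMMDD
integers. The helper to_date is hypothetical, written only for this
example; the date arithmetic itself is left to the standard datetime
module:

```python
from datetime import date, timedelta

# Dates packed as YYYYMMDD integers (values are illustrative).
today = 20151231
earlier = 20150722

# Ordering works: a later date is always a larger integer.
assert earlier < today

# Arithmetic does not: the integer after 20151231 is not a date.
assert today + 1 == 20151232

def to_date(packed):
    """Convert a YYYYMMDD integer to a datetime.date (hypothetical helper)."""
    year, rest = divmod(packed, 10000)
    month, day = divmod(rest, 100)
    return date(year, month, day)

# For arithmetic, unpack into a real date object first.
tomorrow = to_date(today) + timedelta(days=1)
print(tomorrow)   # 2016-01-01

# Right-justification: with short years, ints still compare correctly,
# whereas unpadded strings are compared character by character.
assert 990206 < 4110206        # year 99 comes before year 411
assert "990206" > "4110206"    # the string comparison gets it backwards
```

The last two assertions are the right-justification point: the string
form needs explicit zero padding to sort correctly, the integer form
does not.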

> I guess you can have 0 as Unix permissions, there might even be a 0 MAC
> address, but would you write them in decimal as 0000 (etc.) when all the
> other perms and addresses are written in oct or hex?
>
> addresses = [
>     0x0014a466fba9,
>     0x0014a00b3fb1,
>     000000000000,
>     0x003744a9012a,
>     ]

Right, so those aren't ideal examples either, because they're not decimal.

> Postcodes, or zip codes, also should be written as strings, even if they
> happen to be all digits.

Hmm, maybe. I'm on the fence about that one. Of course, most of the
time, I advocate a single multi-line text field "Address", and let
people key them in free-form. No postcode field whatsoever.

> I'm still looking for an example of where somebody would write the int zero
> in decimal using more than one 0-digit. While I'm sure they are fascinating
> in and of themselves, examples of numbers written as strings, in hex or
> octal, non-zero numbers written without leading zeroes, or zero written
> with only a single digit don't interest me :-)

Frankly, I'm in broad agreement: using 00000000000 to represent 0
isn't particularly useful, given that 0001 is an error in Py3. But since
C-like languages (and Py2) use the leading zero to mean octal, and
mathematics ignores the leading zero, there's no way to avoid
confusing people other than by having an instant error. There's
probably code out there that uses 000 to mean 0, but personally, I
wouldn't be against deprecating it.
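
For anyone who wants to check the current Py3 behaviour, a small
sketch follows. The helper name parses is made up for this example; it
just asks the compiler whether a piece of source is a valid expression:

```python
def parses(source):
    """Return True if `source` compiles as a Python 3 expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

print(parses("0"))       # True
print(parses("000"))     # True: zero may be written with extra zeros
print(parses("0001"))    # False: leading zero on a non-zero decimal literal
print(parses("0009"))    # False
print(parses("0o11"))    # True: explicit octal prefix
print(int("0009"))       # 9: converting a string is a separate matter
```

Note that int() on a string happily accepts leading zeros; the
restriction applies only to literals in source code.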

One thing that's really REALLY annoying is running into something that
uses virtually the same syntax to mean something almost, but not
entirely, identical... and completely incompatible. If Py3 allowed
0009 to mean 9, we would have nightmares all over the place, even
without Py2/Py3 conversion. Unadorned octal still shows up in one
place in Py3, and that's string escapes:

>>> "\33"
'\x1b'
>>> b"\33"
b'\x1b'

I hope this *never* gets changed to decimal or hex. If it's considered
a problem, the only solution is to excise it altogether. Please do NOT
do what BIND9 did, and have "\033" mean 33 decimal... it bugged me no
end when I tried to move some binary around between DNS and other
systems...
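
To spell out the Python side of that, here is a short sketch showing
that the digits after the backslash are read as octal, so "\33" and
"\033" are both character 27 rather than 33:

```python
# One to three octal digits after the backslash give the code point.
esc = "\33"
assert esc == "\x1b" == "\033" == chr(0o33)
print(ord(esc))        # 27, not 33

# The same holds for bytes literals.
assert b"\33" == b"\x1b" == bytes([27])

# Reading the digits as decimal would give a different character.
assert chr(33) == "!"
assert esc != chr(33)
```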

ChrisA


