Numeric literals in other than base 10 - was Annoying octal notation

James Harris james.harris.1 at googlemail.com
Tue Aug 25 17:26:05 EDT 2009


On 24 Aug, 03:49, Dennis Lee Bieber <wlfr... at ix.netcom.com> wrote:

...

> > Here's another suggested number literal format. First, keep the
> > familiar 0x and 0b of C and others and add 0t for octal. (T is the
> > third letter of octal as X is the third letter of hex.) The numbers
> > above would be
>
>         The thing is -- "x" and "hex" have similar pronunciations: (h)ecks;
> the name of the radix is its own reminder for the character to use
> without thinking of such conventions as "third letter of the radix name".
>
>         But "t" (tee) has no pronunciation resemblance to "oct" (o'kt)
> whereas the unlovely "o" at least if taken as a short vowel sound is
> similar to the "o" of "oct" given the short stop between it and the
> "ct".
>
> >   0b1011, 0t7621, 0xc26b
>
>         And "b" for binary breaks the "third letter of radix name"
> convention... You should be using "n" for that (and "c" for decimal <G>)

I wasn't proposing a convention of using the third character of the
base name. I was saying that t is not too unreasonable given that we
use x for hex (rather than h).
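
To make the prefix idea concrete, here is a rough Python sketch of how
0b, 0t and 0x literals could be read. The names PREFIX_BASES and
parse_prefixed are made up for illustration, and 0t is only the marker
suggested in this thread, not something Python itself accepts.

```python
# Hypothetical prefix table. "t" for octal is the suggestion from this
# thread; Python's own octal prefix is 0o.
PREFIX_BASES = {"b": 2, "t": 8, "x": 16}


def parse_prefixed(literal):
    """Parse literals such as 0b1011, 0t7621 or 0xc26b into an int."""
    text = literal.lower()
    # A radix marker is recognised only directly after a leading zero.
    if len(text) > 2 and text[0] == "0" and text[1] in PREFIX_BASES:
        return int(text[2:], PREFIX_BASES[text[1]])
    # No radix marker: treat the text as plain decimal.
    return int(text, 10)


print(parse_prefixed("0b1011"))  # 11
print(parse_prefixed("0t7621"))  # 3985
print(parse_prefixed("0xc26b"))  # 49771
```

This leans on int() for the digit checking, so it is more permissive
than a real lexer would be.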

>
>         Or we use b, o, d, h (as the HP calculator) and drop the "x"
> version.
>
>
>
> > where the three characters "0.(" begin the sequence.
>
>         Ugly...
>
> > Comments? Improvements?
>
>         Retaining whichever character is finally decided, I'd make all
> radix specified literals follow a quoted format:
>
>         "digits"radix
>
>         "01110"b
>         "123"d (of course, just 123 would be allowed for simplicity)
>         "7C"x
>         "327"o

The quoting is good. For hex, decimal, octal and binary, however, I
can't see a good reason to change away from the conventional prefix
form. And, in general, it's easier for a human to parse if the base is
specified first.

>
>         Probably wouldn't need that much change to the parser as it would,
> left to right, see a string, and then when the string is not followed by
> one white space character, find a radix marker -- the parser would then
> parse the string using the specified radix, and emit the appropriate
> numeric value instead of a string value.

Maybe. I think, though, that having the base as prefix would make the
parser's job easier as well as the job of humans. It's easier if we
know what we are parsing before we parse it rather than afterwards.

>        It only adds one extra
> character (instead of leading 0r, one has ""r), and the leading " serves
> as a warning to a reader that this is not directly readable as a number.
>
>         The alternative syntax of radix"digits" is the same length, but adds
> to the parsing as it initially looks like a name entity, then hits the
> quote, and has to back up to interpret the result as a radix marker.

True. The beginning of a number should be numeric. Using your scheme,
though, instead of radix"digits" you could have 0radix"digits".

> 0r format starts as a number, hits a radix marker while the
> "conversion/accumulator" is still a 0 value (0 is 0 in all radix) and
> switches the converter to accept the digits in the specified radix.

Sounds like you are suggesting 0radix"digits" but I'm not sure.
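
For what it's worth, the single-pass behaviour described for the
unquoted 0r form could be sketched like this in Python. It is only an
illustration: the marker letters in RADIX_MARKERS and the function name
scan_number are assumptions, and a real lexer would be working on a
character stream rather than a complete string.

```python
import string

# Assumed marker letters; "t" for octal is the suggestion from this thread.
RADIX_MARKERS = {"b": 2, "t": 8, "x": 16}
DIGITS = string.digits + string.ascii_lowercase


def scan_number(text):
    """Scan a literal left to right in one pass, with no backing up."""
    base = 10
    value = 0
    seen_digit = False
    for pos, ch in enumerate(text.lower()):
        # A marker is legal only straight after a single leading 0, while
        # the accumulator is still 0 (and 0 is 0 in every radix).
        if pos == 1 and text[0] == "0" and ch in RADIX_MARKERS:
            base = RADIX_MARKERS[ch]
            seen_digit = False  # digits in the new radix must follow
            continue
        digit = DIGITS.find(ch)
        if digit < 0 or digit >= base:
            raise ValueError(f"bad digit {ch!r} for base {base} in {text!r}")
        value = value * base + digit
        seen_digit = True
    if not seen_digit:
        raise ValueError(f"no digits in {text!r}")
    return value
```

The scanner never has to discard work: by the time it meets the marker
it has accumulated nothing but a zero, so it just changes base and
carries on.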

>
>         Whereas all prefix versions that don't start with a 0r tend to
> require more complex parsing:
>
>         0.(
>
> starts out looking like a number (the 0)... a floating point number (the
> .)... a function/method call on a floating point 0... WAIT? floating
> point numbers aren't callables (yet! I'm sure someone is going to show a
> way to define a variable bound to a number as a callable, though not to
> a literal number)... throw away this parse tree branch, back up and
> reparse as special numeric radix prefix...

You've laid it on thick but I agree in principle. What about
radix"digits" where radix is numeric, so 2"1101" or 3"122"? (Not to
replace 0b1101 etc but to supplement it for arbitrary bases.)
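
As a rough illustration of the numeric-radix idea, the Python sketch
below reads literals of the form radix"digits". The function name
parse_radix_quoted is made up, and the limit of base 36 is an
assumption that comes from Python's int(), not from anything decided
in this thread.

```python
import re

# Hypothetical syntax: a decimal radix immediately followed by quoted digits.
_LITERAL = re.compile(r'([0-9]+)"([0-9a-zA-Z]+)"')


def parse_radix_quoted(literal):
    """Parse literals such as 2"1101" or 3"122" into an int."""
    match = _LITERAL.fullmatch(literal)
    if match is None:
        raise ValueError(f"not a radix-quoted literal: {literal!r}")
    base = int(match.group(1))
    # int() only handles bases 2..36 (digits 0-9 then letters a-z).
    if not 2 <= base <= 36:
        raise ValueError(f"unsupported radix {base}")
    # int() raises ValueError if a digit is out of range for the base.
    return int(match.group(2), base)


print(parse_radix_quoted('2"1101"'))  # 13
print(parse_radix_quoted('3"122"'))   # 17
```

Because the literal begins with a decimal digit, the parser knows it is
reading a number from the first character and meets the radix before
any of the digits it governs.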

>
>         Of course, one still has to consider what will be used for \
> character encoding... \x0F vs \013 vs \b0001111?

The plans I had did not allow for the suggestions above so I have no
comments on character encoding yet but it's good that you mentioned
it.

James


