Literal concatenation, strings vs. numbers (was: Numeric literals in other than base 10 - was Annoying octal notation)

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Mon Aug 24 11:36:24 EDT 2009


On Mon, 24 Aug 2009 12:45:25 +1000, Ben Finney wrote:

> greg <greg at cosc.canterbury.ac.nz> writes:
> 
>> J. Cliff Dyer wrote:
>>
>> > What happens if you use a literal like 0x10f 304?
>>
>> To me the obvious thing to do is concatenate them textually and then
>> treat the whole thing as a single numeric literal. Anything else
>> wouldn't be sane, IMO.

Agreed. It's the only sane way to deal with concatenating numeric 
literals. It makes it simple and easy to understand: remove the 
whitespace from inside the literal, and parse as normal.

123 4567 => 1234567  # legal
0xff 123 => 0xff123  # legal
123 0xff => 1230xff  # illegal

The first two examples would be legal; the last would raise a syntax 
error, because 1230xff is not a valid literal. This would also work for floats:

1.23 4e5 => 1.234e5  # legal
1.23 4.5 => 1.234.5  # illegal
1e23 4e5 => 1e234e5  # illegal
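
The rule above (remove the whitespace, then parse as normal) can be sketched in a few lines of Python 3. This is only an illustration, not a proposed implementation: concat_numeric is a hypothetical helper name, and it leans on the built-in int() and float() as a stand-in for the real tokenizer, so it accepts a few spellings (such as "nan") that are not Python literals.

```python
def concat_numeric(text):
    """Hypothetical helper: join whitespace-separated pieces, then parse once."""
    literal = "".join(text.split())  # remove all whitespace inside the literal
    try:
        return int(literal, 0)       # base 0 honours the 0x / 0o / 0b prefixes
    except ValueError:
        pass
    return float(literal)            # raises ValueError if not a float either


for source in ["123 4567", "0xff 123", "1.23 4e5",
               "123 0xff", "1.23 4.5", "1e23 4e5"]:
    try:
        print(source, "=>", concat_numeric(source))
    except ValueError:
        print(source, "=> illegal")
```

The pieces are never parsed individually; only the joined text is, which is why a base prefix in the second piece cannot work.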



> Yet, as was pointed out, that behaviour would be inconsistent with the
> concatenation of string literals::
> 
>     >>> "abc" r'def' u"ghi" 'jkl'
>     u'abcdefghijkl'

Unicode/byte conversion is obviously a special case, and arguably should 
have been prohibited, although "practicality beats purity" suggests that 
a single unicode string in the sequence should make the lot unicode. 
(What else could it mean?)

In any case, numeric concatenation and string concatenation are very 
different beasts. With strings, you have to interpret each piece as 
either bytes or characters, you have to treat escapes specially, you have 
to deal with matching delimiters. For numeric concatenation, none of 
those complications is relevant: there is no equivalent to the byte/
character dichotomy, there are no escape sequences, there are no 
delimiters.
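
As a small illustration of why string pieces have to be interpreted one at a time, the following Python 3 sketch evaluates two adjacent literals, one normal and one raw. It assumes only the standard ast module; the sample source text is made up for the example.

```python
import ast

# Source text containing two adjacent string literals: one normal, one raw.
source = r'''"a\tb" r"a\tb"'''

value = ast.literal_eval(source)

# The first piece has its escape processed (a real tab); the raw piece keeps
# the backslash and the letter t as two separate characters.
print(repr(value))
print(len(value))
```

Each piece is decoded by its own rules before the results are joined, so simply deleting the whitespace between the pieces would not give the same answer.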

Numeric literals are much simpler than string literals, so the 
concatenation rule can be correspondingly simpler too. There's no need to 
complicate it by allowing mixed bases: you can't have mixed bases in a 
single numeric literal without spaces, so why would you expect to have mixed 
bases in one with spaces?




-- 
Steven


