Why can't I xor strings?

Sun Oct 10 18:24:57 EDT 2004

On Sun, 10 Oct 2004 22:03:08 +0000, Bengt Richter wrote:
> What's right about accepting 3^7 ? Why should xor be defined for integers?
> IMO we have implicit subtyping of integers as vectors or column matrices
> of bools and operations element by element and implicit re-presentation
> of the result as integer.

I'm intrigued but torn by your arguments.

For positive numbers, a number really is its vector of bits. In the
mathematical sense of "equal" (which I usually express in English as "two
equal things are fully substitutable with each other in all relevant
contexts"), where all base numbers are written in base 10:

10 base 10 = 11 base 9 = 22 base 4 = 1010 base 2

The fact that it happens to really be a bit vector in the computer in this
case actually doesn't count for anything, because just as that bit vector
*is*, in every conceivable mathematical way, a base 10 number, it is also
a base 3 number. I can and have defined ternary math before, and
experimental computers have been built on ternary bases and ternary logic,
and 44 in a binary computer is 44 in a ternary computer.

Only our choice of negative numbers, practically speaking, makes a
difference between the number-as-bitstring and number-qua-number, and
something as high level as Python can conceptually use an extra "negative"
bit so even that distinction goes away. In fact, playing with xor'ing
longs makes me wonder if Python doesn't *already* do that.

>     What is boolvec(388488839405842L) ^ boolvec("hello")?

I'd rather see this as the fairly meaningless "388488839405842L base 2",
meaningless because the base of a number does not affect its value, and
bitwise-xor, as the name implies, operates on the number base two.

Since I reject the need to cast ^ in terms of boolvec, I don't feel
compelled to try to define "boolvec" for strings. Cast it into a number
explicitly, refusing the temptation to guess.

> Except where there is an accepted legacy of such things being done already ;-)
> 
> BTW, what is the rationale behind this:
> 
>  >>> ['',(),[],0, 0.0, 0L].count(0)
>  3
>  >>> ['',(),[],0, 0.0, 0L].count(())
>  1
>  >>> ['',(),[],0, 0.0, 0L].count([])
>  1
>  >>> ['',(),[],0, 0.0, 0L].count('')
>  1
>  >>> ['',(),[],0, 0.0, 0L].count(0.0)
>  3
>  >>> ['',(),[],0, 0.0, 0L].count(0L)
>  3

Again, considered abstractly as numbers, 0 is zero no matter how you slice
it. Abstractly, even floats have a binary representation, just as they
have a decimal representation, and we could even xor them. Realistically,
it is much less useful and there is the infinite-length problem of floats
to deal with.

Theoretically, no float should ever equal an int or a long, since an Int
or Long conceptually identifies exactly one point and a float should be
seen as representing a range of numbers depending on the precision that
can be made arbitrarily small but never truly identifies one point. This
is another one of those "practicality beats purity" things.