python 2.7 and unicode (one more time)

Marko Rauhamaa marko at pacujo.net
Mon Nov 24 02:40:51 EST 2014


Chris Angelico <rosuav at gmail.com>:

> Py3's byte strings are still strings, though.

Hm. I don't think so. In a plain English sense, maybe, but that kind of
usage can lead to confusion.

For example,

   A subscription selects an item of a sequence (string, tuple or list)
   or mapping (dictionary) object:

   subscription ::=  primary "[" expression_list "]"

   [...]

   A string’s items are characters. A character is not a separate data
   type but a string of exactly one character.

   <URL: https://docs.python.org/3/reference/expressions.html#subscriptions>


The text is probably a bit buggy since it glosses over bytes and
bytearray objects, which the data model lists as sequences (<URL:
https://docs.python.org/3/reference/datamodel.html>). However, a
Python 3 implementation would be wrong if it treated bytes objects as
strings in the sense of the above paragraph:

   >>> "abc"[1]
   'b'
   >>> b'abc'[1]
   98

The subscription of a *string* evaluates to a *string*. The subscription
of a *bytes* object evaluates to a *number*.
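
The same distinction can be checked in a script. This is only a minimal
sketch, assuming Python 3; the variable names are illustrative and not
taken from the documentation quoted above. It also shows that slicing,
unlike subscription with a single index, keeps the bytes type.

```python
# Assumes Python 3. Under Python 2, b"abc" is a str and data[1] is 'b'.
text = "abc"
data = b"abc"
mutable_data = bytearray(data)

text_item = text[1]              # a str of length one
data_item = data[1]              # an int: the value of the byte
data_slice = data[1:2]           # a bytes object of length one
mutable_item = mutable_data[1]   # bytearray behaves like bytes here

for label, value in [
    ("str index", text_item),
    ("bytes index", data_item),
    ("bytes slice", data_slice),
    ("bytearray index", mutable_item),
]:
    print(label, type(value).__name__, repr(value))
```

If a one-byte bytes object is what you want, a slice such as
data[1:2] gets you there; data[1] never will.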


Marko


