[Tutor] SQLite3 DB Field Alphabetizing

Steven D'Aprano steve at pearwood.info
Wed Oct 13 15:27:36 CEST 2010


On Thu, 14 Oct 2010 12:13:50 am David Hutto wrote:
> I see it now. I knew that the u outside ' ' is the coding for the
> string, but I thought I had to strip it before using it since that
> was how it showed up. The bug of course would be that graphs that
> start with u would go to the second letter, but the u would still be
> used in alphabetization, because the alphabetizing is prior to
> stripping.

No, the u is not part of the string, it is part of the *syntax* for the 
string, just like the quotation marks. The first character of "abc" is 
a and not ", and the first character of u"abc" is also a and not u.

Another way to think of it... the u" " of Unicode strings is just a 
delimiter, just like the [ ] of lists or { } of dicts -- it's not part 
of the string/list/dict.
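
A quick check makes this concrete. This is only a sketch: the names 
and sample values (s, titles) are made up for illustration, and it 
assumes Python 3.3 or later, where the u prefix is accepted. The values 
should be the same under Python 2, although print formats them 
differently there.

```python
# The u prefix is syntax, not data (hypothetical example values).
s = u"abc"

first = s[0]      # the first character is a, not u
length = len(s)   # three characters: a, b, c

# Sorting compares the actual characters, so the prefix has no
# effect on the order, and nothing needs stripping beforehand.
titles = [u"banana", u"umbrella", u"apple"]
ordered = sorted(titles)

print(first, length)
print(ordered)
```

Only "umbrella" sorts under u, because it is the only value whose 
first character really is a u.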

In Python 3 this confusion is lessened. Unicode strings (characters) are 
written using just " as the delimiter (or ' if you prefer), instead of 
u" " as used by Python 2. Byte strings are written using b" " instead. 
This makes the common case (text strings) simple, and the uncommon case 
(byte strings) slightly more verbose.
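
As a rough illustration of the difference, assuming Python 3 (the 
variable names are hypothetical):

```python
# Python 3 only: text and bytes are distinct types.
text = "abc"     # text (Unicode) string, no prefix needed
data = b"abc"    # byte string, needs the b prefix

text_first = text[0]   # a one-character str
data_first = data[0]   # an int: the value of the first byte

# A str and a bytes object do not compare equal, even when they
# look alike; decode the bytes to get text back.
same = (text == data)
round_trip = data.decode("ascii")

print(type(text).__name__, type(data).__name__)
print(repr(text_first), repr(data_first), same, round_trip == text)
```

As with u" " in Python 2, the b is part of the syntax and not part of 
the data: len(data) is 3.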



-- 
Steven D'Aprano

