Article on the future of Python

Steven D'Aprano steve+comp.lang.python at pearwood.info
Wed Sep 26 03:23:47 EDT 2012


On Tue, 25 Sep 2012 23:35:39 -0700, wxjmfauth wrote:

> Py 3.3 succeeded to somehow kill unicode and it has been transformed
> into an "American" product for "American" users.

For the first time in Python's history, every build of Python, including 
the "narrow" builds that used to store strings as UTF-16, handles strings 
containing Supplementary Multilingual Plane characters correctly, and it 
does so without doubling or quadrupling the amount of memory every single 
string takes up.
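
To make that concrete, here is a minimal sketch, assuming Python 3.3 or 
later. The variable names are mine, chosen only for illustration. A 
character outside the BMP counts as one code point, where a narrow build 
of an earlier version would have exposed two surrogates:

```python
# U+1F40D SNAKE lies in the Supplementary Multilingual Plane (plane 1).
snake = "\U0001F40D"

# One code point, so the length is 1 (a narrow build of 3.2 reported 2).
length = len(snake)

# Indexing yields the whole character, not half of a surrogate pair.
code_point = ord(snake[0])

# Reversing by slicing keeps the character intact.
reversed_text = ("a" + snake + "b")[::-1]

print(length, hex(code_point), reversed_text == "b" + snake + "a")
```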

Strings are ubiquitous in Python -- every module, every variable, every 
function, every class is associated with at least one and often many 
strings, and they are nearly all ASCII strings. The overhead of storing 
every one of them at four bytes per character instead of one is 
considerable.

Python finally has correct Unicode handling for characters beyond the BMP, 
and it does so with more efficient strings that potentially use as little 
as one quarter of the memory that they otherwise would use, at the cost 
of a small slowdown in the artificial and unrealistic case where you 
repeatedly create millions of strings and then just throw them away 
immediately. Most realistic cases of string handling are essentially 
unchanged in speed, either trivially faster or trivially slower. The real 
saving is in memory.
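
The memory claim can be checked roughly with sys.getsizeof. This is only 
a sketch and it assumes CPython 3.3 or later, since the sizes reported 
are an implementation detail. The helper bytes_per_char and the sample 
characters are my own, not part of any library. Differencing two strings 
of the same kind cancels the fixed per-object header and leaves the 
storage per character:

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate per-character storage for strings made only of `ch`."""
    # Both strings use the same internal representation, so the fixed
    # header cancels out and the difference is n characters' worth.
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

# One sample character from each range that CPython stores differently.
samples = {
    "ASCII": "a",             # U+0061
    "Latin-1": "\xe9",        # U+00E9
    "BMP": "\u20ac",          # U+20AC
    "astral": "\U0001F40D",   # U+1F40D, outside the BMP
}

widths = {name: bytes_per_char(ch) for name, ch in samples.items()}
print(widths)
```

On such a build I would expect 1, 1, 2 and 4 bytes per character 
respectively: a string pays for the widest character it contains, so 
ASCII-only strings cost a quarter of what a four-byte representation 
would.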

According to wxjmfauth, this has "killed" Unicode. Judge his credibility 
for yourself. As best I can determine, he believes this because Americans 
aren't made to suffer for using mostly ASCII strings.



-- 
Steven


