[Python-Dev] Proposal: from __future__ import unicode_string_literals

Eric Smith eric+python-dev at trueblade.com
Fri Mar 21 11:57:07 CET 2008


Eric Smith wrote:
> This proposal is to add "from __future__ import 
> unicode_string_literals", which would make all string literals in the 
> importing module into unicode objects in 2.6.

I'm going to withdraw this, for two reasons:
1) The more I think about it, the less sense it makes.
2) Without some extreme measures, it's not implementable.

It's not implementable because the work has to occur in ast.c (see 
Py_UnicodeFlag).  It can't occur later, because you need to skip the 
encoding that parsestr() does.  But the __future__ import can only be 
interpreted after the AST is built, by which time the encoding has 
already been applied.  There are some radical things you could do to 
work around this, but they would amount to a gigantic change.
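
A rough illustration of that ordering, using the present-day Python 3 
ast module rather than the 2.6 C code (an assumption made only so the 
sketch is runnable; the sample source and variable names are made up): 
once a tree exists, the __future__ import is just one more node in it, 
and the string literal next to it is already a finished string object.

```python
import ast

# Hypothetical two-line module used only for this illustration.
source = (
    "from __future__ import print_function\n"
    "greeting = 'hello'\n"
)

# Parsing builds the whole tree in one step.
tree = ast.parse(source)

# The __future__ import can only be found by inspecting the finished tree ...
future_imports = [
    node for node in ast.walk(tree)
    if isinstance(node, ast.ImportFrom) and node.module == "__future__"
]

# ... in which every string literal is already a real string object.
string_values = [
    node.value for node in ast.walk(tree)
    if isinstance(node, ast.Constant) and isinstance(node.value, str)
]

print([alias.name for alias in future_imports[0].names])
print(string_values)
```

Both lists come from the same finished tree, so anything keyed on the 
import is seeing literals that have already been processed.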

As for it not making sense, this is really in the realm of 2to3.  I'm 
beginning to really believe this statement in PEP 3000:

"There is no requirement that Python 2.6 code will run unmodified on 
Python 3.0. Not even a subset. (Of course there will be a tiny subset, 
but it will be missing major functionality.)"

For this particular issue, just use u'' in 2.6 and let 2to3 deal with 
it.  If you have some 2.6 code that you want to run in 3.0 (by way of 
2to3), I think all of your string literals should be either b'' or u''.  
Don't use plain ''.
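
A minimal sketch of that convention (the variable names and sample 
values are illustrative, not from any real code base).  Every literal 
is marked as either text or bytes, so 2to3 has nothing to guess.  Note 
that 3.0 itself does not accept the u'' prefix, which is why the code 
goes through 2to3; the snippet as written runs unchanged on 2.6 and on 
3.3 or later, where the prefix is accepted again.

```python
# Text: a unicode object in 2.6, a str after 2to3 (or on 3.3+).
text = u'caf\xe9'

# Binary data: the UTF-8 encoding of the same text.
# b'' is a plain str in 2.6 and a bytes object in 3.x.
data = b'caf\xc3\xa9'

# The two are related only through an explicit encode/decode step.
assert data.decode('utf-8') == text
assert text.encode('utf-8') == data
```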

Eric.


