Simpler transition to PEP 3000 "Unicode only strings"?

Tue Sep 20 05:21:59 EDT 2005

Hi all, 

My question is: How do you deal with mixing
Unicode and non-Unicode parts of your application?

Context: 
========

PEP 3000 says 
"Make all strings be Unicode, and have a separate bytes() type."

Until then, I am forced to write 
  # -*- coding: cp123456 -*- 
(see 2.1.4 Encoding declarations) and use...
  myString = u'text with funny letters'

This leads to source pollution that will be
difficult to remove later.

The idea:
=========

What do you think about the following proposal
that goes halfway?

  If the Python source file is stored in UTF-8 (or
  other recognised Unicode file format), then the
  encoding declaration must reflect the format or
  can be omitted entirely. In such a case, all
  simple string literals will be treated as
  unicode string literals.
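
To make the difference concrete, here is a minimal sketch of what I
understand the change would mean for one literal. The names
unicode_text and byte_text are only illustrative, and I write the
non-ASCII letters as escapes so the example does not depend on how
this message is encoded. It assumes that a plain literal in a UTF-8
source file currently holds the raw UTF-8 bytes.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch only; the variable names are hypothetical.

# What I have to write today to get a Unicode string
# (the Czech word with three accented letters, given as escapes).
unicode_text = u'p\u0159\u00edli\u0161'

# Assumption: this is what the same word holds today when written as a
# plain (non-u) literal in a UTF-8 source file, i.e. the encoded bytes.
byte_text = unicode_text.encode('utf-8')

print(len(unicode_text))  # 6 characters
print(len(byte_text))     # 9 bytes, each accented letter takes two

# Round trip: decoding the bytes gives back the Unicode string.
print(byte_text.decode('utf-8') == unicode_text)  # True
```

Under the proposal, the plain literal would behave like unicode_text
(length 6) instead of byte_text (length 9), so code that relies on the
byte length or byte content of such literals is where I would expect
trouble.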

Would this break any existing code?

Thanks for your time and experience,
  pepr

-- 
Petr Prikryl (prikrylp at skil dot cz)