[Python-Dev] Python-3.0, unicode, and os.environ

Adam Olsen rhamph at gmail.com
Wed Dec 10 19:31:45 CET 2008


On Wed, Dec 10, 2008 at 3:39 AM, Ulrich Eckhardt
<eckhardt at satorlaser.com> wrote:
> On Tuesday 09 December 2008, Adam Olsen wrote:
>> The only thing separating this from a bikeshed discussion is that a
>> bikeshed has many equally good solutions, while we have no good
>> solutions.  Instead we're trying to find the least-bad one.  The
>> unicode/bytes separation is pretty close to that.  Adding a warning
>> gets even closer.  Adding magic makes it worse.
>
> Well, I see two cases:
> 1. Converting from an uncertain representation to a known one.
> 2. Converting from a known representation to a known one.

Not quite:
1. Using a garbage file name locally (within a single process, not
talking to any libs)
2. Using a unicode filename everywhere (libs, saved to config files,
displayed to the user, etc.)

Note that if you have a GUI doing the former, all you technically need
is a placeholder like "<undecodable filename>".  You might try to
extract some ASCII out of it, but that's just a minor bonus.
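
A minimal sketch of that placeholder approach, assuming the name arrives
as bytes and that UTF-8 is the encoding you expect. display_name is a
hypothetical helper for illustration, not an existing API:

```python
def display_name(raw_name, encoding="utf-8"):
    """Return text that is safe to show to a user for a bytes filename."""
    try:
        # Strict decoding: any invalid byte sequence raises UnicodeDecodeError.
        return raw_name.decode(encoding)
    except UnicodeDecodeError:
        # Only the label shown to the user changes; the caller keeps the
        # original bytes for any actual file operations.
        return "<undecodable filename>"
```

The original bytes stay the real handle on the file. The returned text is
for display only and should never be fed back into a file API.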

On Linux the bytes/unicode separation is perfect for this.  You decide
which approach you're using and use it consistently.  If you mess up
(mixing bytes and unicode) you'll consistently get an error.
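
To illustrate the "consistently get an error" part, here is a small
sketch using os.path.join. mixes_cleanly is a hypothetical helper, and
the paths are made up for the example:

```python
import os

def mixes_cleanly(directory, name):
    """Return True if both path components can be joined, False on a type clash."""
    try:
        os.path.join(directory, name)
    except TypeError:
        # Raised when one component is bytes and the other is str.
        return False
    return True

bytes_only = mixes_cleanly(b"/tmp", b"caf\xe9.txt")  # garbage name kept as bytes
text_only = mixes_cleanly("/tmp", "report.txt")      # unicode used everywhere
mixed = mixes_cleanly(b"/tmp", "report.txt")         # the two approaches mixed
```

Either approach works on its own; only the mixed call is rejected.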

We currently don't follow this model on Windows, so a garbage file
name gets passed around as if it were unicode, but fails when passed to
a lib, saved to a config file, displayed to a user, etc.  (Whether it
actually fails depends on the API, as many won't validate either.)


-- 
Adam Olsen, aka Rhamphoryncus

