Unicode and Zipfile problems

Thu Nov 6 01:19:40 EST 2003

Peter Otten <__peter__ at web.de> writes:

> I'm not aware if there has been a discussion before, but I think it would be
> worth the overhead if every string were aware of its encoding, so that
> together with the -*- comment in the script you'd never again have to
> explicitly go through the decode/encode routine *and* could avoid those
> funny filenames - or two-character umlauts that accidentally made it into
> your ISO-8859-1 files.

Bill Janssen first suggested that before Unicode was introduced. I
believe it won't help much, as, in many cases, Python can't know what
encoding a byte string is in, e.g. if you read it from a file or a socket.
In some cases, you have binary data proper.
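
To illustrate, here is a minimal sketch. It assumes present-day Python 3
syntax (bytes literals and bytes.decode), and the variable names are made
up for the example. The same two bytes decode without error under two
different encodings, so the bytes alone cannot say which one was meant:

```python
# Two bytes as they might arrive from a file or a socket.
# (Illustrative data; no encoding label travels with the bytes.)
raw = b"\xc3\xbc"

# Both decodes succeed, with different results.
as_utf8 = raw.decode("utf-8")          # one character, U+00FC (u-umlaut)
as_latin1 = raw.decode("iso-8859-1")   # two characters, U+00C3 U+00BC

# Binary data proper need not be valid text in a given encoding at all.
binary = b"\xff\xfe\x00\x80"
try:
    binary.decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    decodable = False
```

The ISO-8859-1 result is the kind of "two-character umlaut" mentioned
above: UTF-8 bytes interpreted as if they were ISO-8859-1.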

Regards,
Martin