[Python-Dev] Proposed Python 3.0 schedule

Hrvoje Nikšić hrvoje.niksic at avl.com
Tue Oct 7 13:01:24 CEST 2008


On Tue, 2008-10-07 at 11:30 +0200, Victor Stinner wrote:
> >   - I'd think "find . -type f -print0 | xargs -0 python -c 'pass'"
> > ought to work (with files with "bad" bytes being returned by find),
> 
> First, fix your home directory :-) There are good tools (convmv?) to fix 
> invalid filenames.

Fixing the home directory doesn't help in the long run because files
with non-UTF-8 file names on a nominally UTF-8 system are not that
exceptional; they crop up all over the place in non-ASCII countries.
One can obtain them simply by copying stuff from a DVD someone else
burned, by downloading a Japanese-released torrent, or by copying files
from a shared hard drive.

> > which means that Python shouldn't blow up and refuse to start when
> > there's a non-properly-encoding argv ("Could not convert argument 1 to
> > string" and exiting isn't appropriate behavior)
> 
> Why not? It's a good idea to break compatibility to refuse invalid byte 
> sequences. You can still use the command line, an input file or a GUI to 
> read raw byte sequences.

Maybe I am misunderstanding you, but if Python blows up at startup when
unable to decode argv to Unicode, then how can you still use the command
line to access the actual file name?
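
To make the failure mode concrete, here is a minimal sketch of what
strict decoding of the argument vector amounts to. It is an
illustration only, not the interpreter's actual startup code: the
function name decode_args_strict, the sample file name, and the
assumption that the locale encoding is UTF-8 are all mine.

```python
def decode_args_strict(raw_args, encoding="utf-8"):
    """Decode each raw (bytes) argument; give up on the first invalid one.

    Hypothetical helper that mimics the "refuse to start" behaviour
    under discussion. It is not taken from CPython.
    """
    decoded = []
    for index, raw in enumerate(raw_args):
        try:
            decoded.append(raw.decode(encoding))
        except UnicodeDecodeError:
            # Same wording as the error quoted earlier in the thread.
            raise SystemExit(
                "Could not convert argument {} to string".format(index)
            )
    return decoded


# The same file name, once encoded as UTF-8 and once as Latin-1.
name = "r\u00e9sum\u00e9.txt"
good_argv = [b"script.py", name.encode("utf-8")]
bad_argv = [b"script.py", name.encode("latin-1")]

print(decode_args_strict(good_argv))

try:
    decode_args_strict(bad_argv)
except SystemExit as exc:
    # The program never reaches user code, so it has no chance to
    # look at the raw bytes of the offending argument.
    print("startup refused:", exc)
```

The point of the sketch is that the refusal happens before any user
code runs, so the script cannot fall back to the raw bytes even if it
wanted to.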


