[Python-Dev] PEP 383 update: utf8b is now the error handler

Stephen J. Turnbull stephen at xemacs.org
Wed May 6 15:33:17 CEST 2009


"Martin v. Löwis" writes:

 > > Yeah, yeah, this is the same old same old from PEP 3131.  Anything
 > > that handles the various attacks based on ASCII-alike characters
 > > should at least rule out invalid Unicode, too!
 > > 
 > > And where is this U+DC2F supposed to be coming from, anyway?  The
 > > user's *local* environment or the user's *local* filesystem! 
 > 
 > Why is that not a threat? Suppose you have a setuid application, and
 > you pass some string on the command line that decodes to /../. Then
 > the setuid application will be tricked into modifying files it didn't
 > mean to modify.

Of course this is a threat, assuming that the application takes no
precautions.  But first, it should be stopped by any of several
standard precautions, for example applying os.path.realpath (come to
think of it, PEP 383 should say something about realpath, shouldn't
it?) and os.path.normpath (PEP 383 should definitely say something
about this function; maybe PEP 3131 should, too) before checking
access restrictions.  If you're not running your paths through those,
you're already vulnerable to symlink attacks, and maybe other forms of
spoofing.
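
A minimal sketch of that precaution, not something taken from the PEP:
canonicalize first, then compare.  The helper name is_inside and its
arguments are made up for illustration, and the check is still open
to races between the test and the actual file access.

```python
import os

def is_inside(base, candidate):
    """Return True if candidate resolves to base or to a path below it."""
    # realpath resolves symlinks and collapses "." and ".." components,
    # so the comparison sees the path that would actually be opened.
    real_base = os.path.realpath(base)
    real_candidate = os.path.realpath(os.path.join(base, candidate))
    # commonpath compares whole components, so "/srv/data-old" is not
    # mistaken for something under "/srv/data".
    return os.path.commonpath([real_base, real_candidate]) == real_base
```

With this, is_inside(base, "reports/a.txt") holds while
is_inside(base, "x/../../etc/passwd") does not, because the ".."
components are collapsed before the comparison is made.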

Second, it's a threat already enabled by your restricted version of
PEP 383.  Access control applies to subdirectories as well as to
parent directories.  Since you can insert arbitrary non-ASCII bytes
into the path using the current definition of 'utf8b', name-based
access restrictions can be bypassed in exactly the same way for any
directory whose name is not 100.00% ASCII, and the setuid application
will be tricked into modifying files it didn't mean to modify.
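
To make the aliasing concrete, here is a small sketch.  It assumes the
handler as it shipped in Python 3.1 and later under the name
'surrogateescape' (called 'utf8b' in this thread); the draft under
discussion may have differed in detail.  The names plain, sneaky and
to_fs_bytes are illustrative only.

```python
# 'surrogateescape' is the released name of the handler this thread
# calls 'utf8b' (assumption: Python 3.1 or later).
HANDLER = "surrogateescape"

plain = "caf\u00e9"          # the name an access check might list
sneaky = "caf\udcc3\udca9"   # lone surrogates standing in for bytes C3 A9

def to_fs_bytes(name):
    """Encode a str the way it would be handed to a UTF-8 file system."""
    return name.encode("utf-8", HANDLER)

def escaped_slash_rejected():
    """Check that an escaped ASCII byte such as U+DC2F cannot be encoded."""
    try:
        "\udc2f".encode("utf-8", HANDLER)
    except UnicodeEncodeError:
        return True
    return False
```

The two strings compare unequal, yet to_fs_bytes maps both to the same
bytes, so a check done on the str value alone can be sidestepped for
any name containing a non-ASCII character.  The restriction to bytes
above 127 only closes the U+DC2F case.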

Also, on Mac OS X, system directories, including those containing
system libraries, frameworks, and executables, may be accessible via
locale-specific names.  (I don't have a Japanese-localized Mac at hand
to check, but I'm pretty sure that on my old Mac the Japanese names
appeared in ls in Terminal.app.)  Those can be spoofed in exactly the
same way.

 > Nothing is lost at the moment.

Nothing is lost compared to 'strict', true, but under the PEP as it
stands, a large fraction of Shift JIS and Big5 filenames cannot be
read under ASCII-compatible file system encodings using 'utf8b'.  Yet
it is those users who are placed at risk by PEP 383.

 > In any case, Python 3.1b1 may get released today, so it's way too late
 > for new features in the PEP. They can wait for Python 3.2.

You have convinced me that the PEP should wait as well.

In its current form it is incomplete and dangerous.


