[Python-Dev] Relaxing Unicode error handling

Martin v. Loewis martin at v.loewis.de
Tue Jan 6 16:49:01 EST 2004


M.-A. Lemburg wrote:

> So you are only talking about the case where the application
> uses the standard default encoding (ASCII) and does not
> make use of any other codecs ?

Yes, I'd like to change the error handling in the case of
an implicit conversion - in particular for the unicode-to-ascii
case, but also for the (non-)ascii to unicode case.
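
To make the unicode-to-ascii case concrete, here is a minimal sketch written as explicit conversions in present-day Python. It is an illustration under assumptions, not the proposed change itself. The helper names `encode_strict` and `encode_relaxed` are made up for this example. The strict variant mirrors what the implicit conversion does today (it raises), and the relaxed variant shows one possible lenient behaviour using the standard "replace" error handler.

```python
# Illustrative only: helper names are hypothetical, not part of any proposal.
text = "caf\u00e9"  # contains one non-ASCII character


def encode_strict(s):
    # Default error handling is "strict": a non-ASCII character raises
    # UnicodeEncodeError, which is what crashes the application.
    return s.encode("ascii")


def encode_relaxed(s):
    # "replace" substitutes "?" for each character that cannot be encoded,
    # so the conversion never raises.
    return s.encode("ascii", errors="replace")


try:
    encode_strict(text)
except UnicodeEncodeError as exc:
    print("strict conversion failed:", exc.reason)

print(encode_relaxed(text))
```

The trade-off is the usual one: the relaxed form loses data silently, which is why it can only ever be a work-around.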

Application authors *think* they got rid of all non-trivial
instances of such conversions, only to find out that their
customers can produce endless series of application crashes
by entering funny characters in all imaginable places.

>> Sure, but they would also not be the default codecs.
> 
> 
> Why not ? What about Asian users who set the default encoding
> to one of the encodings supported by e.g. the JapaneseCodecs
> package ?

If that was a proper Python feature, I'm sure the cjkcodecs
would support it instantly. However, perhaps people also take
our advice and avoid changing the default encoding (because that
*doesn't* work); so even in applications where cjkcodecs are
heavily used, I would hope that the system encoding remains
at us-ascii. But if it doesn't, cjkcodecs should also implement
the change I'm proposing. This is a red herring.

> Oh, sorry, that's the term I use for PyArg_ParseTuple() format
> arguments.

Ah, right: They should make use of the default error handling
as well.

> Given the scenario you mention above, wouldn't that also be
> possible by providing a customized codec for "ascii" under
> a new name "all-things-ascii" and then setting the default
> encoding to "all-things-ascii" ?

No. Changing the system default encoding is not possible for
applications - it is the system administrator that needs
to make this change (in site.py). I'm proposing a change that
applications can make at run-time.

> Just think of the issues this could cause in multi-user systems
> such as Zope that are not prepared for these changes:
> a script could easily change the settings to have
> the server execute code under different user ids if threads
> executing their requests generate codec errors (the errors
> parameter can be set to a callback now that we have the new
> logic in place...).

This is also a red herring. Zope can give very controlled
access to builtins, and could simply disallow scripts from changing
the setting - Zope applications would need to find a different
way.
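
For reference, the errors callback mentioned in the quoted paragraph is the mechanism exposed through `codecs.register_error`. A minimal sketch follows. The handler name "demo-question-marks" and the function `question_marks` are invented for illustration, and the explicit `encode` call stands in for the conversion under discussion.

```python
import codecs


def question_marks(exc):
    # Hypothetical handler: replace each unencodable character with "?"
    # and resume encoding right after the offending range.
    if isinstance(exc, UnicodeEncodeError):
        return ("?" * (exc.end - exc.start), exc.end)
    # Anything else (e.g. decode errors) is not handled here.
    raise exc


# Register the callback under a name usable as the errors= argument.
codecs.register_error("demo-question-marks", question_marks)

result = "na\u00efve caf\u00e9".encode("ascii", errors="demo-question-marks")
print(result)
```

Because the registry is process-wide, any code that can call `codecs.register_error` affects every thread, which is the concern raised above for multi-user servers.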

Remember, it is a work-around - so it clearly has limitations.
I'm proposing it anyway, and I'm fully aware of the limitations.

Regards,
Martin




