[Python-Dev] utf8 issue

Guido van Rossum guido@python.org
Thu, 05 Sep 2002 09:51:49 -0400


> > Please do.  Bumping MAGIC is a no-no between dot releases.  But I
> > don't understand why that is necessary?
> 
> It would be necessary since marshal uses UTF-8 for storing
> Unicode literals.

Do you mean that in 2.2 it doesn't?

> Even though it's highly unlikely that the problem cases are used in
> Python Unicode literals, there's a tiny chance. Without the MAGIC
> change this could result in PYC files failing to load.

Ha.  You may have missed the start of this thread, but the whole
problem was that a PYC file *did* fail to load!  (The .py file had a
lone surrogate in it.)  So I'm not sure this argument holds much
water.

Can someone please explain what change would be necessary, and to what
part of the code, to prevent a lone surrogate in a string literal from
producing a PYC file that blows up on load?

--Guido van Rossum (home page: http://www.python.org/~guido/)