[issue1272] Decode __file__ and co_filename to unicode using fs default

Christian Heimes report at bugs.python.org
Sun Oct 14 01:27:30 CEST 2007


Christian Heimes added the comment:

Guido van Rossum wrote:
> - You added a removal of hotshot from setup.py to the patch; but that's
> been checked in in the mean time.

Oh, that change shouldn't have made it into the patch. I guess I forgot
to run svn revert on setup.py.

> - Why add an 'errors' argument to the function when it's a fatal error
> to use it?

I wanted the signature of the function to match the other
PyUnicode_Decode* functions. I copied the FatalError from
_PyUnicode_AsDefaultEncodedString().

> - Using 0 to autodetect the length is scary.  Normally we have two APIs
> for that, one ..._FromString and one ...FromStringAndSize.  If you
> really don't want that, please use -1, which is at least an illegal value.

Oh right, -1 is *much* better for autodetect than 0. Which do you prefer:
a second function, or -1 for autodetect?
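
To make the two options concrete, here is a minimal sketch, written in
Python rather than C for brevity. The names c_strlen, decode_fs,
decode_fs_and_size and decode_fs_string are hypothetical and are not the
functions from the patch; "utf-8" stands in for the fs default encoding.

```python
def c_strlen(data: bytes) -> int:
    """Length up to the first NUL byte, mimicking C strlen()."""
    nul = data.find(b"\0")
    return len(data) if nul == -1 else nul


# Option A: a single function where size == -1 means "autodetect".
# -1 can never be a real length, whereas 0 is a valid (empty) length.
def decode_fs(data: bytes, size: int = -1, encoding: str = "utf-8") -> str:
    if size < -1:
        raise ValueError("size must be -1 (autodetect) or >= 0")
    if size == -1:
        size = c_strlen(data)
    return data[:size].decode(encoding)


# Option B: two functions, in the ...FromString / ...FromStringAndSize style.
def decode_fs_and_size(data: bytes, size: int, encoding: str = "utf-8") -> str:
    if size < 0:
        raise ValueError("size must be >= 0")
    return data[:size].decode(encoding)


def decode_fs_string(data: bytes, encoding: str = "utf-8") -> str:
    return decode_fs_and_size(data, c_strlen(data), encoding)
```

With 0 as the sentinel, decode_fs(data, 0) could not distinguish "empty
string" from "work out the length yourself"; with -1 or with two
functions that ambiguity goes away.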

> - Why is there code in codeobject.c::PyCode_New() that still accepts a
> PyString for the filename?

Because I overlooked it. My fault. :/

> - In that file (and possibly others, I didn't check) your code uses
> spaces to indent while the surrounding code uses tabs.  Moreover, your
> space indent seems to assume there are 4 spaces to a tab, but all our
> code (Python and C) is formatted assuming tabs are 8 spaces.  (The
> indent isn't always 8 spaces -- but ASCII TAB characters always are 8,
> for us.)

Some C files like unicodeobject.c use 4 spaces while other files use
tabs for indentation. My editor may have gotten confused by the mix.
I've fixed it manually in the patch, but I may have missed a line or two.
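
A small illustration of why the mix goes wrong, as a sketch in Python;
indent_width is a hypothetical helper, not part of any tool mentioned
here. It measures the visual indent of a line for a given tab width.

```python
def indent_width(line: str, tabsize: int = 8) -> int:
    """Visual width of the leading whitespace when tabs are `tabsize` wide."""
    expanded = line.expandtabs(tabsize)
    return len(expanded) - len(expanded.lstrip(" "))


tab_line = "\tx = 1;"       # indented with one ASCII TAB
space_line = "    x = 1;"   # indented with four spaces

# An editor set to 4-wide tabs shows both lines at the same depth...
same_at_4 = indent_width(tab_line, 4) == indent_width(space_line, 4)
# ...but with tabs at 8 columns the two lines no longer line up.
same_at_8 = indent_width(tab_line, 8) == indent_width(space_line, 8)
```

So a patch that looks aligned in an editor configured for 4-wide tabs
appears mis-indented to anyone who reads TAB as 8 columns.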

> - Why copy the default encoding before mangling it?  With a little extra
> care you will only have to copy it once.  Also, consider not mangling at
> all, but assuming the encoding comes in a canonical form -- several
> other functions assume that, e.g. PyUnicode_Decode() and
> PyUnicode_AsEncodedString().

My C is a bit rusty and I still need to learn new tricks. I'll try to
see if I can remove the extra copy without causing a problem.
Alexandre has already answered the other part of your question: the
aliases map is defined in Python code, so it's not available this early
in the bootstrapping process.
We'd have to completely redesign the assignment of co_filename and
__file__ if we wanted to use the aliases and other codecs. For example,
we could store a PyString at first and redo all the names once the
codecs are set up.
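
A rough sketch of that "store raw bytes first, redo later" idea, in
Python rather than C. CodeStub and redo_filenames are hypothetical names
used only for illustration; nothing like this exists in the patch. Only
codecs.lookup() is a real API, and it is exactly the piece that depends
on the Python-level aliases map.

```python
import codecs


class CodeStub:
    """Hypothetical stand-in for a code object; not CPython's real type."""

    def __init__(self, filename):
        self.co_filename = filename


def redo_filenames(code_objects, encoding):
    """Decode any filenames still stored as bytes, once codecs are usable."""
    # codecs.lookup() resolves aliases such as "UTF8" to a canonical name.
    # This is the step that cannot run early in the bootstrapping process.
    canonical = codecs.lookup(encoding).name
    for code in code_objects:
        if isinstance(code.co_filename, bytes):
            code.co_filename = code.co_filename.decode(canonical)
    return canonical


# Early in startup: filenames are kept as raw bytes.
stubs = [CodeStub(b"/tmp/caf\xc3\xa9.py"), CodeStub("/tmp/already.py")]
# Later, after the codecs machinery is set up:
canonical_name = redo_filenames(stubs, "UTF8")
```

The cost of this design is a second pass over every object created
before the codecs were ready, which is why it would mean a redesign
rather than a small change.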

Christian

__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue1272>
__________________________________

