[Python-Dev] PEP 383 (again)

Thu Apr 30 00:17:42 CEST 2009

On approximately 4/29/2009 1:06 PM, came the following characters from 
the keyboard of Martin v. Löwis:

> Thanks, fixed.

Thanks for your fixes.  They are helpful.

> I'm at a loss how to make the text more clear than it already is. I'm
> really not good at writing long essays, with a lot of
> explanatory-but-non-normative text. I also think that explanations do
> not belong in the section titled specification, nor does a full
> description of the status quo belong in the PEP at all. The reader
> should consult the current Python source code if in doubt about what
> the status quo is.

The status quo is what justifies the existence of the PEP.  If the 
status quo were perfect, there would be no need for the PEP.

The status quo should be described in the Rationale.  Some of it is. 
The rest of it should be.  If there is a need for this PEP for POSIX, 
but not Windows, the reason why should be given (Para 2 in Rationale 
seems to try to describe that, but doesn't go far enough), and also the 
reason that cross-platform code can install this PEP's error handler on 
both platforms, yet it won't affect bytes interfaces on Windows.  These 
are two omissions that have both caused large amounts of discussion.

Attempting to understand the Python source code is a good thing, but 
there is a lot to understand, and few will achieve a full understanding.

>> The 4th paragraph is now confusing too... would it not be the decode
>> error handler that returns the byte strings, in addition to the Unicode
>> strings?
> 
> No, why do you think so? That's intended as stated.

Here, a use case, or several, in the PEP could help clarify why it would 
be the encode error handler that would return both the byte string and 
the Unicode string, and why the decode error handler would not need to.

It seems that if the decode handler preserved the bytes from the OS, and 
made them available along with the decoded Unicode, that could be 
useful to an application that wants to manipulate the file.
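
For reference, a minimal sketch of the decode side. It assumes the handler 
name `surrogateescape`, under which this mechanism ships in released 
Python 3; that name and the sample file name are assumptions, not text 
taken from the PEP draft under discussion. It suggests that the decoded 
string may already carry the original bytes, because each undecodable byte 
becomes a lone surrogate that encoding can map back:

```python
# Hypothetical file name: Latin-1 bytes that are not valid UTF-8.
raw = b"caf\xe9.txt"

# Decoding maps the undecodable byte 0xE9 to the lone surrogate U+DCE9.
name = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler maps the surrogate back to byte 0xE9.
restored = name.encode("utf-8", errors="surrogateescape")

# Without the handler, the lone surrogate cannot be encoded at all.
try:
    name.encode("utf-8")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

If that reading is right, the application can get the OS bytes back from 
the Unicode string whenever it needs them, which may be why the decode 
handler has no need to return bytes separately.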

The encode handler is already given the Unicode, so it is not clear why 
it should also return it.  I guess if there is an error during the 
encode process (can there be?) then having both the bytes and the 
Unicode for comparison could be useful for error reporting.
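
One possible reading is that "return" refers to the handler's replacement 
value, which may be either a Unicode string (encoded again by the codec) 
or a byte string (used as is). The sketch below illustrates that reading 
with the standard `codecs.register_error` API. The handler name 
`hypothetical-escape` and the function are made up for illustration and 
are not the PEP's implementation:

```python
import codecs


def hypothetical_escape(exc):
    """Illustrative encode error handler that returns bytes directly."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # exc.object is the str being encoded; start/end mark the bad span.
    out = bytearray()
    for ch in exc.object[exc.start:exc.end]:
        code = ord(ch)
        if 0xDC80 <= code <= 0xDCFF:
            # Escaped byte: recover the original value.
            out.append(code - 0xDC00)
        else:
            # Anything else is a genuine encoding error.
            raise exc
    # A bytes replacement goes straight into the output, not re-encoded.
    return bytes(out), exc.end


codecs.register_error("hypothetical-escape", hypothetical_escape)

encoded = "caf\udce9.txt".encode("utf-8", errors="hypothetical-escape")
```

Under this reading the handler is called only for the characters the codec 
cannot encode, so an error during encoding is exactly the case it covers.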

But I shouldn't have to guess.  The PEP should explain how these things 
are useful.  The discussion section could be extended with use cases for 
both the encode and decode cases.

-- 
Glenn -- http://nevcal.com/
===========================
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking