[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

Aahz aahz at pythoncraft.com
Thu Apr 30 15:42:36 CEST 2009


[top-posting for once to preserve full quoting]

Glenn,

Could you please reduce your suggestions to sample text for the PEP?
We now seem to be at the stage where nobody is objecting to the PEP, so
the focus should be on making the PEP clearer.

If you still want to create an alternative PEP implementation, please
provide step-by-step walkthroughs, preferably in a new thread -- if you
did previously provide that, it's gotten lost in the flood of messages.

On Thu, Apr 30, 2009, Glenn Linderman wrote:
> On approximately 4/29/2009 8:46 PM, came the following characters from  
> the keyboard of Terry Reedy:
>> Glenn Linderman wrote:
>>> On approximately 4/29/2009 1:28 PM, came the following characters 
>>> from 
>>
>>>> So where is the ambiguity here?
>>>
>>> None.  But not everyone can read all the Python source code to try to 
>>> understand it; they expect the documentation to help them avoid that. 
>>> Because the documentation is lacking in this area, it makes your  
>>> concisely stated PEP rather hard to understand.
>>
>> If you think a section of the doc is grossly inadequate, and there is 
>> no existing issue on the tracker, feel free to add one.
>>
>>> Thanks for clarifying the Windows behavior, here.  A little more  
>>> clarification in the PEP could have avoided lots of discussion.  It  
>>> would seem that a PEP, proposed to modify a poorly documented (and  
>>> therefore likely poorly understood) area, should be educational about 
>>> the status quo, as well as presenting the suggested change.
>>
>> Where the PEP proposes to change, it should start with the status quo.  
>> But Martin's somewhat reasonable position is that since he is not  
>> proposing to change behavior on Windows, it is not his responsibility 
>> to document what he is not proposing to change more adequately.  This  
>> means, of course, that any observed change on Windows would then be a  
>> bug, or at least a break of the promise.  On the other hand, I can see  
>> that this is enough related to what he is proposing to change that  
>> better doc would help.
>
>
> Yes; the very fact that the PEP discusses Windows, speaks about  
> cross-platform code, and doesn't explicitly state that no Windows  
> functionality will change, is confusing.
>
> An example of how to initialize things within a sample cross-platform  
> application might help, especially if that initialization only happens  
> if the platform is POSIX, or is commented to the effect that it has no  
> effect on Windows, but makes POSIX happy.  Or maybe it is all buried  
> within the initialization of Python itself, and is not exposed to the  
> application at all.  I still haven't figured that out, but was not (and  
> am still not) as concerned about that as ensuring that the overall  
> algorithms are functional and useful and user-friendly.  Showing it  
> might have been helpful in making it clear that no Windows functionality  
> would change, however.
>
> A statement that additional features are being added to allow
> cross-platform programs to deal with non-decodable bytes obtained from
> POSIX APIs using the same code that already works on Windows would have
> made things much clearer.  The present Abstract does, in fact, talk only
> about POSIX, but later statements about Windows muddy the water.
>
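For readers following along, here is a minimal sketch of the round trip being
discussed. It assumes the error handler under its released name,
"surrogateescape" (Python 3.1 and later); this thread refers to it as
"python-escape". The byte string is a made-up file name, not one taken from
the PEP.

```python
# Hypothetical file name: Latin-1 encoded bytes that are not valid UTF-8.
raw = b"caf\xe9.txt"

# Decoding with the PEP 383 handler maps each undecodable byte 0xNN
# to the lone surrogate U+DCNN instead of raising UnicodeDecodeError.
name = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns the surrogate back into the
# original byte, so the POSIX bytes survive the trip through str.
restored = name.encode("utf-8", errors="surrogateescape")

print(ascii(name))
print(restored == raw)
```

The same str-based code path is what a cross-platform program would already
use on Windows, where file names arrive as Unicode and no escaping is needed.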
> Rationale paragraph 3 explicitly talks about cross-platform programs
> needing to work one way on Windows and another way on POSIX to deal with  
> all the cases.  It calls that a proposal, which I guess it is for  
> command line and environment, but it is already implemented in both  
> bytes and str forms for file names... so that further muddies the water.
>
> It is, of course, easier to point out deficiencies in a document than to  
> write a better document; however, it is incumbent upon the PEP author to  
> write a PEP that is good enough to get approved, and that means making  
> it understandable enough that people are in favor... or to respond to  
> the plethora of comments until people are in favor.  I'm not sure which  
> one is more time-consuming.
>
> I've reached the point, based on PEP and comment responses, where I now  
> believe that the PEP is a solution to the problem it is trying to solve,  
> and doesn't create ambiguities in the naming.  I don't believe it is the  
> best solution.
>
> The basic problem is the overuse of fake characters... normalizing them
> for display results in large data loss -- many characters would be
> translated to the same replacement characters.
>
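A small sketch of the display concern raised here, under stated assumptions:
the handler is used under its released name "surrogateescape", and
`display()` is a hypothetical helper (not part of the PEP or the standard
library) that replaces every escaped byte with U+FFFD before showing a name
to the user.

```python
def display(name):
    # Hypothetical normalization: every lone surrogate in the range the
    # handler produces (U+DC80..U+DCFF) becomes U+FFFD for display.
    return "".join(
        "\ufffd" if "\udc80" <= ch <= "\udcff" else ch for ch in name
    )

# Two made-up file names that differ only in one undecodable byte.
a = b"file\xe9".decode("utf-8", errors="surrogateescape")
b = b"file\xe8".decode("utf-8", errors="surrogateescape")

print(a == b)                    # the escaped strings remain distinct
print(display(a) == display(b))  # the displayed forms coincide
```

The strings themselves stay distinguishable; it is only the normalized form
shown to the user that collapses, which is the data loss described above.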
> Solutions exist that would allow the use of fewer different fake  
> characters in the strings, while still having a fake character as the  
> escape character, to preserve the invariant that all the strings  
> manipulated by python-escape from the PEP were, and become, strings  
> containing fake characters (from a strict Unicode perspective), which is  
> a nice invariant*.  There even exist solutions that would use only one  
> fake character (repeatedly if necessary), and all other characters  
> generated would be displayable characters.  This would ease the burden  
> on the program in displaying the strings, and also on the user that  
> might view the resulting mojibake in trying to differentiate one such  
> string from another.  Those are outlined in various emails in this  
> thread, although some include my misconception that strings obtained via
> Unicode-enabled OS APIs would also need to be encoded and altered.  If
> there is any interest in using a more readable encoding, I'd be glad to  
> rework them to remove those misconceptions.
>
> * It would be nice to point out that invariant in the PEP, also.
>
>
> -- 
> Glenn -- http://nevcal.com/
> ===========================
> A protocol is complete when there is nothing left to remove.
> -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking

-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur."  --Red Adair

