[Python-ideas] Python 3000 TIOBE -3%

Fri Feb 17 03:22:44 CET 2012

Barry Warsaw writes:

 > I really hope you do this, but note that it would be very helpful to have
 > guidelines and recommendations even for advanced, knowledgeable Python
 > developers.

 > I have participated in many discussions in various forums with
 > other Python developers where genuine differences of opinion or
 > experience lead to different solutions.  It would be very helpful
 > to point to a document and say "here are the best practices for
 > your [application|library] as recommended by core Python experts in
 > Unicode handling."

I'll see what I can do, but *best practices* that go beyond the
level of Paul Moore's use case are difficult to state, for the reasons
elaborated elsewhere (by others as well as myself): basic Unicode
handling is no harder than ASCII handling as long as everything is
Unicode.  So the real answer is to insist on valid Unicode for your
text I/O; failing that, on text labeled *as* text *with* an
encoding[1]; and failing that (or failing validation of the input),
to reject the input.[2]
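
As a rough illustration of that policy, here is a minimal sketch in
Python 3.  The names decode_text and RejectedInput are hypothetical,
not part of any library, and the sketch assumes that "valid Unicode"
on the wire means UTF-8 and that any label reaches the function as a
codec name.  When a label is present it is trusted and checked
strictly; when it is absent, the bytes must be valid UTF-8; anything
else is rejected rather than guessed at.

```python
from typing import Optional


class RejectedInput(ValueError):
    """Hypothetical error for input that cannot be validated as text."""


def decode_text(raw: bytes, declared: Optional[str] = None) -> str:
    """Decode raw bytes under a strict validate-or-reject policy.

    Assumption: unlabeled input must be UTF-8; labeled input must be
    valid in the encoding named by its label.
    """
    encoding = declared or "utf-8"
    try:
        # errors="strict" makes invalid byte sequences raise
        # instead of being replaced or silently dropped.
        return raw.decode(encoding, errors="strict")
    except (LookupError, UnicodeDecodeError) as exc:
        # LookupError: the label does not name a usable text codec.
        # UnicodeDecodeError: the bytes are not valid in that encoding.
        raise RejectedInput(
            f"cannot accept input as {encoding!r}: {exc}"
        ) from exc
```

A caller would pass the label supplied by the higher-level protocol,
if any, e.g. decode_text(body, "latin-1"), and treat RejectedInput as
a reason to refuse the input.  Note that this sketch deliberately has
no fallback such as guessing an encoding or using errors="replace";
those are exactly the ad hoc tradeoffs that depend on the application.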

If that's not acceptable -- all too often it is not -- you're in a
world of pain, and the solutions are going to be ad hoc.  The WSGI
folks will not find the solutions proposed for email acceptable, and
vice versa.

Something like the format Nick proposed, where the tradeoffs are
described, would be useful, I guess.  But the tradeoffs have to be
made ad hoc.

Footnotes: 
[1]  Of course it's OK if these are implicitly labeled by requirements
or defaults of a higher-level protocol.

[2]  This is the Unicode party line, of course.  But it's really the
only generally applicable advice.