[Python-Dev] Python-3.0, unicode, and os.environ

Glenn Linderman v+python at g.nevcal.com
Mon Dec 8 10:54:54 CET 2008


On approximately 12/8/2008 12:57 AM, came the following characters from 
the keyboard of Stephen J. Turnbull:

> "Internal decoding" is (or should be) an oxymoron.  Why would your
> software be passing around text in any format other than internal?  So
> decoding will happen (a) on I/O, which is itself almost certainly
> slower than making a few checks for Unicode hygiene, or (b) on receipt
> of data from other software whose sanitation you shouldn't trust
> more than you trust the Internet.
> 
> Encoding isn't a problem, AFAICS.


So I can see validating user-supplied data, which always comes in via I/O.

But during manipulation of internal data, including file and database 
I/O, there is a need for encoding and decoding also.  If all the data 
has already been validated, then there would be no need to revalidate on 
every conversion.
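
A minimal sketch of that validate-once idea in Python 3 follows. The helper names read_text and write_text are hypothetical, chosen for illustration only: bytes are decoded strictly where they enter the program, and the resulting str is re-encoded later without a second check.

```python
def read_text(raw: bytes) -> str:
    """Decode at the I/O boundary; strict mode rejects malformed UTF-8."""
    return raw.decode("utf-8", errors="strict")


def write_text(text: str) -> bytes:
    """Re-encode text that was already validated when it was read."""
    return text.encode("utf-8")


# Well-formed input passes the boundary check and round-trips unchanged.
validated = read_text(b"caf\xc3\xa9")
round_tripped = write_text(validated)

# Latin-1 bytes are not valid UTF-8, so the boundary check refuses them.
try:
    read_text(b"caf\xe9")
    rejected = False
except UnicodeDecodeError:
    rejected = True
```

The point of the split is that the cost of checking is paid once, where untrusted bytes arrive.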

I hear you when you say that clever coding can make the validation 
nearly free, and I applaud that: the UTF-8 coder that I wrote predated 
most of the rules that have been created since, so I didn't attempt to 
be clever in that regard.
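
As an illustration of the kind of rules a current decoder enforces, the sketch below feeds three ill-formed byte sequences to Python 3's built-in strict UTF-8 decoder. Which rules the message has in mind is an assumption; the ones shown (no overlong forms, no encoded surrogates, nothing above U+10FFFF) are standard UTF-8 restrictions, and is_valid_utf8 is a hypothetical helper.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes under Python's strict UTF-8 decoder."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


samples = {
    "plain ASCII": b"/",
    "overlong encoding of '/'": b"\xc0\xaf",
    "encoded surrogate U+D800": b"\xed\xa0\x80",
    "code point above U+10FFFF": b"\xf4\x90\x80\x80",
}

# Map each description to whether the decoder accepts the bytes.
results = {name: is_valid_utf8(raw) for name, raw in samples.items()}
```

Only the plain ASCII sample is accepted; the other three are rejected as part of ordinary decoding, with no separate validation pass.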

Thanks to you and Adam for your explanations; I see your points, and if 
it is nearly free, I withdraw most of my negativity on this topic.


-- 
Glenn -- http://nevcal.com/
===========================
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking

